TY - GEN
T1 - Power-Efficient Breadth-First Search with DRAM Row Buffer Locality-Aware Address Mapping
AU - Imamura, Satoshi
AU - Yasui, Yuichiro
AU - Inoue, Koji
AU - Ono, Takatsugu
AU - Sasaki, Hiroshi
AU - Fujisawa, Katsuki
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2017/1/23
Y1 - 2017/1/23
AB - Graph analysis applications have been widely used in real services such as road-traffic analysis and social network services. Breadth-first search (BFS) is one of the most representative algorithms for such applications; therefore, many researchers have tuned it to maximize performance. On the other hand, owing to the strict power constraints of modern HPC systems, it is necessary to improve power efficiency (i.e., performance per watt) when executing BFS. In this work, we focus on the power efficiency of DRAM and investigate the memory access pattern of a state-of-the-art BFS implementation using a cycle-accurate processor simulator. The results reveal that the conventional address mapping schemes of modern memory controllers do not efficiently exploit row buffers in DRAM. Thus, we propose a new scheme called per-row channel interleaving and improve the DRAM power efficiency by 30.3% compared to a conventional scheme for a certain simulator setting. Moreover, we demonstrate that this proposed scheme is effective for various configurations of memory controllers.
UR - http://www.scopus.com/inward/record.url?scp=85013899496&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85013899496&partnerID=8YFLogxK
U2 - 10.1109/HPGDMP.2016.010
DO - 10.1109/HPGDMP.2016.010
M3 - Conference contribution
AN - SCOPUS:85013899496
T3 - Proceedings of HPGDMP 2016: High Performance Graph Data Management and Processing - Held in conjunction with SC 2016: The International Conference for High Performance Computing, Networking, Storage and Analysis
SP - 17
EP - 24
BT - Proceedings of HPGDMP 2016
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2016 High Performance Graph Data Management and Processing, HPGDMP 2016
Y2 - 13 November 2016
ER -