TY - GEN
T1 - Energy Efficient Runahead Execution on a Tightly Coupled Heterogeneous Core
AU - Mashimo, Susumu
AU - Shioya, Ryota
AU - Inoue, Koji
N1 - Funding Information:
This work was partly supported by JSPS KAKENHI Grant Numbers JP19H01105 and JP16H05855, and JST-Mirai Program Grant Number JP18077278.
Publisher Copyright:
© 2020 ACM.
PY - 2020/1/15
Y1 - 2020/1/15
N2 - Out-of-order (OoO) processors generally offer significant performance gains over simpler in-order (InO) processors. However, recent studies have revealed that OoO processors provide little performance benefit in many program phases, and that these phases are distributed at a fine granularity. Leveraging these fine-grained phases, tightly coupled heterogeneous cores (TCHCs) have been proposed to improve energy efficiency. A TCHC is a processor core that consists of multiple back-ends, each with different performance and energy-consumption characteristics (e.g., a power-efficient InO back-end and a high-performance OoO back-end). It improves energy efficiency by switching execution to the most energy-efficient back-end with a very small switching penalty. We propose a novel technique to further improve the energy efficiency of a TCHC. The proposed technique is based on runahead execution (RAE), a prefetch technique that executes instructions ahead of long-latency cache misses and issues independent cache misses earlier. Leveraging the characteristics of TCHCs and RAE, the proposed technique increases the utilization of energy-efficient back-ends, thereby significantly improving energy efficiency. Our evaluation results show that the proposed method achieves a 13% improvement in energy-delay product (EDP) over a state-of-the-art TCHC using Oracle switching decision logic.
AB - Out-of-order (OoO) processors generally offer significant performance gains over simpler in-order (InO) processors. However, recent studies have revealed that OoO processors provide little performance benefit in many program phases, and that these phases are distributed at a fine granularity. Leveraging these fine-grained phases, tightly coupled heterogeneous cores (TCHCs) have been proposed to improve energy efficiency. A TCHC is a processor core that consists of multiple back-ends, each with different performance and energy-consumption characteristics (e.g., a power-efficient InO back-end and a high-performance OoO back-end). It improves energy efficiency by switching execution to the most energy-efficient back-end with a very small switching penalty. We propose a novel technique to further improve the energy efficiency of a TCHC. The proposed technique is based on runahead execution (RAE), a prefetch technique that executes instructions ahead of long-latency cache misses and issues independent cache misses earlier. Leveraging the characteristics of TCHCs and RAE, the proposed technique increases the utilization of energy-efficient back-ends, thereby significantly improving energy efficiency. Our evaluation results show that the proposed method achieves a 13% improvement in energy-delay product (EDP) over a state-of-the-art TCHC using Oracle switching decision logic.
UR - http://www.scopus.com/inward/record.url?scp=85094832107&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85094832107&partnerID=8YFLogxK
U2 - 10.1145/3368474.3368496
DO - 10.1145/3368474.3368496
M3 - Conference contribution
AN - SCOPUS:85094832107
T3 - ACM International Conference Proceeding Series
SP - 207
EP - 216
BT - Proceedings of International Conference on High Performance Computing in Asia-Pacific Region, HPC Asia 2020
PB - Association for Computing Machinery
T2 - 2020 International Conference on High Performance Computing in Asia-Pacific Region, HPC Asia 2020
Y2 - 15 January 2020 through 17 January 2020
ER -