TY - GEN
T1 - Characterizing I/O and Storage Activity on the K Computer for Post-Processing Purposes
AU - Inacio, Eduardo C.
AU - Nonaka, Jorji
AU - Ono, Kenji
AU - Dantas, Mario A.R.
AU - Shoji, Fumiyoshi
N1 - Funding Information:
The authors would like to thank the Brazilian Federal Agency CAPES for partially supporting this research. Results reported in this paper were obtained using the K computer at RIKEN R-CCS in Kobe, Japan. This work is partially supported by the “Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures” in Japan (Project ID: jh180060-NAH).
Publisher Copyright:
© 2018 IEEE.
PY - 2018/11/15
Y1 - 2018/11/15
N2 - An increasing volume of data is produced by computational science applications executing on flagship-class supercomputers, such as the K computer. Most of these huge datasets later pass through post-processing for visualization and analysis in order to derive meaningful information. Particular characteristics of the computing environment, the application, and the dataset itself can make efficiently exploring the performance capabilities of the large-scale storage systems supporting these supercomputers a challenging task. This paper presents a characterization of the I/O and storage activity of jobs executed on the K computer, focusing on post-processing purposes and based on nine months of recorded production operation. Results demonstrate the intensive data demand of K computer applications in terms of the volume of file I/O carried out during job execution, the amount of data staged in and staged out, and the number of files produced per job. These aspects shed light on challenges and opportunities for specialized data management libraries for post hoc data visualization and analysis.
AB - An increasing volume of data is produced by computational science applications executing on flagship-class supercomputers, such as the K computer. Most of these huge datasets later pass through post-processing for visualization and analysis in order to derive meaningful information. Particular characteristics of the computing environment, the application, and the dataset itself can make efficiently exploring the performance capabilities of the large-scale storage systems supporting these supercomputers a challenging task. This paper presents a characterization of the I/O and storage activity of jobs executed on the K computer, focusing on post-processing purposes and based on nine months of recorded production operation. Results demonstrate the intensive data demand of K computer applications in terms of the volume of file I/O carried out during job execution, the amount of data staged in and staged out, and the number of files produced per job. These aspects shed light on challenges and opportunities for specialized data management libraries for post hoc data visualization and analysis.
UR - http://www.scopus.com/inward/record.url?scp=85059217689&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85059217689&partnerID=8YFLogxK
U2 - 10.1109/ISCC.2018.8538488
DO - 10.1109/ISCC.2018.8538488
M3 - Conference contribution
AN - SCOPUS:85059217689
VL - 2018-June
T3 - Proceedings - IEEE Symposium on Computers and Communications
SP - 730
EP - 735
BT - 2018 IEEE Symposium on Computers and Communications, ISCC 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2018 IEEE Symposium on Computers and Communications, ISCC 2018
Y2 - 25 June 2018 through 28 June 2018
ER -