TY - GEN
T1 - Ties between mined structural patterns in program and their identifier names
AU - Mashima, Yoshiki
AU - Hirokawa, Sachio
AU - Takeuchi, Kazuhiro
PY - 2019/1/1
Y1 - 2019/1/1
AB - Identifier names in readable and maintainable source code are always descriptive. These names are given based on the implicit knowledge of experienced programmers. In this paper, we propose a structural pattern mining method for source code based on support vector machines (SVM). We extract 1,000 method names from object-oriented source code collected from online software repositories and create 1,000 datasets labeled with positive and negative classes. The structural features used as input feature vectors for SVM learning are designed to represent partial characteristics of the abstract syntax tree (AST) parsed from source code. Applying this method with our structural features, we produced a list of F1 scores for the 1,000 method names, which shows the degree of patterning of each name. From the list, we confirmed that structural patterns were strongly associated with specific method names. A qualitative evaluation of method names was also conducted by mapping the structural feature vector of each program example onto a two-dimensional plane, in the same way as a major previous study. From the evaluation, we confirmed that contrasting structures among the programs correspond to the names given to the programs. Furthermore, we show examples of visualizing structural patterns using structural features extracted by feature selection.
UR - http://www.scopus.com/inward/record.url?scp=85064214136&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85064214136&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-14815-7_28
DO - 10.1007/978-3-030-14815-7_28
M3 - Conference contribution
AN - SCOPUS:85064214136
SN - 9783030148140
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 335
EP - 346
BT - Integrated Uncertainty in Knowledge Modelling and Decision Making - 7th International Symposium, IUKM 2019, Proceedings
A2 - Seki, Hirosato
A2 - Inuiguchi, Masahiro
A2 - Nguyen, Canh Hao
A2 - Huynh, Van-Nam
PB - Springer Verlag
T2 - 7th International Symposium on Integrated Uncertainty in Knowledge Modelling and Decision Making, IUKM 2019
Y2 - 27 March 2019 through 29 March 2019
ER -