TY - GEN
T1 - From Local to Global Semantic Clone Detection
AU - Yuan, Yuan
AU - Kong, Weiqiang
AU - Hou, Gang
AU - Hu, Yan
AU - Watanabe, Masahiko
AU - Fukuda, Akira
N1 - Funding Information:
This research is supported by the National Natural Science Foundation of China (Grant No. 61572097) and by the Fundamental Research Funds for the Central Universities (Grant No. DUT18JC08).
Publisher Copyright:
© 2020 IEEE.
PY - 2020/1
Y1 - 2020/1
N2 - Clone detection identifies similar code fragments (referred to as clone code) in software products, which helps with software optimization and maintenance. Code clone detection can be divided into textual, lexical, syntactic, and semantic levels. Existing techniques have achieved good results at the first three levels, but no significant results have been obtained in semantic clone detection. In this paper, we propose a novel semantic-level clone detection approach. We use the control flow graph (CFG) as an intermediate representation of a program method and combine the classical dynamic time warping (DTW) algorithm from the field of speech recognition with two deep neural network models, a bidirectional RNN autoencoder and a graph convolutional network (GCN), to detect semantic-level clones from local to global. We experimented with a dataset consisting of five large-scale real-world systems and a code corpus containing a large number of programming problems. The experimental results show that our approach achieves good results in detecting both local and global semantic clones.
AB - Clone detection identifies similar code fragments (referred to as clone code) in software products, which helps with software optimization and maintenance. Code clone detection can be divided into textual, lexical, syntactic, and semantic levels. Existing techniques have achieved good results at the first three levels, but no significant results have been obtained in semantic clone detection. In this paper, we propose a novel semantic-level clone detection approach. We use the control flow graph (CFG) as an intermediate representation of a program method and combine the classical dynamic time warping (DTW) algorithm from the field of speech recognition with two deep neural network models, a bidirectional RNN autoencoder and a graph convolutional network (GCN), to detect semantic-level clones from local to global. We experimented with a dataset consisting of five large-scale real-world systems and a code corpus containing a large number of programming problems. The experimental results show that our approach achieves good results in detecting both local and global semantic clones.
UR - http://www.scopus.com/inward/record.url?scp=85085469870&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85085469870&partnerID=8YFLogxK
U2 - 10.1109/DSA.2019.00012
DO - 10.1109/DSA.2019.00012
M3 - Conference contribution
AN - SCOPUS:85085469870
T3 - Proceedings - 2019 6th International Conference on Dependable Systems and Their Applications, DSA 2019
SP - 13
EP - 24
BT - Proceedings - 2019 6th International Conference on Dependable Systems and Their Applications, DSA 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 6th International Conference on Dependable Systems and Their Applications, DSA 2019
Y2 - 3 January 2020 through 6 January 2020
ER -