TY - JOUR
T1 - Code cloning in smart contracts
T2 - a case study on verified contracts from the Ethereum blockchain platform
AU - Kondo, Masanari
AU - Oliva, Gustavo A.
AU - Jiang, Zhen Ming
AU - Hassan, Ahmed E.
AU - Mizuno, Osamu
N1 - Funding Information:
This research has been supported by the Natural Sciences and Engineering Research Council (NSERC), as well as JSPS KAKENHI Japan (Grant Numbers: JP16K12415 and JP19J23477). This study leveraged the computational resources provided by the Microsoft Azure for Research program.
Publisher Copyright:
© 2020, Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2020/11/1
Y1 - 2020/11/1
N2 - Ethereum is a blockchain platform that hosts and executes smart contracts. Smart contracts have been used to implement cryptocurrencies and crowdfunding initiatives (ICOs). A major concern in Ethereum is the security of smart contracts. Unlike traditional software, smart contracts are immutable once deployed. Hence, vulnerabilities and bugs in smart contracts can lead to catastrophic financial losses. In order to avoid taking the risk of writing buggy code, smart contract developers are encouraged to reuse pieces of code from reputable sources (e.g., OpenZeppelin). In this paper, we study code cloning in Ethereum. Our goal is to quantify the number of clones in Ethereum (RQ1), understand key characteristics of clone clusters (RQ2), and determine whether smart contracts contain pieces of code that are identical to those published by OpenZeppelin (RQ3). We applied Deckard, a tree-based clone detector, to all Ethereum contracts for which the source code was available. We observe that developers frequently clone contracts. In particular, 79.2% of the studied contracts are clones and we note an upward trend in the number of cloned contracts per quarter. With regard to the characteristics of clone clusters, we observe that: (i) 9 out of the top-10 largest clone clusters are token managers, (ii) most of the activity of a cluster tends to be concentrated on a few contracts, and (iii) contracts in a cluster tend to be created by several authors. Finally, we note that the studied contracts have different ratios of code blocks that are identical to those provided by the OpenZeppelin project. Due to the immutability of smart contracts, as well as the impossibility of reverting transactions once they are deemed final, we conclude that the aforementioned findings have implications for the security, development, and usage of smart contracts.
AB - Ethereum is a blockchain platform that hosts and executes smart contracts. Smart contracts have been used to implement cryptocurrencies and crowdfunding initiatives (ICOs). A major concern in Ethereum is the security of smart contracts. Unlike traditional software, smart contracts are immutable once deployed. Hence, vulnerabilities and bugs in smart contracts can lead to catastrophic financial losses. In order to avoid taking the risk of writing buggy code, smart contract developers are encouraged to reuse pieces of code from reputable sources (e.g., OpenZeppelin). In this paper, we study code cloning in Ethereum. Our goal is to quantify the number of clones in Ethereum (RQ1), understand key characteristics of clone clusters (RQ2), and determine whether smart contracts contain pieces of code that are identical to those published by OpenZeppelin (RQ3). We applied Deckard, a tree-based clone detector, to all Ethereum contracts for which the source code was available. We observe that developers frequently clone contracts. In particular, 79.2% of the studied contracts are clones and we note an upward trend in the number of cloned contracts per quarter. With regard to the characteristics of clone clusters, we observe that: (i) 9 out of the top-10 largest clone clusters are token managers, (ii) most of the activity of a cluster tends to be concentrated on a few contracts, and (iii) contracts in a cluster tend to be created by several authors. Finally, we note that the studied contracts have different ratios of code blocks that are identical to those provided by the OpenZeppelin project. Due to the immutability of smart contracts, as well as the impossibility of reverting transactions once they are deemed final, we conclude that the aforementioned findings have implications for the security, development, and usage of smart contracts.
UR - http://www.scopus.com/inward/record.url?scp=85090455143&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85090455143&partnerID=8YFLogxK
U2 - 10.1007/s10664-020-09852-5
DO - 10.1007/s10664-020-09852-5
M3 - Article
AN - SCOPUS:85090455143
SN - 1382-3256
VL - 25
SP - 4617
EP - 4675
JO - Empirical Software Engineering
JF - Empirical Software Engineering
IS - 6
ER -