TY - JOUR
T1 - Studying re-opened bugs in open source software
AU - Shihab, Emad
AU - Ihara, Akinori
AU - Kamei, Yasutaka
AU - Ibrahim, Walid M.
AU - Ohira, Masao
AU - Adams, Bram
AU - Hassan, Ahmed E.
AU - Matsumoto, Ken Ichi
N1 - Funding Information:
This research is being conducted as a part of the Next Generation IT Program and Grant-in-Aid for Young Scientists (B), 22700033, 2010 by the Ministry of Education, Culture, Sports, Science and Technology, Japan. In addition, it is supported in part by research grants from the Natural Sciences and Engineering Research Council of Canada.
PY - 2013/10
Y1 - 2013/10
AB - Bug fixing accounts for a large share of software maintenance resources. Generally, bugs are reported, fixed, verified and closed. However, in some cases bugs have to be re-opened. Re-opened bugs increase maintenance costs, degrade the overall user-perceived quality of the software and lead to unnecessary rework by busy practitioners. In this paper, we study and predict re-opened bugs through a case study on three large open source projects, namely Eclipse, Apache and OpenOffice. We structure our study along four dimensions: (1) the work habits dimension (e.g., the weekday on which the bug was initially closed), (2) the bug report dimension (e.g., the component in which the bug was found), (3) the bug fix dimension (e.g., the amount of time it took to perform the initial fix) and (4) the team dimension (e.g., the experience of the bug fixer). Using the aforementioned factors, we build decision trees that aim to predict re-opened bugs. We perform top node analysis to determine which factors are the most important indicators of whether or not a bug will be re-opened. Our study shows that the comment text and last status of the bug when it is initially closed are the most important factors related to whether or not a bug will be re-opened. Using a combination of these dimensions, we can build explainable prediction models that achieve a precision of 52.1-78.6% and a recall of 70.5-94.1% when predicting whether a bug will be re-opened. We find that the factors that best indicate which bugs might be re-opened vary based on the project. The comment text is the most important factor for the Eclipse and OpenOffice projects, while the last status is the most important one for Apache. These factors should be closely examined in order to reduce maintenance cost due to re-opened bugs.
UR - http://www.scopus.com/inward/record.url?scp=84881192641&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84881192641&partnerID=8YFLogxK
U2 - 10.1007/s10664-012-9228-6
DO - 10.1007/s10664-012-9228-6
M3 - Article
AN - SCOPUS:84881192641
SN - 1382-3256
VL - 18
SP - 1005
EP - 1042
JO - Empirical Software Engineering
JF - Empirical Software Engineering
IS - 5
ER -