TY - GEN
T1 - A lightweight and portable approach to making concurrent failures reproducible
AU - Luo, Qingzhou
AU - Zhang, Sai
AU - Zhao, Jianjun
AU - Hu, Min
N1 - Copyright 2010 Elsevier B.V., All rights reserved.
PY - 2010
Y1 - 2010
AB - Concurrent programs often exhibit bugs due to unintended interference among concurrent threads. Such bugs are hard to reproduce because they typically occur only under very specific interleavings of the executing threads, and a bug (or software failure) in a concurrent program is very hard to fix without being able to reproduce it. In this paper, we present an approach, called ConCrash, that automatically and deterministically reproduces concurrent failures by recording the logical thread schedule and generating unit tests. For a given bug (failure), ConCrash records the logical thread scheduling order and preserves object states in memory at runtime. ConCrash then reproduces the failure offline using only the saved information, without the need for JVM-level or OS-level support. To reduce the runtime performance overhead, ConCrash employs a static data race detection technique to report potential race conditions, and instruments only those places. We implemented the ConCrash approach in a prototype tool for Java and evaluated it on a number of multi-threaded Java benchmarks. As a result, we successfully reproduced a number of real concurrent bugs (e.g., deadlocks, data races, and atomicity violations) within an acceptable overhead.
UR - http://www.scopus.com/inward/record.url?scp=77951292783&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77951292783&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-12029-9_23
DO - 10.1007/978-3-642-12029-9_23
M3 - Conference contribution
AN - SCOPUS:77951292783
SN - 3642120288
SN - 9783642120282
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 323
EP - 337
BT - Fundamental Approaches to Software Engineering - 13th International Conference, FASE 2010, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2010, Proceedings
T2 - 13th International Conference on Fundamental Approaches to Software Engineering, FASE 2010, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2010
Y2 - 20 March 2010 through 28 March 2010
ER -