Back to the Future! Studying Data Cleanness in Defects4J and its Impact on Fault Localization
Md Nakhla Rafi, An Ran Chen, Tse-Hsun Chen, Shaohua Wang

TL;DR
This study investigates the impact of developer knowledge embedded in tests within the Defects4J dataset on fault localization effectiveness, revealing significant performance degradation when such knowledge is absent.
Contribution
It analyzes the timeline and modifications of fault-triggering tests in Defects4J, highlighting the influence of developer knowledge on SBFL techniques and providing a dataset for unbiased evaluations.
Findings
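55% of fault-triggering tests were newly added to replicate the bug or to test for regression
22% of fault-triggering tests were modified after the bug reports were created and contain developer knowledge of the bug
SBFL performance degrades by up to 415% when the tests contain no developer knowledge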
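For readers unfamiliar with SBFL, the following is a minimal sketch of how a spectrum-based ranking can be computed from test coverage. It uses the Ochiai formula, a widely used SBFL metric, purely as an illustration; this summary does not state which formulas the paper evaluates, and the function names and the toy coverage data are hypothetical.

```python
import math

def ochiai(ef, ep, nf):
    """Ochiai suspiciousness of one program element.

    ef: failing tests that cover the element
    ep: passing tests that cover the element
    nf: failing tests that do not cover the element
    """
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0

def rank_elements(coverage, failing):
    """Rank elements from most to least suspicious.

    coverage: dict mapping test name -> set of covered elements (hypothetical input shape)
    failing:  set of names of failing (fault-triggering) tests
    """
    total_failed = len(failing)
    elements = set().union(*coverage.values()) if coverage else set()
    scores = {}
    for e in elements:
        ef = sum(1 for t, cov in coverage.items() if t in failing and e in cov)
        ep = sum(1 for t, cov in coverage.items() if t not in failing and e in cov)
        scores[e] = ochiai(ef, ep, total_failed - ef)
    # Sort by descending score; break ties by element name for determinism.
    return sorted(scores, key=lambda e: (-scores[e], e))

# Toy example: element "c" is covered only by the failing test.
coverage = {"t_pass": {"a", "b"}, "t_fail": {"b", "c"}}
ranking = rank_elements(coverage, failing={"t_fail"})
```

Because the ranking depends entirely on which tests fail and what they cover, a fault-triggering test written with knowledge of the fix can make the faulty element much easier to rank highly, which is the effect the findings above quantify.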
Abstract
For software testing research, Defects4J stands out as the primary benchmark dataset, offering a controlled environment to study real bugs from prominent open-source systems. However, prior research indicates that Defects4J might include tests added after the bug report was filed, embedding developer knowledge and affecting fault localization efficacy. In this paper, we examine Defects4J's fault-triggering tests, emphasizing the implications of developer knowledge for spectrum-based fault localization (SBFL) techniques. We study the timelines of changes made to these tests relative to bug report creation. Then, we study the effectiveness of SBFL techniques without developer knowledge in the tests. We found that 1) 55% of the fault-triggering tests were newly added to replicate the bug or to test for regression; 2) 22% of the fault-triggering tests were modified after the bug reports were created, containing developer knowledge of the bug;…
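The timeline analysis described in the abstract can be pictured as comparing each test's history against the bug report date. The sketch below is one possible way to express that comparison, not the authors' tooling; the function name, the category labels, and the assumption that a creation date and a last-modification date are available for each test are all hypothetical.

```python
from datetime import datetime
from typing import Optional

def classify_test(created: datetime,
                  last_modified: Optional[datetime],
                  bug_reported: datetime) -> str:
    """Classify a fault-triggering test relative to the bug report date.

    The three labels are illustrative and do not come from the paper.
    """
    if created > bug_reported:
        # The test did not exist when the bug was reported.
        return "added_after_report"
    if last_modified is not None and last_modified > bug_reported:
        # The test existed, but was changed after the report.
        return "modified_after_report"
    # The test is in the same state as when the bug was reported.
    return "unchanged_since_report"

# Hypothetical dates, for illustration only.
report = datetime(2015, 3, 1)
label = classify_test(datetime(2014, 1, 1), datetime(2015, 3, 6), report)
```

Tests in the first two categories may encode what developers learned while diagnosing the bug, which is why the study evaluates SBFL separately without them.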
Taxonomy
Topics: Software Engineering Research · Software Testing and Debugging Techniques · Software Reliability and Analysis Research
