TL;DR
This empirical study investigates the effectiveness of query selection practices in IR-based bug localization, revealing limitations of current methods and demonstrating significant performance improvements through optimized query construction.
Contribution
The paper critically examines state-of-the-art query selection approaches and introduces actionable insights that enhance bug localization performance by 27-34%.
Findings
Current query selection methods are insufficient for natural language-only bug reports.
Optimal queries differ significantly from non-optimal ones in keyword characteristics.
Applying insights to non-optimal queries improves localization performance by up to 34%.
Abstract
Being lightweight and cost-effective, IR-based approaches for bug localization have shown promise in finding software bugs. However, the accuracy of these approaches depends heavily on the bug reports they use. A significant number of bug reports contain only plain natural language text. According to existing studies, IR-based approaches do not perform well when they use these bug reports as search queries. Recent evidence, however, suggests that even these natural language-only reports contain enough good keywords to localize the bugs successfully. On one hand, these findings suggest that natural language-only bug reports might be a sufficient source of good query keywords. On the other hand, they cast serious doubt on the query selection practices in IR-based bug localization. In this article, we attempted to clear the sky on this…
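As a rough illustration of what "using a bug report as a search query" means in IR-based bug localization, the sketch below ranks source files by TF-IDF cosine similarity to a query. It is not the paper's method: TF-IDF with cosine similarity is swapped in as a common baseline, and the file names, file contents, and query text are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split camelCase identifiers, lowercase, and keep alphabetic tokens longer than 2 chars."""
    text = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return [t for t in re.split(r"[^A-Za-z]+", text.lower()) if len(t) > 2]


def rank_files(query, corpus):
    """Rank files in `corpus` (name -> source text) by TF-IDF cosine similarity to `query`."""
    docs = {name: Counter(tokenize(text)) for name, text in corpus.items()}
    n = len(docs)
    # Document frequency: in how many files each term appears.
    df = Counter(term for counts in docs.values() for term in counts)
    # Smoothed IDF, so no term gets a zero or negative weight.
    idf = {term: math.log((1 + n) / (1 + d)) + 1.0 for term, d in df.items()}

    def vector(counts):
        # Terms that never occur in the corpus are dropped.
        return {t: c * idf[t] for t, c in counts.items() if t in idf}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    q = vector(Counter(tokenize(query)))
    scores = {name: cosine(q, vector(counts)) for name, counts in docs.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical two-file project and a natural language-only bug report.
corpus = {
    "LoginDialog.java": "class LoginDialog { void showPasswordField() { } void validateUser() { } }",
    "ImageCache.java": "class ImageCache { void evictOldest() { } void loadImage() { } }",
}
report = "The application crashes when I open the login window and type my password"

for name, score in rank_files(report, corpus):
    print(f"{name}: {score:.3f}")
```

In this toy setup only the report words that also occur in the code ("login", "password") contribute to the score, which is why the choice of query keywords matters so much for these approaches.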
