The Forgotten Role of Search Queries in IR-based Bug Localization: An Empirical Study

Mohammad Masudur Rahman, Foutse Khomh, Shamima Yeasmin, and Chanchal K. Roy

arXiv:2108.05341·cs.SE·September 1, 2021

TL;DR

This empirical study investigates the effectiveness of query selection practices in IR-based bug localization, revealing limitations of current methods and demonstrating significant performance improvements through optimized query construction.

Contribution

The paper critically examines state-of-the-art query selection approaches and introduces actionable insights that enhance bug localization performance by 27-34%.

Findings

01. Current query selection methods are insufficient for natural language-only bug reports.
02. Optimal queries differ significantly from non-optimal ones in keyword characteristics.
03. Applying insights to non-optimal queries improves localization performance by up to 34%.

Abstract

Being lightweight and cost-effective, IR-based approaches for bug localization have shown promise in finding software bugs. However, the accuracy of these approaches depends heavily on the bug reports they use. A significant number of bug reports contain only plain natural language text. According to existing studies, IR-based approaches cannot perform well when they use these bug reports as search queries. Yet recent evidence suggests that even these natural language-only reports contain enough good keywords to localize the bugs successfully. On one hand, these findings suggest that natural language-only bug reports might be a sufficient source for good query keywords. On the other hand, they cast serious doubt on the query selection practices in IR-based bug localization. In this article, we attempted to clear the sky on this…

Code & Models

Repositories

masud-technope/emse-2019-replication-package (Official)
