Reformulate, Retrieve, Localize: Agents for Repository-Level Bug Localization

Genevieve Caumartin; Glaucia Melo

arXiv:2512.07022·cs.SE·December 9, 2025

Open Access

TL;DR

This paper introduces an LLM-powered agent that reformulates bug reports before retrieval to improve file-level bug localization, reporting 35% better first-file retrieval ranking and up to +22% improvement over SWE-agent.

Contribution

The paper demonstrates that lightweight query reformulation with LLMs improves bug localization effectiveness at scale, which it presents as a novel application in this domain.

Findings

01. 35% better first-file retrieval ranking

02. Up to +22% improvement over SWE-agent

03. Effective use of non-fine-tuned LLMs for query reformulation

Abstract

Bug localization remains a critical yet time-consuming challenge in large-scale software repositories. Traditional information retrieval-based bug localization (IRBL) methods rely on unchanged bug descriptions, which often contain noisy information, leading to poor retrieval accuracy. Recent advances in large language models (LLMs) have improved bug localization through query reformulation, yet the effect on agent performance remains unexplored. In this study, we investigate how an LLM-powered agent can improve file-level bug localization via lightweight query reformulation and summarization. We first employ an open-source, non-fine-tuned LLM to extract key information from bug reports, such as identifiers and code snippets, and reformulate queries pre-retrieval. Our agent then orchestrates BM25 retrieval using these preprocessed queries, automating the localization workflow at scale. Using…
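The reformulate-then-retrieve flow described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the paper uses an open-source LLM to extract key information, whereas the hypothetical `reformulate` function below substitutes a simple regular-expression heuristic for that step. The `BM25` class, the `tokenize` helper, the parameter values `k1=1.5` and `b=0.75`, and the toy repository are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

WORD_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
PART_RE = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")
# Identifier-like terms: snake_case, camelCase, PascalCase (heuristic, assumed).
IDENT_RE = re.compile(
    r"\b(?:[A-Za-z]+_[A-Za-z_0-9]+|[a-z]+[A-Z][A-Za-z0-9]*"
    r"|[A-Z][a-z0-9]+[A-Z][A-Za-z0-9]*)\b"
)


def tokenize(text):
    """Lowercase word tokens, with snake_case and camelCase split into parts."""
    tokens = []
    for word in WORD_RE.findall(text):
        tokens.extend(part.lower() for part in PART_RE.findall(word))
    return tokens


def reformulate(bug_report):
    """Stand-in for the LLM step: keep inline code spans and identifiers only."""
    code_spans = re.findall(r"`([^`]+)`", bug_report)
    identifiers = IDENT_RE.findall(bug_report)
    terms = list(dict.fromkeys(code_spans + identifiers))  # dedupe, keep order
    # Fall back to the raw report when nothing identifier-like is found.
    return " ".join(terms) if terms else bug_report


class BM25:
    """Minimal Okapi BM25 over a mapping of file path -> file text."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.names = list(docs)
        self.term_freqs = [Counter(tokenize(docs[n])) for n in self.names]
        self.lengths = [sum(tf.values()) for tf in self.term_freqs]
        self.avg_len = (sum(self.lengths) / len(self.lengths)) or 1.0
        doc_freq = Counter()
        for tf in self.term_freqs:
            doc_freq.update(tf.keys())
        n = len(self.names)
        # Non-negative IDF variant: log(1 + (N - df + 0.5) / (df + 0.5)).
        self.idf = {
            t: math.log(1 + (n - df + 0.5) / (df + 0.5))
            for t, df in doc_freq.items()
        }

    def rank(self, query):
        """Return file paths sorted by descending BM25 score for the query."""
        query_terms = tokenize(query)
        scored = []
        for name, tf, length in zip(self.names, self.term_freqs, self.lengths):
            score = 0.0
            for term in query_terms:
                f = tf.get(term, 0)
                if f:
                    norm = f + self.k1 * (1 - self.b + self.b * length / self.avg_len)
                    score += self.idf[term] * f * (self.k1 + 1) / norm
            scored.append((score, name))
        return [name for _, name in sorted(scored, key=lambda s: -s[0])]


# Toy repository and bug report, invented for this example.
repo = {
    "config/parser.py": "def parse_config(path): return load_yaml(path)",
    "net/client.py": "class HttpClient: def send_request(self, url): pass",
    "ui/app.py": "def render_page(user): crash_handler = None",
}
report = (
    "The app crashes for some users on startup. "
    "I think `parse_config` fails when the path is empty."
)
query = reformulate(report)
ranking = BM25(repo).rank(query)
```

In this toy setup the reformulated query keeps only `parse_config`, so the free-text noise in the report ("crashes", "users", "startup") no longer contributes to the BM25 score. A real pipeline would replace `reformulate` with a call to an LLM and index actual repository files.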

Taxonomy

Topics: Software Engineering Research · Software Testing and Debugging Techniques · Topic Modeling