How and Why Agents Can Identify Bug-Introducing Commits
Niklas Risse, Marcel Böhme

TL;DR
This paper shows that LLM-based agents identify bug-introducing commits more accurately than the previous state of the art by deriving short search patterns from fix commits, raising the F1-score from 0.64 to 0.81 on the most popular Linux kernel dataset.
Contribution
The paper proposes a simple agentic workflow in which an LLM-based agent searches a set of candidate commits, raising the F1-score from 0.64 to 0.81.
Findings
F1-score improved from 0.64 to 0.81 using LLM-based agents.
Agents derive short, greppable patterns from commit diffs and messages.
Insights may enable advances in bug detection, root cause analysis, and repair.
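The summary does not say how the pattern search is implemented, so the following is only a minimal sketch of the idea of matching a short, greppable pattern against a set of candidate commits. The `Commit` type, the `search_candidates` helper, the sample commits, and the pattern are all hypothetical and chosen for illustration. In the workflow described above, the agent would derive the pattern from the fix commit's diff and message.

```python
import re
from dataclasses import dataclass


@dataclass
class Commit:
    sha: str      # hypothetical short identifier
    message: str  # commit message
    diff: str     # unified diff text


def added_lines(diff: str) -> list[str]:
    """Return the lines a unified diff adds, without the leading '+'.

    Simplification: lines starting with '+++ ' are treated as file headers.
    """
    return [
        line[1:]
        for line in diff.splitlines()
        if line.startswith("+") and not line.startswith("+++ ")
    ]


def search_candidates(candidates: list[Commit], pattern: str) -> list[str]:
    """Return the SHAs of candidates whose added lines match the pattern."""
    regex = re.compile(pattern)
    return [
        commit.sha
        for commit in candidates
        if any(regex.search(line) for line in added_lines(commit.diff))
    ]


# Made-up candidate commits, not taken from the paper or the Linux kernel.
CANDIDATES = [
    Commit(
        sha="a1",
        message="driver: add buffer cache",
        diff="--- a/drv.c\n+++ b/drv.c\n"
             "+    buf = kmalloc(len, GFP_KERNEL);\n"
             "+    memcpy(buf, src, len);\n",
    ),
    Commit(
        sha="b2",
        message="docs: fix typo",
        diff="--- a/README\n+++ b/README\n-teh\n+the\n",
    ),
]

# A short pattern of the kind an agent might derive from a fix commit.
print(search_candidates(CANDIDATES, r"kmalloc\(len"))
```

Restricting the search to added lines is a design choice for this sketch: a commit can only introduce a bug through what it adds or changes. A real workflow would likely run the pattern through the repository's history rather than an in-memory list.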
Abstract
Śliwerski, Zimmermann, and Zeller (SZZ) just won the 2026 ACM SIGSOFT Impact Award for asking: When do changes induce fixes? Their paper from 2005 served as the foundation for a wide array of approaches aimed at identifying bug-introducing changes (or commits) from fix commits in software repositories. But even after two decades of progress, the best-performing approach from 2025 yields a modest increase of 10 percentage points in F1-score on the most popular Linux kernel dataset. In this paper, we uncover how and why LLM-based agents can substantially advance the state-of-the-art in identifying bug-introducing commits from fix commits. We propose a simple agentic workflow based on searching a set of candidate commits and find that it raises the F1-score from 0.64 to 0.81 on the most popular Linux kernel dataset, a bigger jump than between the original 2005 method (0.54) and the…
