AgentSZZ: Teaching the LLM Agent to Play Detective with Bug-Inducing Commits

Yunbo Lyu; Jieke Shi; Hong Jin Kang; Ratnadira Widyasari; Junda He; Yuqing Niu; Chengran Yang; Junkai Chen; Zhou Yang; Julia Lawall; David Lo

arXiv:2604.02665 · cs.SE · April 6, 2026

TL;DR

AgentSZZ introduces an LLM-driven agent framework with interactive reasoning and domain tools to improve bug-inducing commit identification, outperforming existing methods especially in complex scenarios.

Contribution

The paper presents a novel agent-based approach that integrates tools and domain knowledge with iterative reasoning to improve bug traceability in software repositories.

Findings

01. AgentSZZ achieves up to 27.2% higher F1-score than prior LLM-based approaches.

02. Recall improvements of up to 300% in challenging cross-file and ghost commit scenarios.

03. Structured compression reduces token usage by over 30% without significant accuracy loss.

Abstract

The SZZ algorithm is the dominant technique for identifying bug-inducing commits and underpins many software engineering tasks, such as defect prediction and vulnerability analysis. Despite numerous variants, including recent LLM-based approaches, performance remains limited on developer-annotated datasets (e.g., recall of 0.552 on the Linux kernel). A key limitation is the reliance on git blame, which traces line-level changes within the same file and therefore fails in common scenarios such as ghost and cross-file cases, making nearly one-quarter of bug-inducing commits inherently untraceable. Moreover, current approaches follow fixed pipelines that restrict iterative reasoning and exploration, unlike developers who investigate bugs through an interactive, multi-tool process. To address these challenges, we propose AgentSZZ, an agent-based framework that leverages LLM-driven agents to explore…
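
To make the git blame limitation concrete, the following is a minimal sketch of the first step of a classic blame-based SZZ pipeline: collecting the pre-fix lines that a bug-fixing diff deletes, which are the only lines blame can then trace. It is not AgentSZZ's implementation. The function name `deleted_lines`, the sample diffs, and the path `net/core.c` are hypothetical, and the sketch assumes that a "ghost" case includes a fix that only adds lines, a definition the abstract itself does not spell out.

```python
import re

# Hunk header of a unified diff, e.g. "@@ -10,3 +10,3 @@"
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def deleted_lines(diff_text):
    """Map each pre-fix file path to the old-side line numbers the diff deletes.

    Hypothetical helper: these are the lines a blame-based SZZ variant
    would hand to `git blame` on the parent of the fix commit.
    """
    result = {}
    path = None
    old_no = 0                 # current line number in the pre-fix file
    old_left = new_left = 0    # lines still expected in the current hunk
    for line in diff_text.splitlines():
        if old_left > 0 or new_left > 0:       # inside a hunk body
            if line.startswith("-"):
                if path is not None:
                    result.setdefault(path, []).append(old_no)
                old_no += 1
                old_left -= 1
            elif line.startswith("+"):
                new_left -= 1
            elif line.startswith("\\"):        # "\ No newline at end of file"
                continue
            else:                              # context line, present on both sides
                old_no += 1
                old_left -= 1
                new_left -= 1
        elif line.startswith("--- "):
            target = line[4:].strip()
            path = None if target == "/dev/null" else target.removeprefix("a/")
        elif (m := HUNK_RE.match(line)):
            old_no = int(m.group(1))
            old_left = int(m.group(2) or 1)    # an omitted count means 1
            new_left = int(m.group(4) or 1)
    return result


# Hypothetical fix that modifies an existing line: blame has a starting point.
FIX_WITH_DELETION = "\n".join([
    "--- a/net/core.c",
    "+++ b/net/core.c",
    "@@ -10,3 +10,3 @@",
    " int n = len;",
    "-if (n > MAX)",
    "+if (n >= MAX)",
    "     return -1;",
])

# Hypothetical fix that only adds a missing check: nothing to blame.
ADD_ONLY_FIX = "\n".join([
    "--- a/net/core.c",
    "+++ b/net/core.c",
    "@@ -20,2 +20,3 @@",
    " lock(&m);",
    "+if (!ptr) return;",
    " use(ptr);",
])

print(deleted_lines(FIX_WITH_DELETION))
print(deleted_lines(ADD_ONLY_FIX))
```

For the first diff, a blame-based pipeline would go on to run something like `git blame -L 11,11 <fix>^ -- net/core.c` on the parent of the fix commit. For the second diff the mapping is empty, so there is no line to blame at all. A cross-file case fails for a related reason: even when deleted lines exist, blame only follows the history of the file the fix touched, so a bug introduced by a commit that changed a different file is out of reach.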
