RepoAudit: An Autonomous LLM-Agent for Repository-Level Code Auditing
Jinyao Guo, Chengpeng Wang, Xiangzhe Xu, Zian Su, Xiangyu Zhang

TL;DR
RepoAudit is an autonomous LLM-based agent for repository-level code auditing. It analyzes data-flow facts within individual functions, uses a validator to reduce hallucinations, and identifies bugs with high precision in real-world projects.
Contribution
This work introduces RepoAudit, the first autonomous LLM agent for repository-level code auditing that combines data-flow analysis, hallucination mitigation, and scalable bug detection.
Findings
Detected 40 true bugs with 78.43% precision in benchmark projects.
Identified 185 new bugs in high-profile repositories, with 174 confirmed or fixed.
Audited efficiently, requiring on average only 0.44 hours and $2.54 per project.
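As a quick sanity check on the figures above, the short sketch below works out what the reported numbers imply. The total report count (51) is inferred here from 40 true bugs and 78.43% precision; it is not stated in this summary, so treat it as an assumption.

```python
# Figures taken from the Findings list above.
true_bugs = 40
precision = 0.7843          # 78.43%
new_bugs = 185
confirmed_or_fixed = 174

# precision = true_bugs / total_reports, so total_reports = true_bugs / precision.
# The result is an inference, not a number quoted by the summary.
implied_reports = round(true_bugs / precision)
implied_false_positives = implied_reports - true_bugs

# Recompute the precision from the inferred report count as a consistency check.
recomputed_precision_pct = round(100 * true_bugs / implied_reports, 2)

# Share of the newly found bugs that were confirmed or fixed.
confirmed_pct = round(100 * confirmed_or_fixed / new_bugs, 2)

print(implied_reports, implied_false_positives, recomputed_precision_pct, confirmed_pct)
```

Under that assumption, the precision figure is consistent with a whole number of reports, and roughly 94% of the newly reported bugs were confirmed or fixed.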
Abstract
Code auditing is the process of reviewing code with the aim of identifying bugs. Large Language Models (LLMs) have demonstrated promising capabilities for this task without requiring compilation, while also supporting user-friendly customization. However, auditing a code repository with LLMs poses significant challenges: limited context windows and hallucinations can degrade the quality of bug reports, and analyzing large-scale repositories incurs substantial time and token costs, hindering efficiency and scalability. This work introduces an LLM-based agent, RepoAudit, designed to perform autonomous repository-level code auditing. Equipped with agent memory, RepoAudit explores the codebase on demand by analyzing data-flow facts along feasible program paths within individual functions. It further incorporates a validator module to mitigate hallucinations by verifying data-flow facts…
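The workflow described in the abstract (on-demand exploration, agent memory, and a validator) can be illustrated with a toy sketch. This is not RepoAudit's implementation: the `DataFlowFact` and `Agent` classes, the `repo` dictionary, and all function names are hypothetical, and a plain dictionary lookup stands in for the LLM query that a real agent would issue per function.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataFlowFact:
    """Hypothetical per-function summary of where the tracked value goes."""
    reaches_sink: bool        # value hits a dangerous use inside this function
    passes_to: tuple = ()     # callees that receive the value


@dataclass
class Agent:
    repo: dict                                   # function name -> DataFlowFact (stand-in for LLM analysis)
    memory: dict = field(default_factory=dict)   # agent memory: cached per-function facts
    queries: int = 0                             # number of simulated LLM queries

    def analyze(self, func):
        """Analyze one function, reusing the cached fact when available."""
        if func not in self.memory:
            self.queries += 1
            self.memory[func] = self.repo[func]  # a real agent would prompt an LLM here
        return self.memory[func]

    def validate(self, path):
        """Toy validator: re-check every hop on the path and the final sink."""
        hops_ok = all(b in self.repo[a].passes_to for a, b in zip(path, path[1:]))
        return hops_ok and self.repo[path[-1]].reaches_sink

    def audit(self, entry):
        """Follow the value from `entry` on demand; return validated paths to a sink."""
        reports, stack = [], [(entry,)]
        while stack:
            path = stack.pop()
            fact = self.analyze(path[-1])
            if fact.reaches_sink and self.validate(path):
                reports.append(path)
            for callee in fact.passes_to:
                if callee not in path:           # avoid looping on recursion
                    stack.append(path + (callee,))
        return reports


# Hypothetical five-function repository; "unused" is never reached from "main".
repo = {
    "main": DataFlowFact(False, ("parse", "log")),
    "parse": DataFlowFact(False, ("deref",)),
    "log": DataFlowFact(False, ("deref",)),
    "deref": DataFlowFact(True),
    "unused": DataFlowFact(True),
}
agent = Agent(repo)
reports = agent.audit("main")
print(sorted(reports), agent.queries)
```

Two points of the design show up even in this toy: `deref` is reached along two paths but analyzed only once thanks to the memory, and `unused` is never analyzed at all because exploration is driven by where the value actually flows.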
Taxonomy
Topics: Digital Rights Management and Security · Service-Oriented Architecture and Web Services · Advanced Computational Techniques and Applications
