BugGen: A Self-Correcting Multi-Agent LLM Pipeline for Realistic RTL Bug Synthesis

Surya Jasper; Minh Luu; Evan Pan; Aakash Tyagi; Michael Quinn; Jiang Hu; David Kebo Houngninou

arXiv:2506.10501 · cs.SE · June 19, 2025

TL;DR

BugGen is an autonomous multi-agent pipeline that uses LLMs to generate, insert, and validate realistic RTL bugs, significantly improving dataset diversity, bug detection, and verification efficiency for hardware design validation.

Contribution

The paper introduces BugGen, which the authors describe as the first fully autonomous multi-agent system that uses LLMs for systematic RTL bug generation, validation, and dataset creation.

Findings

01. Generated 500 unique bugs with 94% functional accuracy.

02. Achieved a throughput of 17.7 bugs per hour, over five times faster than manual methods.

03. Improved ML failure triage accuracy using BugGen-generated datasets.
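
To put the headline figures in perspective, the short calculation below derives a few implied quantities. It is only a back-of-the-envelope sketch: it assumes that the 94% figure is a fraction of the 500 generated bugs, that the 17.7 bugs per hour rate held across the whole run, and that "over five times faster" means the manual rate is below one fifth of BugGen's. None of these interpretations is confirmed by the summary above.

```python
# Figures quoted in the Findings list.
BUGS = 500            # unique bugs generated
ACCURACY = 0.94       # reported functional accuracy
THROUGHPUT = 17.7     # bugs per hour
SPEEDUP_FLOOR = 5     # "over five times faster than manual methods"

# Assumption: accuracy is the fraction of the 500 bugs that are functional.
functional_bugs = round(BUGS * ACCURACY)

# Assumption: the quoted throughput is constant over the whole run.
hours_for_all = BUGS / THROUGHPUT

# Assumption: a speedup above 5x implies manual throughput below this ceiling.
manual_rate_ceiling = THROUGHPUT / SPEEDUP_FLOOR

print(f"functional bugs (implied): {functional_bugs}")
print(f"hours to produce {BUGS} bugs (implied): {hours_for_all:.1f}")
print(f"manual rate ceiling (implied): {manual_rate_ceiling:.2f} bugs/hour")
```

Under these assumptions the run corresponds to roughly 470 functional bugs in a little over 28 hours, against a manual rate of fewer than about 3.5 bugs per hour.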

Abstract

Hardware complexity continues to strain verification resources, motivating the adoption of machine learning (ML) methods to improve debug efficiency. However, ML-assisted debugging critically depends on diverse and scalable bug datasets, which existing manual or automated bug insertion methods fail to reliably produce. We introduce BugGen, a first-of-its-kind, fully autonomous, multi-agent pipeline leveraging Large Language Models (LLMs) to systematically generate, insert, and validate realistic functional bugs in RTL. BugGen partitions modules, selects mutation targets via a closed-loop agentic architecture, and employs iterative refinement and rollback mechanisms to ensure syntactic correctness and functional detectability. Evaluated across five OpenTitan IP blocks, BugGen produced 500 unique bugs with 94% functional accuracy and achieved a throughput of 17.7 validated bugs per…
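
The abstract names the main stages (partitioning, target selection, iterative refinement, rollback) without giving implementation detail, so the following is only a minimal sketch of what such a closed loop could look like, not the authors' code. Every name in it is hypothetical: `propose` stands in for an LLM agent, `syntax_ok` for a compile or lint step, and `detected` for a regression run that should fail once a bug is present. The toy checkers at the bottom exist purely to make the sketch runnable.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Bug:
    region: int           # index of the partition that was mutated
    mutated: str          # replacement text for that partition
    last_rejection: str   # feedback from the final rejected try ("" if none)


def synthesize_bugs(
    partitions: List[str],
    propose: Callable[[str, str], str],       # hypothetical LLM stand-in
    syntax_ok: Callable[[List[str]], bool],   # hypothetical compile check
    detected: Callable[[List[str]], bool],    # hypothetical failing-test check
    max_retries: int = 3,
) -> List[Bug]:
    """Try one mutation per partition, refining on feedback."""
    accepted: List[Bug] = []
    seen = set()
    for idx, original in enumerate(partitions):
        feedback = ""
        for _ in range(max_retries):
            candidate = propose(original, feedback)
            # Build a trial copy; the golden partitions are never modified,
            # so "rollback" is simply discarding a rejected trial.
            trial = partitions[:idx] + [candidate] + partitions[idx + 1:]
            if candidate == original or candidate in seen:
                feedback = "duplicate or no-op mutation"
            elif not syntax_ok(trial):
                feedback = "syntax error"
            elif not detected(trial):
                feedback = "tests still pass; bug not detectable"
            else:
                seen.add(candidate)
                accepted.append(Bug(idx, candidate, feedback))
                break
    return accepted


# Toy stand-ins, for illustration only.
GOLDEN = ["assign y = a & b;", "assign z = a + b;"]


def toy_propose(text: str, feedback: str) -> str:
    if feedback == "":
        return text.replace(";", "")  # first try: drops the semicolon
    return text.replace("&", "|").replace("+", "-")  # refined try


def toy_syntax_ok(parts: List[str]) -> bool:
    return all(p.endswith(";") for p in parts)


def toy_detected(parts: List[str]) -> bool:
    return parts != GOLDEN


bugs = synthesize_bugs(GOLDEN, toy_propose, toy_syntax_ok, toy_detected)
for bug in bugs:
    print(bug.region, bug.mutated)
```

In this toy run the first proposal for each partition is rejected for a syntax error and the refined proposal is accepted, which mirrors the refine-then-retry behavior the abstract describes. Inserting each bug independently into an unmodified golden design is a simplifying choice of this sketch.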

Taxonomy

Topics: Software Testing and Debugging Techniques · Machine Learning and Data Classification · Software Engineering Research