Exploring the Agentic Frontier of Verilog Code Generation
Patrick Yubeaton, Siddharth Garg, Chinmay Hegde

TL;DR
This paper systematically evaluates how agentic frameworks influence Verilog code generation by large language models, revealing that structured harnesses can improve performance over naive approaches.
Contribution
It introduces open-source hardware design agent harnesses and provides a comprehensive analysis of agentic LLMs for Verilog generation using the CVDP benchmark.
Findings
Naive agentic wrapping can degrade performance compared to optimized prompts.
Structured harnesses can match or surpass non-agentic baselines.
Open-source models have higher crash rates and weaker tool output interpretation.
Abstract
Large language models (LLMs) have made rapid advancements in code generation for popular languages such as Python and C++. Many of these recent gains can be attributed to the use of "agents" that wrap domain-relevant tools alongside LLMs. Hardware design languages such as Verilog have also seen improved code generation in recent years, but the impact of agentic frameworks on Verilog code generation tasks remains unclear. In this work, we present the first systematic evaluation of agentic LLMs for Verilog generation, using the recently introduced CVDP benchmark. We also introduce several open-source hardware design agent harnesses, providing a model-agnostic baseline for future work. Through controlled experiments across frontier models, we study how structured prompting and tool design affect performance, analyze agent failure modes and tool usage patterns, compare open-source and…
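As a rough illustration of what an agent that "wraps domain-relevant tools alongside LLMs" can look like, the sketch below shows a minimal generate, check, and retry loop in Python. It is not the harness introduced in the paper. The names `run_agent`, `ToolResult`, `toy_lint`, and `stub_model` are hypothetical, the model is a hard-coded stub, and the "tool" is a toy string check standing in for a real Verilog linter or simulator.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class ToolResult:
    ok: bool   # did the tool accept the candidate?
    log: str   # tool output fed back to the model on failure


# Hypothetical signatures, chosen only for this sketch.
Propose = Callable[[str, List[str]], str]   # (task, feedback history) -> Verilog source
Tool = Callable[[str], ToolResult]          # Verilog source -> ToolResult


def run_agent(propose: Propose, tools: Dict[str, Tool], task: str,
              max_steps: int = 4) -> Tuple[str, bool, int]:
    """Ask for a candidate, run every tool, feed failures back, repeat."""
    history: List[str] = []
    source = ""
    for step in range(1, max_steps + 1):
        source = propose(task, history)
        results = {name: tool(source) for name, tool in tools.items()}
        if all(r.ok for r in results.values()):
            return source, True, step
        # Only failing tool logs are appended as feedback for the next attempt.
        history.extend(f"{name}: {r.log}" for name, r in results.items() if not r.ok)
    return source, False, max_steps


def toy_lint(source: str) -> ToolResult:
    """Toy stand-in for a real linter: only checks that the module is closed."""
    if "endmodule" not in source:
        return ToolResult(False, "missing 'endmodule'")
    return ToolResult(True, "ok")


def stub_model(task: str, history: List[str]) -> str:
    """Hard-coded stand-in for an LLM: fixes its output once it sees feedback."""
    body = "module and2(input a, input b, output y);\n  assign y = a & b;\n"
    return body + ("endmodule\n" if history else "")


if __name__ == "__main__":
    source, passed, steps = run_agent(stub_model, {"lint": toy_lint}, "2-input AND gate")
    print(f"passed={passed} after {steps} step(s)")
    print(source)
```

In a real harness the stub model would be replaced by an LLM call and the toy check by actual compile and simulation tools. How that tool output is structured and presented to the model is the kind of design choice the abstract says the paper studies.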
