VERIRL: Boosting the LLM-based Verilog Code Generation via Reinforcement Learning

Fu Teng; Miao Pan; Xuhong Zhang; Zhezhi He; Yiyao Yang; Xinyi Chai; Mengnan Qi; Liqiang Lu; Jianwei Yin

arXiv:2508.18462 · cs.LG · August 27, 2025

TL;DR

This paper introduces VERIRL, a reinforcement learning framework for Verilog code generation. It combines a new dataset, Veribench-53K, with a Trace-back based Rescore mechanism that refines sparse, noisy reward signals, and it reports state-of-the-art results on Verilog code generation.

Contribution

The paper presents an RL approach tailored to Verilog, built from three parts: a new dataset (Veribench-53K), a Trace-back based Rescore mechanism, and a sample-balanced weighting strategy.

Findings

01 Achieved state-of-the-art performance in Verilog code generation

02 Demonstrated improved test pass rate and functional correctness

03 Outperformed existing methods like CraftRTL and DeepSeek-style approaches

Abstract

Recent advancements in code generation have shown remarkable success across software domains, yet hardware description languages (HDLs) such as Verilog remain underexplored due to their concurrency semantics, syntactic rigidity, and simulation complexity. In this work, we address these challenges by introducing a reinforcement learning (RL) framework tailored for Verilog code generation. We first construct Veribench-53K, a high-quality dataset curated from over 700K Verilog problems, enriched with structured prompts, complexity labels, and diverse testbenches. To tackle the problem of sparse and noisy reward signals, we propose a Trace-back based Rescore mechanism that leverages reasoning paths and iterative refinement to enhance feedback reliability and support reward model training. Furthermore, to mitigate catastrophic forgetting and overfitting during RL fine-tuning, we introduce a…
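
To make the "sparse reward" problem mentioned in the abstract concrete, the following minimal Python sketch contrasts an all-or-nothing reward with a pass-rate reward computed from testbench outcomes. It is an illustration only and is not the paper's Trace-back based Rescore mechanism: the SimResult type, both reward functions, and the -1.0 compile penalty are hypothetical names and values assumed here for the example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class SimResult:
    """Outcome of simulating one generated design (hypothetical type)."""
    compiled: bool
    passed: Sequence[bool]  # one entry per testbench that was run


def binary_reward(result: SimResult) -> float:
    """Sparse reward: 1.0 only if the design compiles and every testbench passes."""
    ok = result.compiled and len(result.passed) > 0 and all(result.passed)
    return 1.0 if ok else 0.0


def pass_rate_reward(result: SimResult, compile_penalty: float = -1.0) -> float:
    """Denser reward: fraction of testbenches passed.

    The compile_penalty value is an assumption chosen for illustration.
    """
    if not result.compiled:
        return compile_penalty
    if not result.passed:
        return 0.0
    return sum(result.passed) / len(result.passed)


# Three hypothetical candidates: a compile failure, a partial pass, a full pass.
candidates = [
    SimResult(compiled=False, passed=[]),
    SimResult(compiled=True, passed=[True, False, False, True]),
    SimResult(compiled=True, passed=[True, True, True, True]),
]

binary = [binary_reward(c) for c in candidates]
dense = [pass_rate_reward(c) for c in candidates]
print(binary)
print(dense)
```

Under the binary reward, the compile failure and the partially correct design receive the same score, so the policy gets no signal separating them. The pass-rate variant does separate them, although it still depends on how reliable and diverse the testbenches are, which is the kind of noise the abstract says the rescoring step is meant to address.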
