QiMeng-CodeV-R1: Reasoning-Enhanced Verilog Generation
Yaoyu Zhu, Di Huang, Hanqi Lyu, Xiaoyun Zhang, Chongxiao Li, Wenxuan Shi, Yutong Wu, Jianan Mu, Jinghua Wang, Yang Zhao, Pengwei Jin, Shuyao Cheng, Shengwen Liang, Xishan Zhang, Rui Zhang, Zidong Du, Qi Guo, Xing Hu, Yunji Chen

TL;DR
This paper introduces CodeV-R1, a reinforcement learning with verifiable reward (RLVR) framework for training large language models to generate Verilog from natural-language specifications. It addresses the lack of automated verification environments, the scarcity of high-quality NL-code pairs, and the computation cost of RLVR, and reports state-of-the-art results on VerilogEval v2 and RTLLM.
Contribution
The paper presents an RLVR framework that combines a rule-based testbench generator for equivalence checking against golden references, a round-trip data synthesis method, and a two-stage training pipeline, improving Verilog generation accuracy.
Findings
Achieves 68.6% pass@1 on VerilogEval v2
Surpasses prior state-of-the-art by 12-20%
Outperforms larger models like DeepSeek-R1 on RTLLM
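For context on the metric: pass@1 figures on code-generation benchmarks are commonly computed with the unbiased pass@k estimator, which draws n samples per problem, counts the c samples that pass the tests, and averages over problems. The sketch below illustrates that estimator. The function names and the sample counts are assumptions for illustration, and the paper's exact evaluation settings are not stated in this summary.

```python
from math import comb
from typing import Iterable, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: n samples drawn, c of them correct."""
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # If fewer than k samples are wrong, every size-k subset contains a pass.
    if n - c < k:
        return 1.0
    # 1 - P(all k drawn samples are incorrect)
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: Iterable[Tuple[int, int]], k: int) -> float:
    """Average pass@k over problems; results holds (n, c) pairs (hypothetical layout)."""
    scores = [pass_at_k(n, c, k) for n, c in results]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Illustrative counts only, not data from the paper.
    print(benchmark_pass_at_k([(20, 14), (20, 0), (20, 20)], k=1))
```

For k = 1 the estimator reduces to c / n per problem, so a benchmark-level pass@1 is the mean fraction of passing samples.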
Abstract
Large language models (LLMs) trained via reinforcement learning with verifiable reward (RLVR) have achieved breakthroughs on tasks with explicit, automatable verification, such as software programming and mathematical problems. Extending RLVR to electronic design automation (EDA), especially automatically generating hardware description languages (HDLs) like Verilog from natural-language (NL) specifications, however, poses three key challenges: the lack of automated and accurate verification environments, the scarcity of high-quality NL-code pairs, and the prohibitive computation cost of RLVR. To this end, we introduce CodeV-R1, an RLVR framework for training Verilog generation LLMs. First, we develop a rule-based testbench generator that performs robust equivalence checking against golden references. Second, we propose a round-trip data synthesis method that pairs open-source Verilog…
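The idea of a verifiable reward based on equivalence checking against a golden reference can be sketched in a few lines. This is only a Python analogy under strong assumptions: designs are modeled as pure functions over integer-valued ports, stimuli are random, and the reward is binary. The names `golden_adder`, `buggy_adder`, `equivalent`, and `reward` are hypothetical. The paper's actual testbench generator operates on Verilog, and its rules are not described in the truncated abstract above.

```python
import random
from typing import Callable, Dict

# A "design" here is a pure function from input port values to output port values.
Design = Callable[[Dict[str, int]], Dict[str, int]]


def golden_adder(inputs: Dict[str, int]) -> Dict[str, int]:
    """Reference 4-bit adder with carry-out (stand-in for a golden module)."""
    total = inputs["a"] + inputs["b"]
    return {"sum": total & 0xF, "carry": (total >> 4) & 1}


def buggy_adder(inputs: Dict[str, int]) -> Dict[str, int]:
    """Candidate that forgets the carry-out (stand-in for a model output)."""
    return {"sum": (inputs["a"] + inputs["b"]) & 0xF, "carry": 0}


def equivalent(golden: Design, candidate: Design,
               input_widths: Dict[str, int],
               trials: int = 1000, seed: int = 0) -> bool:
    """Drive both designs with the same random stimuli and compare all outputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        stimulus = {name: rng.getrandbits(width)
                    for name, width in input_widths.items()}
        if golden(stimulus) != candidate(stimulus):
            return False  # a single mismatch disproves equivalence
    return True  # no mismatch observed (evidence, not a formal proof)


def reward(golden: Design, candidate: Design,
           input_widths: Dict[str, int]) -> float:
    """Binary reward: 1.0 if no mismatch was found, else 0.0 (assumed scheme)."""
    return 1.0 if equivalent(golden, candidate, input_widths) else 0.0


if __name__ == "__main__":
    widths = {"a": 4, "b": 4}
    print(reward(golden_adder, golden_adder, widths))
    print(reward(golden_adder, buggy_adder, widths))
```

Random simulation can only fail to find a difference, so a passing result is evidence of equivalence rather than a guarantee. Sequential circuits would additionally need clock and reset handling, which this sketch omits.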
Taxonomy
Topics: Natural Language Processing Techniques · Software Engineering Research
Methods: Dialogue-Adaptive Pre-training Objective
