TL;DR
ChipSeek introduces a reinforcement learning framework that guides large language models to generate RTL code optimized for both correctness and hardware efficiency, leveraging EDA feedback.
Contribution
It presents a hierarchical reward-based RL approach with curriculum-guided optimization to improve RTL generation quality and PPA metrics.
Findings
Achieves state-of-the-art functional correctness and PPA performance.
Excels at fine-grained optimization of individual metrics such as power, delay, and area.
Outperforms existing methods on standard benchmarks.
Abstract
Large Language Models (LLMs) have emerged as powerful tools for automating Register-Transfer Level (RTL) code generation, yet they face critical limitations: existing approaches typically fail to simultaneously optimize functional correctness and hardware efficiency metrics such as Power, Performance, and Area (PPA). Methods relying on supervised fine-tuning commonly produce functionally correct but suboptimal designs due to the lack of inherent mechanisms for learning hardware optimization principles. Conversely, external post-processing techniques aiming to refine PPA performance after generation often suffer from inefficiency and do not improve the LLMs' intrinsic capabilities. To overcome these challenges, we propose ChipSeek, a novel hierarchical reward-based reinforcement learning framework designed to encourage LLMs to generate RTL code that is both functionally correct and optimized…
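To make the idea of a hierarchical reward concrete, here is a minimal sketch of one way such a reward could be gated. It is not the paper's actual formulation, which the truncated abstract does not give. Every name here (`EdaFeedback`, `hierarchical_reward`, the reference PPA values) and every constant (the tier boundaries, the cap of 2.0 on improvement ratios) is a hypothetical choice for illustration. The assumed structure is that a design must first compile, then pass its functional tests, and only then earns a bonus for beating reference PPA numbers.

```python
from dataclasses import dataclass


@dataclass
class EdaFeedback:
    """Hypothetical summary of what an EDA flow might report for one RTL sample."""
    compiles: bool
    tests_passed: int
    tests_total: int
    power: float = 0.0   # lower is better
    delay: float = 0.0   # lower is better
    area: float = 0.0    # lower is better


def hierarchical_reward(fb: EdaFeedback, ref_power: float,
                        ref_delay: float, ref_area: float) -> float:
    """Tiered reward (illustrative assumption, not the published method).

    Tier 0: code that does not compile gets a fixed penalty.
    Tier 1: partially correct code gets partial credit, strictly below 0.5.
    Tier 2: fully correct code gets 1.0 plus a PPA bonus in [0, 1].
    """
    if not fb.compiles:
        return -1.0

    pass_rate = fb.tests_passed / fb.tests_total if fb.tests_total else 0.0
    if pass_rate < 1.0:
        return 0.5 * pass_rate

    # PPA is only meaningful (and only rewarded) for functionally correct designs.
    if min(fb.power, fb.delay, fb.area) <= 0.0:
        return 1.0

    # Improvement ratio per metric: >1 means better than the reference.
    ratios = [ref_power / fb.power, ref_delay / fb.delay, ref_area / fb.area]
    mean_ratio = sum(min(r, 2.0) for r in ratios) / len(ratios)
    bonus = max(0.0, mean_ratio - 1.0)
    return 1.0 + bonus


if __name__ == "__main__":
    sample = EdaFeedback(compiles=True, tests_passed=4, tests_total=4,
                         power=50.0, delay=10.0, area=200.0)
    print(hierarchical_reward(sample, ref_power=100.0, ref_delay=10.0, ref_area=200.0))
```

Because the tiers do not overlap, a policy in this sketch can never gain more from shaving area on a broken design than from fixing its functionality, which is the property the gating is meant to illustrate.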
