Assessing Large Language Models in Generating RTL Design Specifications
Hung-Ming Huang, Yu-Hsin Yang, Fu-Chieh Chang, Yun-Chia Hsu, Yin-Yu Lin, Ming-Fang Tsai, Chun-Chih Yang, Pei-Yuan Wu

TL;DR
This paper explores how large language models can be used to automate the generation of RTL design specifications, proposing new evaluation metrics and benchmarking various models to improve IC design workflows.
Contribution
It studies how prompting strategies affect the quality of RTL specifications generated by LLMs and introduces metrics for evaluating them, filling a gap in automated IC design documentation.
Findings
Prompting strategies significantly impact specification quality
New metrics enable reliable evaluation of generated specs
Benchmark results highlight strengths and weaknesses of different LLMs
Abstract
As IC design grows more complex, automating comprehension and documentation of RTL code has become increasingly important. Engineers must currently interpret existing RTL code and write specifications by hand, a slow and error-prone process. Although LLMs have been studied for generating RTL from specifications, automated specification generation remains underexplored, largely due to the lack of reliable evaluation methods. To address this gap, we investigate how prompting strategies affect RTL-to-specification quality and introduce metrics for faithfully evaluating generated specs. We also benchmark open-source and commercial LLMs, providing a foundation for more automated and efficient specification workflows in IC design.
Taxonomy
Topics: Embedded Systems Design Techniques · Formal Methods in Verification · Parallel Computing and Optimization Techniques
