NotSoTiny: A Large, Living Benchmark for RTL Code Generation
Razine Moundir Ghorab, Emanuele Parisi, Cristian Gutierrez, Miquel Alberti-Binimelis, Miquel Moreto, Dario Garcia-Gasulla, Gokcen Kestor

TL;DR
NotSoTiny is a benchmark for evaluating how well large language models generate realistic, complex RTL code. It addresses three limitations of earlier benchmarks: small scale, low design complexity, and data contamination.
Contribution
This paper introduces NotSoTiny, a large, evolving RTL benchmark built from real hardware designs to better evaluate LLM capabilities in realistic hardware code generation.
Findings
NotSoTiny tasks are more challenging than previous benchmarks.
The benchmark effectively pushes LLMs to handle complex, real-world RTL designs.
Periodic updates help mitigate data contamination issues.
Abstract
LLMs have shown early promise in generating RTL code, yet evaluating their capabilities in realistic setups remains a challenge. So far, RTL benchmarks have been limited in scale, skewed toward trivial designs, weak in verification rigor, and vulnerable to data contamination. To overcome these limitations and to push the field forward, this paper introduces NotSoTiny, a benchmark that assesses LLMs on the generation of structurally rich and context-aware RTL. NotSoTiny is built from hundreds of actual hardware designs produced by the Tiny Tapeout community. Our automated pipeline removes duplicates, verifies correctness, and periodically incorporates new designs to mitigate contamination, matching the Tiny Tapeout release schedule. Evaluation results show that NotSoTiny tasks are more challenging than prior benchmarks, emphasizing its effectiveness in overcoming current limitations of…
Taxonomy
Topics: Embedded Systems Design Techniques · Formal Methods in Verification · Parallel Computing and Optimization Techniques
