Wrong Code, Right Structure: Learning Netlist Representations from Imperfect LLM-Generated RTL

Siyang Cai; Cangyuan Li; Yinhe Han; Ying Wang

arXiv:2603.09161·cs.LG·March 11, 2026

Open Access

TL;DR

This paper introduces a cost-effective framework that leverages imperfect LLM-generated RTL to learn netlist representations, enabling scalable circuit analysis despite limited high-quality labeled data.

Contribution

The paper presents a data augmentation approach that trains netlist models on imperfect LLM-generated RTL, addressing the scarcity of labeled data in circuit representation learning.

Findings

01 Models trained on synthetic data perform well on real netlists.

02 The approach scales from operator-level to IP-level tasks.

03 Synthetic data can match or outperform high-quality labeled data.

Abstract

Learning effective netlist representations is fundamentally constrained by the scarcity of labeled datasets, as real designs are protected by Intellectual Property (IP) and costly to annotate. Existing work therefore focuses on small-scale circuits with clean labels, limiting scalability to realistic designs. Meanwhile, Large Language Models (LLMs) can generate Register-Transfer-Level (RTL) at scale, but their functional incorrectness has hindered their use in circuit analysis. In this work, we make a key observation: even when LLM-Generated RTL is functionally imperfect, the synthesized netlists still preserve structural patterns that are strongly indicative of the intended functionality. Building on this insight, we propose a cost-effective data augmentation and training framework that systematically exploits imperfect LLM-Generated RTL as training data for netlist representation…
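
The key observation in the abstract can be illustrated with a toy sketch. It is not the paper's method: the three hand-written netlists, the gate-type histogram used as a "structural" feature, and the cosine comparison are all assumptions made for illustration, since the excerpt does not describe the actual features or models. The sketch builds a correct 1-bit full adder, a functionally wrong variant (one gate replaced), and an unrelated multiplexer, then checks that the wrong adder is still structurally closer to the correct adder than the multiplexer is.

```python
from collections import Counter
from itertools import product
from math import sqrt

# Hypothetical gate library: each gate type maps to a Boolean function.
GATE_FUNCS = {
    "AND": lambda x, y: x & y,
    "OR": lambda x, y: x | y,
    "XOR": lambda x, y: x ^ y,
    "NOT": lambda x: 1 - x,
}

# Toy netlists (assumed for illustration), listed in topological order.
# Each gate is (gate_type, output_net, input_nets).
CORRECT_ADDER = [
    ("XOR", "t1", ("a", "b")),
    ("XOR", "sum", ("t1", "cin")),
    ("AND", "t2", ("a", "b")),
    ("AND", "t3", ("t1", "cin")),
    ("OR", "cout", ("t2", "t3")),
]

# Functionally wrong variant: the sum output uses OR instead of XOR.
BUGGY_ADDER = [
    ("XOR", "t1", ("a", "b")),
    ("OR", "sum", ("t1", "cin")),
    ("AND", "t2", ("a", "b")),
    ("AND", "t3", ("t1", "cin")),
    ("OR", "cout", ("t2", "t3")),
]

# Unrelated design: a 2:1 multiplexer.
MUX = [
    ("NOT", "nsel", ("sel",)),
    ("AND", "t1", ("a", "nsel")),
    ("AND", "t2", ("b", "sel")),
    ("OR", "y", ("t1", "t2")),
]


def simulate(netlist, inputs):
    """Evaluate every net, given 0/1 values for the primary inputs."""
    values = dict(inputs)
    for gate_type, out, ins in netlist:
        values[out] = GATE_FUNCS[gate_type](*(values[n] for n in ins))
    return values


def functionally_equal(n1, n2, input_names, output_names):
    """Exhaustively compare two netlists on all input assignments."""
    for bits in product((0, 1), repeat=len(input_names)):
        assignment = dict(zip(input_names, bits))
        v1, v2 = simulate(n1, assignment), simulate(n2, assignment)
        if any(v1[o] != v2[o] for o in output_names):
            return False
    return True


def gate_histogram(netlist):
    """Crude structural feature: count of each gate type."""
    return Counter(gate_type for gate_type, _, _ in netlist)


def cosine_similarity(h1, h2):
    """Cosine similarity between two gate-type histograms."""
    keys = set(h1) | set(h2)
    dot = sum(h1[k] * h2[k] for k in keys)
    norm1 = sqrt(sum(v * v for v in h1.values()))
    norm2 = sqrt(sum(v * v for v in h2.values()))
    return dot / (norm1 * norm2)


if __name__ == "__main__":
    ref = gate_histogram(CORRECT_ADDER)
    print(functionally_equal(CORRECT_ADDER, BUGGY_ADDER,
                             ["a", "b", "cin"], ["sum", "cout"]))
    print(cosine_similarity(ref, gate_histogram(BUGGY_ADDER)))
    print(cosine_similarity(ref, gate_histogram(MUX)))
```

In this toy setting the buggy adder fails the equivalence check, yet its histogram similarity to the correct adder is 8/9 (about 0.889), against roughly 0.680 for the multiplexer. A learned netlist representation would use far richer structure than gate counts; the histogram is only a stand-in chosen to keep the example short.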

Taxonomy

Topics: Machine Learning in Materials Science · Physical Unclonable Functions (PUFs) and Hardware Security · Advanced Graph Neural Networks