Bridging the Last Mile of Circuit Design: PostEDA-Bench, a Hierarchical Benchmark for PPA Convergence and DRC Fixing
Pengju Liu, Nuo Xu, Jinwei Tang, Yu Cao, Caiwen Ding

TL;DR
This paper introduces PostEDA-Bench, a hierarchical benchmark for evaluating LLM agents on the "last mile" of Electronic Design Automation (EDA): fixing residual Design Rule Check (DRC) violations and converging Power-Performance-Area (PPA) targets, with diverse tasks and toolchains.
Contribution
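It presents a hierarchical benchmark of 145 tasks across four categories (DRC-Essential, DRC-Reasoning, PPA-Mono, and PPA-Multi) with machine-checkable evaluation. This addresses two gaps in existing EDA-LLM benchmarks, which omit DRC fixing and rely on flat hierarchies tied to a single toolchain, and enables detailed performance analysis.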
Findings
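Agents handle synthetic DRC-Essential and single-objective PPA-Mono tasks reasonably well.
Performance drops sharply on the more practical DRC-Reasoning and PPA-Multi tasks, where the best success rates are 36.66% and 20.00%, respectively.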
Vision augmentation improves DRC performance, and trade-off reasoning is a key bottleneck.
Abstract
LLM-based agents are increasingly applied to the "last mile" of Electronic Design Automation (EDA): repairing residual sign-off Design Rule Check (DRC) violations and converging Power-Performance-Area (PPA) targets after tool runs. Existing EDA-LLM benchmarks, however, omit DRC fixing entirely and rely on flat hierarchies tied to a single toolchain. We introduce PostEDA-Bench, a hierarchical benchmark with 145 tasks across DRC-Essential, DRC-Reasoning, PPA-Mono, and PPA-Multi, supported by EDA toolchains with machine-checkable evaluation. Across eight commercial and open-source LLMs under multiple agent scaffolds, we find that agents handle synthetic DRC-Essential and single-objective PPA-Mono reasonably well but degrade sharply on the more practical DRC-Reasoning, where the best success rate is 36.66%, and PPA-Multi, where the best success rate is 20.00%; vision augmentation…
