EDA-Schema-V2: A Multimodal Schema, Open Datasets, and Benchmarks for Machine Learning in Digital Physical Design
Pratik Shrestha, Alec Aversa, Ioannis Savidis

TL;DR
This paper introduces EDA-Schema-V2, an open multimodal schema with accompanying open datasets for ML in digital physical design, enabling standardized benchmarking and reproducible research.
Contribution
It provides a new structured schema, generates extensive open datasets covering multiple design stages, and establishes benchmarks for ML applications in electronic design automation.
Findings
Generated datasets include over 275 million gates and 75 million nets.
Stage-resolved representations enable analysis of predictability across design stages.
Baseline analyses support reproducibility and comparison of ML methods.
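To make the idea of stage-resolved analysis concrete, here is a minimal sketch of how one might compare a metric recorded at each design stage against its value after routing. The stage names follow the stages listed in the abstract; the `StageSnapshot` class, the `relative_gap_to_final` helper, the metric name, and all numbers are hypothetical illustrations, not part of EDA-Schema-V2 or its datasets.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Stage order follows the design-flow stages named in the abstract.
STAGES = [
    "logic_synthesis",
    "floorplanning",
    "placement",
    "clock_network_synthesis",
    "routing",
]


@dataclass
class StageSnapshot:
    """Hypothetical per-stage record; field names are illustrative only."""
    stage: str
    metrics: Dict[str, float] = field(default_factory=dict)


def relative_gap_to_final(snapshots: List[StageSnapshot], metric: str) -> Dict[str, float]:
    """Return |value - final| / |final| per stage, where final is the routing value."""
    by_stage = {s.stage: s.metrics[metric] for s in snapshots if metric in s.metrics}
    final = by_stage[STAGES[-1]]
    if final == 0:
        raise ValueError("final-stage value is zero; relative gap is undefined")
    return {
        stage: abs(by_stage[stage] - final) / abs(final)
        for stage in STAGES
        if stage in by_stage
    }


# Invented numbers, for illustration only.
example = [
    StageSnapshot("floorplanning", {"wirelength": 80.0}),
    StageSnapshot("placement", {"wirelength": 90.0}),
    StageSnapshot("routing", {"wirelength": 100.0}),
]
gaps = relative_gap_to_final(example, "wirelength")
```

A gap that shrinks from stage to stage would suggest the metric becomes easier to predict later in the flow, which is the kind of question a stage-resolved dataset makes it possible to ask.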
Abstract
The continuous scaling of CMOS technology has significantly increased the complexity of very large-scale integrated circuits, driving interest in applying machine learning (ML) to electronic design automation (EDA). However, the scarcity of open and standardized datasets limits interoperability, comparability, and reproducibility in ML-based research. This paper introduces EDA-Schema-V2, an open multimodal schema that provides a structured framework for representing and analyzing datasets in digital physical design. The schema includes representations of physical attributes and quality-of-results metrics across multiple stages of the design flow, including logic synthesis, floorplanning, placement, clock network synthesis, and routing. Utilizing the SkyWater 130nm, Nangate 45nm, IHP SG13G2 130nm, and ASAP 7nm open-source process design kits with the OpenROAD tool flow,…
