OSIRIS: Bridging Analog Circuit Design and Machine Learning with Scalable Dataset Generation
Giuseppe Chiari, Michele Piccoli, Davide Zoni

TL;DR
OSIRIS introduces a scalable dataset generation pipeline for analog IC design, enabling machine learning applications by providing extensive, high-quality data and a baseline RL-based optimization method.
Contribution
The paper presents OSIRIS, a novel framework for generating large-scale, detailed datasets for analog circuit design, facilitating ML research and development in EDA.
Findings
Generated 87,100 circuit variations with OSIRIS
Provided a reinforcement learning baseline for analog design optimization
Enabled systematic exploration of analog design space
Abstract
The automation of analog integrated circuit (IC) design remains a longstanding challenge, primarily due to the intricate interdependencies among physical layout, parasitic effects, and circuit-level performance. These interactions impose complex constraints that are difficult to accurately capture and optimize using conventional design methodologies. Although recent advances in machine learning (ML) have shown promise in automating specific stages of the analog design flow, the development of holistic, end-to-end frameworks that integrate these stages and iteratively refine layouts using post-layout, parasitic-aware performance feedback is still in its early stages. Furthermore, progress in this direction is hindered by the limited availability of open, high-quality datasets tailored to the analog domain, restricting both the benchmarking and the generalizability of ML-based techniques.…
Peer Reviews
Decision: ICLR 2026 Poster
1. Introduces a dataset-generation pipeline for analog layouts and releases an open-source dataset augmented with post-layout simulations, in which every sample is guaranteed to be LVS- and DRC-clean. 2. Efficient design-space exploration. Proposes a reinforcement-learning-driven, iterative variant-generation method that enables performance-aware exploration of the analog layout space.
1. Limited circuit types. The dataset currently covers only amplifier circuits at the 130 nm node. 2. In Table 3, the comparison with MAGICAL and ALIGN is unfair and confusing, since both are analog layout generation tools that perform no design-space exploration. 3. Constrained variant generation and diversity. Variants are created mainly by permuting device fingers and component placement within the halo, which limits structural diversity; some schematics permit fundamentally different layout
- The dataset is substantial, comprising more than 64,200 circuit variations, which could be highly valuable for future research. - Since publicly available back-end analog circuit datasets are rare, this work has the potential to fill an important gap in the field.
- The main issue is that the paper is difficult to read. Given that ICLR is primarily an AI-focused venue, the paper should better explain the fundamental principles of analog back-end design and clearly describe the intended applications of the proposed benchmark. In its current form, it reads more like a technical report than a research paper. - The experimental section is relatively short, even for a benchmark-oriented paper.
1. Systematically generates an end-to-end dataset covering 4 typical amplifier types and 64,000+ samples. All samples pass the design rule check (DRC) and the layout-versus-schematic (LVS) check, ensuring the “industrial-grade reliability” of the data. It is the first publicly available, reproducible, and fully annotated large-scale analog IC layout dataset. 2. The process and methods are highly scalable. The automated processing flow has the potential to handle other types of analog circuits and
1. Circuit and process coverage is insufficient: only 4 amplifier types and a single 130 nm process. 2. The experimental part of the main text does not fully elaborate on the quality comparison of the dataset or on the performance of models built on it, as would be needed to prove the lightweighting. Without verification on specific analog IC design tasks, there are no quantitative results demonstrating that the dataset improves model performance in actual tasks. As a
Taxonomy
Topics: VLSI and FPGA Design Techniques · Low-power high-performance VLSI design · Physical Unclonable Functions (PUFs) and Hardware Security
