Iceberg: Enhancing HLS Modeling with Synthetic Data
Zijian Ding, Tung Nguyen, Weikai Li, Aditya Grover, Yizhou Sun, Jason Cong

TL;DR
Iceberg introduces a synthetic data augmentation method for HLS prediction models, significantly improving their generalization and optimization performance across various hardware design applications.
Contribution
The paper proposes Iceberg, a novel synthetic data augmentation approach that enhances HLS modeling accuracy and adaptability using LLM-generated programs and weak labels.
Findings
86.4% improvement in geometric mean modeling accuracy with few-shot adaptation
2.47x better offline DSE performance on test datasets
Effective generalization to six real-world applications
Abstract
Deep learning-based prediction models for High-Level Synthesis (HLS) of hardware designs often struggle to generalize. In this paper, we study how to close the generalizability gap of these models through pretraining on synthetic data and introduce Iceberg, a synthetic data augmentation approach that expands both large language model (LLM)-generated programs and weak labels of unseen design configurations. Our weak label generation method is integrated with an in-context model architecture, enabling meta-learning from actual and proximate labels. Iceberg improves the geometric mean modeling accuracy by 86.4% when adapting to six real-world applications with few-shot examples and achieves better offline design space exploration (DSE) performance when adapting to two different test datasets, including a 2.47x improvement. Our open-sourced code is here: https://github.com/UCLA-VAST/iceberg
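The abstract reports its headline result as a geometric mean over applications. The sketch below shows one common way to aggregate per-application gains in this manner. It is an illustration only: the abstract does not give the paper's exact metric definition, and the function names and scores here are hypothetical, not taken from the paper or its repository.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive values, computed in log space."""
    if not values:
        raise ValueError("need at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def geomean_improvement(baseline, improved):
    """Relative gain of the geometric mean of per-application ratios.

    Assumes higher scores are better; this is an assumption of the
    sketch, not a statement about the paper's metric.
    """
    if len(baseline) != len(improved):
        raise ValueError("score lists must have the same length")
    ratios = [new / old for old, new in zip(baseline, improved)]
    return geometric_mean(ratios) - 1.0


# Hypothetical per-application accuracy scores (NOT from the paper).
baseline = [0.40, 0.50, 0.25]
improved = [0.80, 0.50, 0.50]
gain = geomean_improvement(baseline, improved)
print(f"geometric mean improvement: {gain:.1%}")
```

Averaging ratios geometrically rather than arithmetically keeps one application with a very large gain from dominating the summary, which is why it is a common choice when results span several benchmarks.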
Taxonomy
Topics: Distributed and Parallel Computing Systems
