ChipNeMo: Domain-Adapted LLMs for Chip Design
Mingjie Liu, Teodor-Dumitru Ene, Robert Kirby, Chris Cheng, Nathaniel Pinckney, Rongjian Liang, Jonah Alben, Himyanshu Anand, Sanmitra Banerjee, Ismet Bayraktaroglu, Bonita Bhaskaran, Bryan Catanzaro, Arjun Chaudhuri, Sharon Clay, Bill Dally, Laura Dang, Parikshit Deshpande

TL;DR
ChipNeMo demonstrates that domain-adaptive training of large language models significantly improves their performance in chip design tasks, surpassing general models like GPT-4 in specific applications.
Contribution
The paper introduces domain adaptation techniques for LLMs tailored to chip design, achieving superior task performance without sacrificing general capabilities.
Findings
ChipNeMo-70B outperforms GPT-4 on the engineering assistant chatbot and EDA script generation tasks.
Domain-adaptive pretraining enhances LLM performance in chip design tasks.
Models maintain generic capabilities despite domain specialization.
Abstract
ChipNeMo aims to explore the applications of large language models (LLMs) for industrial chip design. Instead of directly deploying off-the-shelf commercial or open-source LLMs, we adopt the following domain adaptation techniques: domain-adaptive tokenization, domain-adaptive continued pretraining, model alignment with domain-specific instructions, and domain-adapted retrieval models. We evaluate these methods on three selected LLM applications for chip design: an engineering assistant chatbot, EDA script generation, and bug summarization and analysis. Our evaluations demonstrate that domain-adaptive pretraining of language models can lead to superior performance in domain-related downstream tasks compared to their base LLaMA2 counterparts, without degradations in generic capabilities. In particular, our largest model, ChipNeMo-70B, outperforms the highly capable GPT-4 on two…
Taxonomy
Topics: Digital Rights Management and Security
Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Dropout · Softmax · Dense Connections · Label Smoothing · Adam · Absolute Position Encodings · Residual Connection
