ChipNeMo: Domain-Adapted LLMs for Chip Design

Mingjie Liu; Teodor-Dumitru Ene; Robert Kirby; Chris Cheng; Nathaniel Pinckney; Rongjian Liang; Jonah Alben; Himyanshu Anand; Sanmitra Banerjee; Ismet Bayraktaroglu; Bonita Bhaskaran; Bryan Catanzaro; Arjun Chaudhuri; Sharon Clay; Bill Dally; Laura Dang; Parikshit Deshpande; Siddhanth Dhodhi; Sameer Halepete; Eric Hill; Jiashang Hu; Sumit Jain; Ankit Jindal; Brucek Khailany; George Kokai; Kishor Kunal; Xiaowei Li; Charley Lind; Hao Liu; Stuart Oberman; Sujeet Omar; Ghasem Pasandi; Sreedhar Pratty; Jonathan Raiman; Ambar Sarkar; Zhengjiang Shao; Hanfei Sun; Pratik P Suthar; Varun Tej; Walker Turner; Kaizhe Xu; Haoxing Ren

arXiv:2311.00176 · cs.CL · April 8, 2024 · 23 cites

Open Access

TL;DR

ChipNeMo demonstrates that domain-adaptive training of large language models significantly improves their performance in chip design tasks, surpassing general models like GPT-4 in specific applications.

Contribution

The paper introduces domain adaptation techniques for LLMs tailored to chip design, achieving superior task performance without sacrificing general capabilities.

Findings

01. ChipNeMo-70B outperforms GPT-4 on the engineering assistant chatbot and EDA script generation tasks.

02. Domain-adaptive pretraining improves LLM performance on chip design tasks relative to the base LLaMA2 models.

03. The models retain their generic capabilities despite domain specialization.

Abstract

ChipNeMo aims to explore the applications of large language models (LLMs) for industrial chip design. Instead of directly deploying off-the-shelf commercial or open-source LLMs, we adopt the following domain adaptation techniques: domain-adaptive tokenization, domain-adaptive continued pretraining, model alignment with domain-specific instructions, and domain-adapted retrieval models. We evaluate these methods on three selected LLM applications for chip design: an engineering assistant chatbot, EDA script generation, and bug summarization and analysis. Our evaluations demonstrate that domain-adaptive pretraining of language models can lead to superior performance in domain-related downstream tasks compared to their base LLaMA2 counterparts, without degradations in generic capabilities. In particular, our largest model, ChipNeMo-70B, outperforms the highly capable GPT-4 on two…
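The abstract names domain-adaptive tokenization as a technique but, as excerpted here, does not say how it is done. The following is a minimal sketch of one plausible approach, not the authors' implementation: new domain tokens are appended to a base vocabulary, and each new token's embedding is initialized as the average of the embeddings of the base tokens it previously split into. The toy vocabulary, the two-dimensional embeddings, the greedy longest-match tokenizer (a stand-in for a real subword tokenizer), and the function names `greedy_tokenize` and `extend_vocab` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def greedy_tokenize(text: str, vocab: Dict[str, int]) -> List[str]:
    """Longest-match-first tokenization (toy stand-in for a subword tokenizer)."""
    tokens: List[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab or j == i + 1:
                tokens.append(piece)
                i = j
                break
    return tokens


def extend_vocab(
    base_vocab: Dict[str, int],
    base_emb: Dict[str, List[float]],
    new_tokens: List[str],
) -> Tuple[Dict[str, int], Dict[str, List[float]]]:
    """Append new tokens; initialize each embedding as the mean of its base pieces.

    The averaging rule is an illustrative assumption, not a method confirmed by
    the text above.
    """
    vocab = dict(base_vocab)
    emb = {tok: list(vec) for tok, vec in base_emb.items()}
    dim = len(next(iter(base_emb.values())))
    for tok in new_tokens:
        if tok in vocab:
            continue  # already known, nothing to add
        pieces = greedy_tokenize(tok, base_vocab)
        known = [base_emb[p] for p in pieces if p in base_emb]
        if known:
            init = [sum(v[d] for v in known) / len(known) for d in range(dim)]
        else:
            init = [0.0] * dim  # no known pieces: fall back to zeros
        vocab[tok] = len(vocab)
        emb[tok] = init
    return vocab, emb


# Hypothetical toy data: a tiny base vocabulary and 2-d embeddings.
base_vocab = {"set": 0, "_": 1, "clock": 2, "period": 3}
base_emb = {
    "set": [1.0, 0.0],
    "_": [0.0, 0.0],
    "clock": [0.0, 1.0],
    "period": [1.0, 1.0],
}

before = greedy_tokenize("set_clock_period", base_vocab)
vocab, emb = extend_vocab(base_vocab, base_emb, ["set_clock"])
after = greedy_tokenize("set_clock_period", vocab)
print(len(before), len(after))
```

In this toy setting the same string needs fewer tokens once the domain term is in the vocabulary, which is the general motivation for adapting a tokenizer to domain text. Appending tokens rather than replacing the vocabulary leaves existing token IDs unchanged, so previously trained embeddings remain usable.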

Taxonomy

Topics: Digital Rights Management and Security

Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Dropout · Softmax · Dense Connections · Label Smoothing · Adam · Absolute Position Encodings · Residual Connection