ChipExpert: The Open-Source Integrated-Circuit-Design-Specific Large Language Model
Ning Xu, Zhaoyang Zhang, Lei Qi, Wensuo Wang, Chao Zhang, Zihao Ren, Huaiyuan Zhang, Xin Cheng, Yanqi Zhang, Zhichao Liu, Qingwen Wei, Shiyang Wu, Lanlan Yang, Qianfeng Lu, Yiqun Ma, Mengyao Zhao, Junbo Liu, Yufan Song, Xin Geng, Jun Yang

TL;DR
ChipExpert is an open-source, specialized large language model tailored for integrated circuit design, trained with custom datasets and enhanced with retrieval-augmented generation to improve expertise and reduce hallucinations.
Contribution
The paper introduces ChipExpert, the first open-source LLM designed specifically for IC design, and describes its training process, preference alignment, and a new benchmark for evaluation.
Findings
High performance on IC design question-answering tasks
Effective reduction of hallucinations via retrieval-augmented generation
Demonstrates expertise across multiple IC design sub-domains
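The retrieval-augmented generation idea mentioned above can be illustrated with a minimal sketch. This summary does not describe ChipExpert's actual retrieval pipeline, so everything here is an assumption for illustration: the bag-of-words cosine scoring stands in for whatever retriever the authors use, and the names `retrieve`, `build_prompt`, and `PASSAGES` are hypothetical. The point is only that the model is asked to answer from retrieved reference text rather than from memory alone, which is how retrieval can reduce hallucinations.

```python
import math
import re
from collections import Counter

# Hypothetical mini knowledge base; a real system would index far more text.
PASSAGES = [
    "Setup time is the interval before the clock edge during which data must be stable.",
    "A phase-locked loop aligns the phase of an output clock with a reference clock.",
    "Clock gating reduces dynamic power by disabling the clock to idle logic.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, passages, k=2):
    """Return up to k passages with non-zero similarity, best first."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(p))), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for score, p in scored[:k] if score > 0]


def build_prompt(question, passages, k=2):
    """Assemble a prompt that grounds the answer in retrieved context."""
    context = retrieve(question, passages, k)
    lines = ["Answer using only the context below.", "Context:"]
    lines += [f"- {p}" for p in context]
    lines += [f"Question: {question}", "Answer:"]
    return "\n".join(lines)


if __name__ == "__main__":
    # The resulting string would be passed to the language model.
    print(build_prompt("What is setup time?", PASSAGES))
```

In this sketch the prompt is only printed; in practice it would be sent to the language model, and a production retriever would more likely use dense embeddings than word counts.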
Abstract
The field of integrated circuit (IC) design is highly specialized, presenting significant barriers to entry and research and development challenges. Although large language models (LLMs) have achieved remarkable success in various domains, existing LLMs often fail to meet the specific needs of students, engineers, and researchers. Consequently, the potential of LLMs in the IC design domain remains largely unexplored. To address these issues, we introduce ChipExpert, the first open-source, instructional LLM specifically tailored for the IC design field. ChipExpert is trained on one of the current best open-source base models (Llama-3 8B). The entire training process encompasses several key stages, including data preparation, continued pre-training, instruction-guided supervised fine-tuning, preference alignment, and evaluation. In the data preparation stage, we construct multiple…
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Model-Driven Software Engineering Techniques · Scientific Computing and Data Management
Methods: Balanced Selection
