Customized Retrieval Augmented Generation and Benchmarking for EDA Tool Documentation QA
Yuan Pu, Zhuolun He, Tairu Qiu, Haoyuan Wu, Bei Yu

TL;DR
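This paper introduces a domain-specific retrieval augmented generation (RAG) framework for electronic design automation (EDA) tool documentation QA, together with a new benchmark, ORD-QA, and three domain-specific techniques (a fine-tuned text embedding model, a distilled reranker, and a fine-tuned generator) that improve accuracy over existing methods.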
Contribution
It presents a customized RAG framework with three domain-specific techniques and releases a new benchmark for EDA documentation QA, addressing challenges of general-purpose models in knowledge-intensive domains.
Findings
Superior performance on ORD-QA benchmark
Effective domain-specific text embedding and reranking techniques
Enhanced accuracy on commercial EDA tools
Abstract
Retrieval augmented generation (RAG) enhances the accuracy and reliability of generative AI models by sourcing factual information from external databases, and it is extensively employed in document-grounded question-answering (QA) tasks. Off-the-shelf RAG flows are well pretrained on general-purpose documents, yet they encounter significant challenges when applied to knowledge-intensive vertical domains, such as electronic design automation (EDA). This paper addresses this issue by proposing a customized RAG framework along with three domain-specific techniques for EDA tool documentation QA: a contrastive learning scheme for text embedding model fine-tuning, a reranker distilled from a proprietary LLM, and a generative LLM fine-tuned with a high-quality domain corpus. Furthermore, we have developed and released a documentation QA evaluation benchmark, ORD-QA, for OpenROAD,…
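The retrieve, rerank, and generate stages named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words `embed` function stands in for a fine-tuned embedding model, the token-overlap `rerank` stands in for a distilled reranker, and `build_prompt` only assembles the input a generative LLM would receive. All function names, the sample documentation chunks, and the command names in them are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical documentation chunks; the command names are invented for illustration.
DOCS = [
    "The hypothetical command run_placement places standard cells and accepts a density option.",
    "The hypothetical command run_routing routes nets after placement is complete.",
    "The hypothetical command show_slack prints the worst slack of the design.",
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric/underscore tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())

def embed(text):
    """Stand-in for a text embedding model: a sparse bag-of-words count vector."""
    return Counter(tokenize(text))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, chunks, k=2):
    """First stage: keep the k chunks most similar to the query embedding."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def rerank(query, candidates, k=1):
    """Second stage (stand-in): order candidates by the fraction of query tokens they contain."""
    q_tokens = set(tokenize(query))

    def score(chunk):
        if not q_tokens:
            return 0.0
        return len(q_tokens & set(tokenize(chunk))) / len(q_tokens)

    return sorted(candidates, key=score, reverse=True)[:k]

def build_prompt(query, context_chunks):
    """Assemble the grounded prompt that a generative LLM would answer from."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only the context.\nContext:\n{context}\nQuestion: {query}\nAnswer:"

query = "Which command routes nets?"
retrieved = retrieve(query, DOCS, k=2)
top = rerank(query, retrieved, k=1)
prompt = build_prompt(query, top)
print(prompt)
```

In a real flow each stand-in would be replaced by a trained model; the two-stage structure (a cheap retriever over many chunks, then a more precise reranker over a few candidates) is the part this sketch is meant to show.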
Taxonomy
Topics: Semantic Web and Ontologies · Advanced Database Systems and Queries · Web Data Mining and Analysis
Methods: Attention Is All You Need · Linear Layer · Linear Warmup With Linear Decay · Multi-Head Attention · Weight Decay · Residual Connection · Dropout · WordPiece · Attention Dropout
