Retrieval Augmented Generation using Engineering Design Knowledge
L. Siddharth, Jianxi Luo

TL;DR
This paper introduces a method to extract explicit engineering design facts from patent descriptions, creating a large knowledge base that enhances retrieval-augmented generation for technical design tasks.
Contribution
It builds a dataset of 375,084 examples and fine-tunes language models for relation identification (token classification) and elicitation (sequence-to-sequence), enabling large-scale knowledge base creation to improve design-related language model responses.
Findings
Token classification achieves up to 99.7% accuracy
Knowledge base contains over 2.93 million facts, extracted from 4,870 fan system patents
Explicit facts from the knowledge base guide LLMs in synthesising design knowledge
Abstract
Aiming to support Retrieval Augmented Generation (RAG) in the design process, we present a method to identify explicit engineering design facts, of the form {head entity :: relationship :: tail entity}, from patented artefact descriptions. Given a sentence with a pair of entities (based on noun phrases) marked in a unique manner, our method extracts the relationship that is explicitly communicated in the sentence. For this task, we create a dataset of 375,084 examples and fine-tune language models for relation identification (token classification) and elicitation (sequence-to-sequence). The token classification approach achieves up to 99.7% accuracy. Upon applying the method to a domain of 4,870 fan system patents, we populate a knowledge base of over 2.93 million facts. Using this knowledge base, we demonstrate how Large Language Models (LLMs) are guided by explicit facts to synthesise knowledge…
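The pipeline described in the abstract can be pictured with a small Python sketch. It is illustrative only and is not the authors' implementation: the `[HEAD]`/`[TAIL]` markers, the binary token labels, the substring-based `retrieve` function, and the prompt layout are all assumptions made for this example, and the hand-written labels stand in for the output of a fine-tuned token classifier.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Fact:
    """One explicit design fact: head entity :: relationship :: tail entity."""
    head: str
    relationship: str
    tail: str

    def __str__(self) -> str:
        return f"{self.head} :: {self.relationship} :: {self.tail}"


def mark_entities(sentence: str, head: str, tail: str) -> str:
    """Wrap the first occurrence of each entity in hypothetical marker tokens.

    Assumes the two entities do not overlap in the sentence.
    """
    marked = sentence.replace(head, f"[HEAD]{head}[/HEAD]", 1)
    return marked.replace(tail, f"[TAIL]{tail}[/TAIL]", 1)


def relation_from_labels(tokens: List[str], labels: List[int]) -> str:
    """Join the tokens labelled 1 (assumed to mean 'relationship token')."""
    return " ".join(tok for tok, lab in zip(tokens, labels) if lab == 1)


def retrieve(kb: List[Fact], entity: str) -> List[Fact]:
    """Naive retrieval (an assumption): facts mentioning the entity."""
    key = entity.lower()
    return [f for f in kb if key in f.head.lower() or key in f.tail.lower()]


def build_prompt(question: str, facts: List[Fact]) -> str:
    """Place retrieved facts ahead of the question as context for an LLM."""
    context = "\n".join(str(f) for f in facts)
    return f"Facts:\n{context}\n\nQuestion: {question}"


# Toy sentence; the labels below are written by hand, standing in for
# the predictions of a fine-tuned token classification model.
sentence = "The motor is mounted on the housing."
marked = mark_entities(sentence, "motor", "housing")
tokens = sentence.split()
labels = [0, 0, 0, 1, 1, 0, 0]

fact = Fact("motor", relation_from_labels(tokens, labels), "housing")
kb = [fact, Fact("blade", "attached to", "hub")]
prompt = build_prompt("How is the motor supported?", retrieve(kb, "motor"))
```

Here `fact` holds the triple for the toy sentence and `prompt` carries only the facts that mention the queried entity. A real system would replace the hand-written labels with model predictions and the substring match with a proper retrieval step.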
Taxonomy
Topics: Open Education and E-Learning
Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Attention Dropout · Residual Connection · Adam · Dropout · Layer Normalization · Dense Connections
