RATE: An LLM-Powered Retrieval Augmented Generation Technology-Extraction Pipeline
Karan Mirhosseini, Arya Aftab, Alireza Sheikh

TL;DR
This paper presents RATE, a novel LLM-powered pipeline that combines retrieval augmented generation and validation to extract technology terms from scientific literature with high accuracy, demonstrated on BCI and XR research articles.
Contribution
The paper introduces RATE, a hybrid LLM-based pipeline that significantly improves technology extraction accuracy over existing methods, with broad applicability to scientific literature analysis.
Findings
RATE achieved an F1-score of 91.27%.
Outperformed a BERT-based method, which reached an F1-score of 53.73%.
Mapped extracted technologies into a research landscape network.
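For readers unfamiliar with the metric, the F1-scores above are the harmonic mean of precision and recall. The sketch below shows how such a score could be computed for a term-extraction task, assuming exact string matching between extracted terms and a gold-standard list. The function name and the example terms are illustrative and not taken from the paper, whose matching rules may differ.

```python
def f1_score(predicted, gold):
    """F1 for set-valued extraction, assuming exact string matches (an assumption)."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)  # share of extracted terms that are correct
    recall = true_positives / len(gold)          # share of gold terms that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: 2 of 3 extracted terms are correct, 2 of 4 gold terms are found.
extracted = ["EEG", "eye tracking", "blockchain"]
reference = ["EEG", "eye tracking", "fNIRS", "haptic glove"]
print(round(f1_score(extracted, reference), 4))
```

A method can only score high on F1 when precision and recall are both high.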
Abstract
In an era of radical technology transformations, technology maps play a crucial role in enhancing decision making. These maps rely heavily on automated methods of technology extraction. This paper introduces Retrieval Augmented Technology Extraction (RATE), a Large Language Model (LLM) based pipeline for automated technology extraction from scientific literature. RATE combines Retrieval Augmented Generation (RAG) with multi-definition LLM-based validation. This hybrid method yields high recall in candidate generation alongside high precision in candidate filtering. While the pipeline is designed to be general and widely applicable, we demonstrate its use on 678 research articles focused on Brain-Computer Interfaces (BCIs) and Extended Reality (XR) as a case study. The technology terms validated by RATE were then mapped into a co-occurrence network, revealing thematic…
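The co-occurrence mapping mentioned in the abstract can be illustrated with a minimal sketch. It assumes that two technology terms co-occur when they appear in the same article and that the edge weight is the number of articles in which they do so. The paper may define co-occurrence differently; the function name and the sample terms are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(terms_per_article):
    """Count, for each unordered pair of terms, the articles containing both.

    Assumption: co-occurrence is measured at the article level.
    """
    edges = Counter()
    for terms in terms_per_article:
        # Deduplicate within an article and sort so each pair has one canonical key.
        for a, b in combinations(sorted(set(terms)), 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical validated terms for three articles.
articles = [
    ["EEG", "VR headset"],
    ["EEG", "VR headset", "fNIRS"],
    ["fNIRS"],
]
for pair, weight in cooccurrence_edges(articles).most_common():
    print(pair, weight)
```

The resulting weighted edge list can be loaded into any graph library for clustering or visualization.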
