BiomedRAG: A Retrieval Augmented Large Language Model for Biomedicine
Mingchen Li, Halil Kilicoglu, Hua Xu, Rui Zhang

TL;DR
BiomedRAG introduces a simple retrieval-augmented approach for biomedical NLP tasks, directly inputting retrieved documents into LLMs to improve accuracy and reduce hallucinations across multiple datasets.
Contribution
It proposes a straightforward method for retrieval-augmented LLMs in biomedicine that bypasses complex mechanisms and enables LLM supervision of retrieval models, enhancing performance.
Findings
Achieves superior performance on 5 biomedical NLP tasks.
Outperforms existing triple extraction systems with high micro-F1 scores.
Effectively reduces noise in retrieved documents during tasks.
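For readers unfamiliar with the metric named in the findings, micro-F1 for triple extraction pools true positives, predictions, and gold triples over all examples before computing precision and recall. The sketch below is illustrative only: it assumes exact-match scoring of (head, relation, tail) triples, and the function name `micro_f1` is hypothetical rather than taken from the paper's evaluation code.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail); exact match is an assumption


def micro_f1(gold: List[Set[Triple]], pred: List[Set[Triple]]) -> float:
    """Micro-averaged F1: counts are pooled over all examples first."""
    true_pos = sum(len(g & p) for g, p in zip(gold, pred))
    n_pred = sum(len(p) for p in pred)
    n_gold = sum(len(g) for g in gold)
    if true_pos == 0:
        return 0.0  # also covers empty predictions or empty gold sets
    precision = true_pos / n_pred
    recall = true_pos / n_gold
    return 2 * precision * recall / (precision + recall)
```

Because counts are pooled, examples with many triples weigh more heavily than under macro-averaging.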
Abstract
Large Language Models (LLMs) have swiftly emerged as vital resources for different applications in the biomedical and healthcare domains; however, these models encounter issues such as generating inaccurate information or hallucinations. Retrieval-augmented generation provides a way for these models to update their knowledge and enhance their performance. In contrast to previous retrieval-augmented LMs, which utilize specialized cross-attention mechanisms to help the LLM encode retrieved text, BiomedRAG adopts a simpler approach by directly inputting the retrieved chunk-based documents into the LLM. This straightforward design is easily applicable to existing retrieval and language models, effectively bypassing noisy information in retrieved documents, particularly in noise-intensive tasks. Moreover, we demonstrate the potential for utilizing the LLM to supervise the retrieval model in the…
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Topic Modeling
