Retrieval Augmented Generation Evaluation for Health Documents
Mario Ceresa, Lorenzo Bertolini, Valentin Comte, Nicholas Spadaro, Barbara Raffael, Brigitte Toussaint, Sergio Consoli, Amalia Muñoz Piñeiro, Alex Patak, Maddalena Querci, Tobias Wiesenthal

TL;DR
This paper evaluates Retrieval Augmented Generation (RAG) methods for processing healthcare documents, introducing a new pipeline, evaluation tools, and a benchmark dataset to improve accuracy and trustworthiness in medical information synthesis.
Contribution
It presents RAGEv, a novel pipeline with evaluation tools and a benchmark dataset for assessing RAG approaches in health document analysis, highlighting their potential and challenges.
Findings
High accuracy in short yes/no and long answers
Careful implementation reduces common LLM issues in healthcare
Potential for integration into policy support tasks
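The first finding refers to accuracy on short yes/no answers. The following is a minimal sketch of how such a score could be computed, not the paper's evaluation tooling: the function names `normalize_yes_no` and `yes_no_accuracy`, the rule of reading only the first word of the model output, and the example answers are all assumptions made for illustration.

```python
from typing import List, Optional


def normalize_yes_no(answer: str) -> Optional[str]:
    """Map a free-text answer to 'yes' or 'no', or None if it starts with neither.

    Assumption: the verdict is the first word of the model output.
    """
    words = answer.strip().lower().split()
    if not words:
        return None
    token = words[0].strip(".,;:!")
    return token if token in {"yes", "no"} else None


def yes_no_accuracy(predictions: List[str], gold: List[str]) -> float:
    """Fraction of predictions whose normalized verdict matches the gold label.

    Unparseable predictions (None) count as wrong.
    """
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(normalize_yes_no(p) == g for p, g in zip(predictions, gold))
    return correct / len(gold)


# Hypothetical model outputs and gold labels, for illustration only.
example_predictions = ["Yes, the document states this.", "No.", "It is unclear.", "no"]
example_gold = ["yes", "no", "yes", "yes"]
example_score = yes_no_accuracy(example_predictions, example_gold)
```

Counting unparseable answers as errors is a conservative choice; an evaluation could instead report them separately as abstentions.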
Abstract
Safe and trustworthy use of Large Language Models (LLMs) in the processing of healthcare documents and scientific papers could substantially help clinicians, scientists and policymakers overcome information overload and focus on the most relevant information at a given moment. Retrieval Augmented Generation (RAG) is a promising method to leverage the potential of LLMs while enhancing the accuracy of their outcomes. This report assesses the potential and shortcomings of such approaches in the automatic knowledge synthesis of different types of documents in the health domain. To this end, it describes: (1) an internally developed proof-of-concept pipeline, called RAGEv (Retrieval Augmented Generation Evaluation), that employs state-of-the-art practices to deliver safe and trustworthy analysis of healthcare documents and scientific papers; (2) a set of evaluation tools for LLM-based…
Taxonomy
Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Artificial Intelligence in Healthcare and Education
Methods: Attention Is All You Need · Linear Warmup With Linear Decay · Dropout · Layer Normalization · Byte Pair Encoding · Attention Dropout · Softmax · Residual Connection · WordPiece
