Pareto-Optimized Open-Source LLMs for Healthcare via Context Retrieval
Jordi Bayarri-Planas, Ashwin Kumar Gururajan, Dario Garcia-Gasulla

TL;DR
This paper introduces a retrieval-optimized approach for open-source LLMs in healthcare, achieving state-of-the-art medical question answering accuracy cost-effectively and providing open resources to advance healthcare AI development.
Contribution
It presents a novel retrieval-based pipeline, a new benchmark for open-ended medical QA, and open-source tools to improve healthcare AI performance and accessibility.
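As a rough illustration of what a retrieval-based prompting step can look like, the sketch below ranks stored question/answer examples by similarity to a new question and prepends the closest ones to the prompt. This is not the authors' pipeline: the bag-of-words cosine similarity stands in for whatever embedding model a real system would use, and the function names (`retrieve`, `build_prompt`), the database layout, and the placeholder answers are all hypothetical.

```python
import math
from collections import Counter

def bow(text):
    """Bag-of-words vector; a stand-in (assumption) for a learned embedding."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, database, k=2):
    """Return the k stored examples whose questions are most similar to the query."""
    q = bow(query)
    ranked = sorted(database, key=lambda ex: cosine(q, bow(ex["question"])), reverse=True)
    return ranked[:k]

def build_prompt(query, examples):
    """Prepend retrieved examples as few-shot context, then append the new question."""
    parts = [f"Q: {ex['question']}\nA: {ex['answer']}" for ex in examples]
    parts.append(f"Q: {query}\nA:")
    return "\n\n".join(parts)

# Hypothetical toy database; the answers are deliberately placeholders.
database = [
    {"question": "Which vitamin deficiency causes scurvy", "answer": "placeholder answer 1"},
    {"question": "Which organ produces insulin", "answer": "placeholder answer 2"},
    {"question": "Which vitamin deficiency causes rickets", "answer": "placeholder answer 3"},
]
query = "Which vitamin deficiency causes night blindness"
examples = retrieve(query, database, k=2)
prompt = build_prompt(query, examples)
```

In this toy setup the two vitamin-related examples are selected as context because they share the most words with the query; a production system would typically swap `bow` and `cosine` for an embedding model and a vector index.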
Findings
Achieved state-of-the-art accuracy on the MedQA benchmark
Did so at a fraction of the cost of proprietary models
Released open-source resources (Prompt Engine, CoT/ToT/Thinking databases) for healthcare AI
Abstract
This study leverages optimized context retrieval to enhance open-source Large Language Models (LLMs) for cost-effective, high-performance healthcare AI. We demonstrate that this approach achieves state-of-the-art accuracy on medical question answering at a fraction of the cost of proprietary models, significantly improving the cost-accuracy Pareto frontier on the MedQA benchmark. Key contributions include: (1) OpenMedQA, a novel benchmark revealing a performance gap in open-ended medical QA compared to multiple-choice formats; (2) a practical, reproducible pipeline for context retrieval optimization; and (3) open-source resources (Prompt Engine, CoT/ToT/Thinking databases) to empower healthcare AI development. By advancing retrieval techniques and QA evaluation, we enable more affordable and reliable LLM solutions for healthcare.
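For readers unfamiliar with the term, a cost-accuracy Pareto frontier is the set of systems that no other system beats on both cost and accuracy at once. The sketch below computes such a frontier for a handful of (cost, accuracy) points. The model names and numbers are invented purely for illustration and are not results from the paper; `ModelPoint` and `pareto_frontier` are hypothetical helper names.

```python
from typing import List, NamedTuple

class ModelPoint(NamedTuple):
    name: str
    cost: float      # hypothetical cost per evaluation run (arbitrary units)
    accuracy: float  # hypothetical accuracy in [0, 1]

def pareto_frontier(points: List[ModelPoint]) -> List[ModelPoint]:
    """Keep points for which no other point is both cheaper-or-equal and more accurate."""
    # Sort by cost ascending; among equal costs, put the most accurate first.
    ordered = sorted(points, key=lambda p: (p.cost, -p.accuracy))
    frontier = []
    best_accuracy = float("-inf")
    for p in ordered:
        # A point survives only if it beats every cheaper point on accuracy.
        # Exact duplicates (same cost and accuracy) are kept once.
        if p.accuracy > best_accuracy:
            frontier.append(p)
            best_accuracy = p.accuracy
    return frontier

# Invented example values, not measurements from the paper.
points = [
    ModelPoint("model_a", 1.0, 0.70),
    ModelPoint("model_b", 2.0, 0.65),   # dominated by model_a (costlier, less accurate)
    ModelPoint("model_c", 3.0, 0.80),
    ModelPoint("model_d", 10.0, 0.82),
    ModelPoint("model_e", 12.0, 0.81),  # dominated by model_d
]
frontier = pareto_frontier(points)
frontier_names = [p.name for p in frontier]
```

"Improving the frontier" in this sense means adding a new point that either dominates existing frontier members or fills a previously empty cost-accuracy trade-off.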
Taxonomy
Topics: Biomedical Text Mining and Ontologies
