Benchmarking Retrieval Strategies for Biomedical Retrieval-Augmented Generation: A Controlled Empirical Study

Devi Prasad Bal; Subhashree Puhan

arXiv:2605.02520·cs.CL·May 5, 2026

TL;DR

This study systematically compares five retrieval strategies in biomedical RAG pipelines, demonstrating that query-document interaction improves retrieval quality and that retrieval significantly enhances answer relevancy.

Contribution

It provides a controlled empirical evaluation of retrieval strategies in biomedical RAG, highlighting the effectiveness of cross-encoder reranking and the impact of retrieval on answer relevancy.

Findings

01. Cross-encoder reranking achieves the highest composite score and contextual precision.

02. Naive multi-query expansion introduces retrieval noise and reduces precision.

03. All retrieval strategies outperform no-context baselines in answer relevancy.
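
To make the reranking finding concrete, the two-stage pattern can be sketched as follows: a first-stage retriever returns candidates, and a scorer that reads the query and each document together reorders them. This is only an illustration, not the authors' pipeline. The names `rerank` and `toy_overlap_score` are hypothetical, and the token-overlap scorer is a stand-in assumption for a real cross-encoder model, which this listing does not name.

```python
from typing import Callable, List, Tuple


def rerank(
    query: str,
    candidates: List[str],
    score_pair: Callable[[str, str], float],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score each (query, document) pair jointly and keep the top_k documents."""
    scored = [(doc, score_pair(query, doc)) for doc in candidates]
    # Highest score first; Python's sort is stable, so ties keep retrieval order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def toy_overlap_score(query: str, doc: str) -> float:
    """Hypothetical stand-in for a cross-encoder: fraction of query tokens in doc."""
    query_tokens = set(query.lower().split())
    doc_tokens = set(doc.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & doc_tokens) / len(query_tokens)


if __name__ == "__main__":
    first_stage = [
        "aspirin reduces fever",
        "metformin may raise lactic acidosis risk",
        "metformin dosing guide",
    ]
    for doc, score in rerank("metformin lactic acidosis risk", first_stage,
                             toy_overlap_score, top_k=2):
        print(f"{score:.2f}  {doc}")
```

In practice `score_pair` would wrap a trained cross-encoder; keeping it as a plain callable means the reordering logic stays the same whichever model is plugged in.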

Abstract

Retrieval-Augmented Generation (RAG) offers a well-established path to grounding large language model (LLM) outputs in external knowledge, yet the question of which retrieval strategy works best in a high-stakes domain such as biomedicine has not received the controlled, multi-metric treatment it deserves. This paper presents a systematic empirical comparison of five retrieval strategies -- Dense Vector Search, Hybrid BM25 + Dense retrieval, Cross-Encoder Reranking, Multi-Query Expansion, and Maximal Marginal Relevance (MMR) -- within a biomedical question-answering RAG pipeline. All strategies share a fixed generation model (GPT-4o-mini), a common vector store (ChromaDB), and OpenAI's text-embedding-3-small embeddings, ensuring that observed differences are attributable to retrieval alone. Evaluation is conducted on 250 question-answer pairs drawn from a preprocessed subset of the…
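
Of the five strategies listed, Maximal Marginal Relevance is the easiest to show compactly. The sketch below is a generic greedy MMR selection over embedding vectors, written with the standard library only. It is not taken from the paper: the function names, the `lambda_mult` parameter name, and the use of cosine similarity for both relevance and redundancy are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr(
    query_vec: Sequence[float],
    doc_vecs: List[Sequence[float]],
    k: int = 3,
    lambda_mult: float = 0.5,
) -> List[int]:
    """Greedy MMR: return indices of up to k documents.

    lambda_mult = 1.0 ranks by relevance only; lower values penalize
    documents similar to ones already selected.
    """
    relevance = [cosine(query_vec, d) for d in doc_vecs]
    selected: List[int] = []
    remaining = list(range(len(doc_vecs)))

    while remaining and len(selected) < k:
        def mmr_score(i: int) -> float:
            # Redundancy is the highest similarity to any already-picked document.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance[i] - (1.0 - lambda_mult) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)

    return selected


if __name__ == "__main__":
    query = [1.0, 0.0]
    docs = [[1.0, 0.0], [0.99, 0.1], [0.6, 0.8]]  # doc 1 nearly duplicates doc 0
    print(mmr(query, docs, k=2, lambda_mult=1.0))  # relevance only
    print(mmr(query, docs, k=2, lambda_mult=0.3))  # diversity-weighted
```

With the toy vectors above, the relevance-only setting keeps the near-duplicate, while the diversity-weighted setting swaps it for the less similar third document. That trade-off between relevance and redundancy is what the MMR strategy tunes.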
