PubMed Reasoner: Dynamic Reasoning-based Retrieval for Evidence-Grounded Biomedical Question Answering

Yiqing Zhang; Xiaozhong Liu; Fabricio Murai

arXiv:2603.27335·cs.CL·March 31, 2026

TL;DR

PubMed Reasoner is a three-stage biomedical QA agent that refines PubMed queries, retrieves evidence in batches, and generates answers with explicit citations, reaching 78.32% accuracy on PubMedQA with a GPT-4o backbone.

Contribution

It introduces a novel three-stage reasoning framework combining query refinement, reflective retrieval, and evidence-grounded answer generation for biomedical QA.

Findings

01. With a GPT-4o backbone, achieves 78.32% accuracy on PubMedQA, slightly surpassing human experts.

02. Shows consistent improvements on MMLU Clinical Knowledge.

03. LLM-as-judge evaluations favor PubMed Reasoner's responses.

Abstract

Trustworthy biomedical question answering (QA) systems must not only provide accurate answers but also justify them with current, verifiable evidence. Retrieval-augmented approaches partially address this gap but lack mechanisms to iteratively refine poor queries, whereas self-reflection methods kick in only after full retrieval is completed. In this context, we introduce PubMed Reasoner, a biomedical QA agent composed of three stages: self-critic query refinement evaluates MeSH terms for coverage, alignment, and redundancy to enhance PubMed queries based on partial (metadata) retrieval; reflective retrieval processes articles in batches until sufficient evidence is gathered; and evidence-grounded response generation produces answers with explicit citations. PubMed Reasoner with a GPT-4o backbone achieves 78.32% accuracy on PubMedQA, slightly surpassing human experts, and showing…
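
To make the second and third stages in the abstract concrete, here is a minimal sketch of batch-wise reflective retrieval followed by a cited answer. It is an illustration under stated assumptions, not the authors' implementation: the names `Article`, `reflective_retrieval`, and `answer_with_citations` are hypothetical, the relevance and sufficiency checks are plain callables standing in for the LLM judgments the paper describes, and the article IDs are made up. Query refinement over MeSH terms is left out.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List


@dataclass
class Article:
    pmid: str  # placeholder identifier, not a real PubMed ID
    text: str


def batched(items: List[Article], size: int) -> Iterator[List[Article]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def reflective_retrieval(
    articles: List[Article],
    is_relevant: Callable[[Article], bool],
    is_sufficient: Callable[[List[Article]], bool],
    batch_size: int = 2,
) -> List[Article]:
    """Read articles batch by batch and stop once the evidence suffices.

    Both callables are stand-ins (assumptions) for model-based judgments.
    """
    evidence: List[Article] = []
    for batch in batched(articles, batch_size):
        evidence.extend(a for a in batch if is_relevant(a))
        if is_sufficient(evidence):
            break  # enough evidence gathered; skip the remaining batches
    return evidence


def answer_with_citations(answer: str, evidence: List[Article]) -> str:
    """Attach an explicit citation list to the answer text."""
    cites = ", ".join(f"PMID:{a.pmid}" for a in evidence)
    return f"{answer} [{cites}]"


# Toy corpus with invented contents, for illustration only.
corpus = [
    Article("0000001", "aspirin reduces risk"),
    Article("0000002", "unrelated topic"),
    Article("0000003", "aspirin trial"),
    Article("0000004", "aspirin meta-analysis"),
    Article("0000005", "aspirin cohort"),
]

evidence = reflective_retrieval(
    corpus,
    is_relevant=lambda a: "aspirin" in a.text,
    is_sufficient=lambda ev: len(ev) >= 2,
)
print(answer_with_citations("yes", evidence))
```

In this toy run the loop stops after the second batch, so the fifth article is never read. Early stopping of this kind is the point of processing articles in batches rather than reflecting only after full retrieval.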
