Neural at ArchEHR-QA 2025: Agentic Prompt Optimization for Evidence-Grounded Clinical Question Answering

Sai Prasanna Teja Reddy Bogireddy, Abrar Majeedi, Viswanatha Reddy Gajjala, Zhuoyan Xu, Siddhant Rai, and Vaishnav Potlapalli

arXiv:2506.10751 · cs.LG · November 10, 2025

TL;DR

This paper introduces Neural, a prompt-optimization approach for evidence-grounded clinical question answering over electronic health records. It reaches an overall score of 51.5 on the hidden test set without any model fine-tuning.

Contribution

It decouples the task into sentence-level evidence identification and answer synthesis with explicit citations, and tunes the prompt for each stage automatically with DSPy's MIPROv2 optimizer.
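The two-stage decomposition described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the keyword-overlap selector, and the bracketed `[id]` citation format are hypothetical stand-ins, and in the paper both stages are carried out by prompted language models rather than by rules like these.

```python
from typing import Callable

Sentence = tuple[int, str]  # (sentence id, sentence text)


def _words(text: str) -> set[str]:
    """Lowercase tokens with trailing punctuation stripped."""
    return {w.strip("?.,").lower() for w in text.split()}


def keyword_evidence_selector(question: str, sentences: list[Sentence]) -> list[int]:
    """Stand-in for stage 1: keep sentences sharing a longer word with the question."""
    q_words = {w for w in _words(question) if len(w) > 3}
    return [sid for sid, text in sentences if q_words & _words(text)]


def cited_answer_writer(question: str, evidence: list[Sentence]) -> str:
    """Stand-in for stage 2: restate each evidence sentence followed by its id."""
    return " ".join(f"{text.rstrip('.')} [{sid}]." for sid, text in evidence)


def answer_question(
    question: str,
    sentences: list[Sentence],
    select: Callable[[str, list[Sentence]], list[int]] = keyword_evidence_selector,
    write: Callable[[str, list[Sentence]], str] = cited_answer_writer,
) -> str:
    """Run evidence identification first, then synthesis over the chosen sentences only."""
    chosen = set(select(question, sentences))
    evidence = [(sid, text) for sid, text in sentences if sid in chosen]
    return write(question, evidence)


note = [
    (1, "Patient was admitted with chest pain."),
    (2, "Aspirin was started on admission."),
    (3, "Family visited in the afternoon."),
]
answer = answer_question("Why was aspirin given?", note)
print(answer)
```

Keeping the two stages behind separate callables means each one can be swapped or tuned on its own, which is the property that per-stage prompt optimization relies on.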

Findings

01

Achieved an overall score of 51.5 on the hidden test set, placing second in the BioNLP 2025 ArchEHR-QA shared task.

02

Outperformed standard zero-shot and few-shot prompting by over 20 and 10 points, respectively.

03

Demonstrated data-driven prompt optimization as a cost-effective alternative to fine-tuning.

Abstract

Automated question answering (QA) over electronic health records (EHRs) can bridge critical information gaps for clinicians and patients, yet it demands both precise evidence retrieval and faithful answer generation under limited supervision. In this work, we present Neural, the runner-up in the BioNLP 2025 ArchEHR-QA shared task on evidence-grounded clinical QA. Our proposed method decouples the task into (1) sentence-level evidence identification and (2) answer synthesis with explicit citations. For each stage, we automatically explore the prompt space with DSPy's MIPROv2 optimizer, jointly tuning instructions and few-shot demonstrations on the development set. A self-consistency voting scheme further improves evidence recall without sacrificing precision. On the hidden test set, our method attains an overall score of 51.5, placing second while outperforming standard zero-shot…
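As a rough illustration of self-consistency voting over evidence sentences, the sketch below aggregates several sampled selections and keeps the sentence ids chosen often enough. The abstract does not state the exact voting rule, so the function name `vote_on_evidence`, the `min_share` threshold, and the sample runs are all assumptions made for this example.

```python
from collections import Counter


def vote_on_evidence(runs: list[list[int]], min_share: float = 0.5) -> list[int]:
    """Keep sentence ids selected in at least `min_share` of the sampled runs.

    `runs` holds one list of selected sentence ids per sampled model call.
    The threshold rule is an assumed, simple form of self-consistency voting.
    """
    if not runs:
        return []
    # Count each id at most once per run so repeats within a run do not inflate votes.
    counts = Counter(sid for run in runs for sid in set(run))
    needed = min_share * len(runs)
    return sorted(sid for sid, n in counts.items() if n >= needed)


# Four hypothetical samples of the evidence-identification stage.
sampled_runs = [[1, 2], [2, 3], [2], [1, 2, 5]]
selected = vote_on_evidence(sampled_runs)
print(selected)
```

A lower `min_share` admits sentences that only some samples picked, which favors recall; a higher value keeps only sentences the samples agree on, which favors precision.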
