UTSA-NLP at ArchEHR-QA 2025: Improving EHR Question Answering via Self-Consistency Prompting

Sara Shields-Menard; Zach Reimers; Joshua Gardner; David Perry; Anthony Rios

arXiv:2506.05589 · cs.CL · June 9, 2025

TL;DR

This paper presents a two-step LLM system for answering clinical questions from electronic health records: it uses few-shot prompting with self-consistency and thresholding to select the relevant EHR sentences, then generates a short, citation-supported response from them.

Contribution

The authors combine few-shot prompting, self-consistency, and thresholding in the sentence classification step of EHR question answering, and report that a smaller 8B model identifies relevant sentences better than a larger 70B model.

Findings

01 A smaller 8B model outperforms a larger 70B model at sentence relevance detection.

02 Self-consistency with thresholding makes sentence classification decisions more reliable.

03 Accurate sentence selection is critical for generating high-quality responses to EHR questions.

Abstract

We describe our system for the ArchEHR-QA Shared Task on answering clinical questions using electronic health records (EHRs). Our approach uses large language models in two steps: first, to find sentences in the EHR relevant to a clinician's question, and second, to generate a short, citation-supported response based on those sentences. We use few-shot prompting, self-consistency, and thresholding to improve the sentence classification step to decide which sentences are essential. We compare several models and find that a smaller 8B model performs better than a larger 70B model for identifying relevant information. Our results show that accurate sentence selection is critical for generating high-quality responses and that self-consistency with thresholding helps make these decisions more reliable.
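The sentence classification step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function `self_consistent_relevance`, the `classify_once` callable (standing in for one sampled few-shot LLM call), the label string `"essential"`, and the default values of `n_samples` and `threshold` are all assumptions made for the example, since the abstract does not specify them.

```python
from collections import Counter
from typing import Callable, List


def self_consistent_relevance(
    classify_once: Callable[[str, str], str],  # hypothetical: one sampled LLM call
    question: str,
    sentences: List[str],
    n_samples: int = 5,      # assumed number of samples per sentence
    threshold: float = 0.6,  # assumed minimum share of "essential" votes
) -> List[int]:
    """Return 1-based indices of sentences judged essential often enough."""
    selected = []
    for idx, sentence in enumerate(sentences, start=1):
        # Self-consistency: query the classifier several times for the
        # same (question, sentence) pair and tally the returned labels.
        votes = Counter(
            classify_once(question, sentence) for _ in range(n_samples)
        )
        # Thresholding: keep the sentence only if the share of
        # "essential" votes reaches the cutoff.
        if votes["essential"] / n_samples >= threshold:
            selected.append(idx)
    return selected
```

In a full pipeline, the returned indices would identify the sentences passed to the second step, where the model generates a response that cites them. Raising `threshold` keeps fewer sentences and lowering it keeps more, which is one way such a system might balance precision against recall.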

Taxonomy

Topics: Topic Modeling · Machine Learning in Healthcare · Artificial Intelligence in Healthcare and Education