Retrieval-Augmented Large Language Models for Schema-Constrained Clinical Information Extraction
A H M Rezaul Karim, Ozlem Uzuner

TL;DR
This paper introduces a retrieval-augmented generation (RAG) pipeline that uses large language models to extract and normalize clinical observations from nurse-patient transcripts; its best configuration, GPT-5.2 with full schema and auditing, reaches 80.36% F1.
Contribution
It presents a novel modular RAG approach that combines schema-constrained prompting, deterministic schema-based postprocessing, and a second-pass audit, tailored to clinical information extraction from conversational transcripts.
Findings
RAG improves extraction performance for both LLM backbones.
GPT-5.2 with full schema and auditing achieves 80.36% F1.
Second-pass auditing provides modest accuracy improvements.
Abstract
Conversational nurse-patient transcripts contain actionable observations, but converting these transcripts into structured representations at scale remains challenging. Documentation burden is substantial, with prior studies showing clinicians spend large portions of their workday on documentation and related desk work rather than direct patient care. MEDIQA-SYNUR focuses on observation extraction from conversational nurse-patient transcripts, requiring systems to normalize these narratives into a predefined schema with value-type constraints. We propose a modular retrieval-augmented generation (RAG) pipeline that uses the training set as an exemplar corpus, combines schema-constrained prompting (full schema vs. pruned candidate schema), deterministic schema-based postprocessing, and a second-pass audit, with two LLM backbones: Llama-4-Scout-17B-16E-Instruct and GPT-5.2 with…
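Two of the stages named in the abstract, exemplar retrieval from the training set and deterministic schema-based postprocessing, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the schema fields (`pain_level`, `nausea`, `mobility`), the value-type names, the bag-of-words cosine similarity used as the retriever, and the function names `retrieve_exemplars` and `postprocess`. The actual retriever, schema, and normalization rules may differ.

```python
import math
import re
from collections import Counter

# Hypothetical schema (not from the paper): field -> (value type, allowed values).
SCHEMA = {
    "pain_level": ("numeric", None),
    "nausea": ("boolean", None),
    "mobility": ("single_select", ["independent", "assisted", "bedbound"]),
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(tokens_a, tokens_b):
    """Cosine similarity between two bag-of-words token lists."""
    ca, cb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_exemplars(query, corpus, k=2):
    """Return the k training examples whose transcripts best match the query."""
    q = tokenize(query)
    ranked = sorted(
        corpus,
        key=lambda ex: cosine(q, tokenize(ex["transcript"])),
        reverse=True,
    )
    return ranked[:k]


def postprocess(raw):
    """Drop fields outside the schema and coerce values to the declared type.

    Values that cannot be coerced are discarded rather than guessed.
    """
    clean = {}
    for name, value in raw.items():
        if name not in SCHEMA:
            continue  # field is not part of the schema
        vtype, allowed = SCHEMA[name]
        text = str(value).strip().lower()
        if vtype == "numeric":
            try:
                clean[name] = float(value)
            except (TypeError, ValueError):
                continue
        elif vtype == "boolean":
            if text in {"true", "yes"}:
                clean[name] = True
            elif text in {"false", "no"}:
                clean[name] = False
        elif vtype == "single_select" and text in allowed:
            clean[name] = text
    return clean


if __name__ == "__main__":
    train = [
        {"transcript": "Patient reports pain level of seven and some nausea"},
        {"transcript": "Patient walks independently without help"},
    ]
    print(retrieve_exemplars("The patient says the pain is a seven", train, k=1))
    print(postprocess({"pain_level": "7", "nausea": "Yes", "mobility": "flying"}))
```

In a full pipeline the retrieved exemplars would be placed in the prompt alongside the schema, and `postprocess` would run on the model's raw output before any second-pass audit; the model call itself is omitted here because its interface is not described in this summary.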
