A Multi-Stage Validation Framework for Trustworthy Large-scale Clinical Information Extraction using Large Language Models
Maria Mahbub, Gregory M. Dams, Josh Arnold, Caitlin Rizy, Sudarshan Srinivasan, Elliot M. Fielstein, Minu A. Aghevli, Kamonica L. Craig, Elizabeth M. Oliva, Joseph Erdos, Jodie Trafton, and Ioana Danciu

TL;DR
This paper introduces a comprehensive multi-stage validation framework for trustworthy large-scale clinical information extraction using LLMs, enabling rigorous assessment without extensive manual annotation.
Contribution
It presents a novel multi-stage validation approach combining prompt calibration, plausibility filtering, semantic grounding, and expert review for scalable clinical data extraction.
Findings
Rule-based filtering removed 14.59% of unsupported extractions.
Judge LLM assessments agreed with expert review at Gwet's AC1 = 0.80.
Extracted substance use disorder (SUD) diagnoses predicted care engagement with AUC = 0.80.
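For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of how Gwet's AC1 can be computed for two raters (for example, a judge LLM and an expert) labelling the same items. It uses the standard two-rater definition, AC1 = (p_a - p_e) / (1 - p_e), with chance agreement p_e = (1 / (K - 1)) * sum over categories of pi_k * (1 - pi_k), where pi_k is the average share of ratings in category k. The function name `gwet_ac1` and the toy labels are illustrative assumptions; they are not the paper's code or data.

```python
from collections import Counter


def gwet_ac1(rater_a, rater_b):
    """Gwet's AC1 for two raters assigning nominal labels to the same items.

    Illustrative helper (hypothetical name), not taken from the paper.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")

    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    k = len(categories)

    # Observed agreement: share of items on which both raters give the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    if k < 2:
        # Only one label was ever used, so chance agreement is zero
        # and AC1 reduces to the observed agreement.
        return observed

    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)

    # pi_k: average proportion of ratings falling in category k across raters.
    chance = sum(
        pi * (1 - pi)
        for pi in ((counts_a[c] + counts_b[c]) / (2 * n) for c in categories)
    ) / (k - 1)

    return (observed - chance) / (1 - chance)


# Toy example (made-up labels): 1 = diagnosis supported, 0 = not supported.
judge_labels = [1, 1, 1, 0, 0, 1, 1, 1, 1, 1]
expert_labels = [1, 1, 1, 0, 1, 1, 1, 1, 1, 1]
print(round(gwet_ac1(judge_labels, expert_labels), 3))
```

Unlike Cohen's kappa, AC1 stays stable when one label dominates, which is a common situation when most extractions are judged correct.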
Abstract
Large language models (LLMs) show promise for extracting clinically meaningful information from unstructured health records, yet their translation into real-world settings is constrained by the lack of scalable and trustworthy validation approaches. Conventional evaluation methods rely heavily on annotation-intensive reference standards or incomplete structured data, limiting feasibility at population scale. We propose a multi-stage validation framework for LLM-based clinical information extraction that enables rigorous assessment under weak supervision. The framework integrates prompt calibration, rule-based plausibility filtering, semantic grounding assessment, targeted confirmatory evaluation using an independent higher-capacity judge LLM, selective expert review, and external predictive validity analysis to quantify uncertainty and characterize error modes without exhaustive manual…
