When Documents Disagree: Measuring Institutional Variation in Transplant Guidance with Retrieval-Augmented Language Models
Yubo Li, Ramayya Krishnan, Rema Padman

TL;DR
This paper presents a framework using retrieval-augmented language models to quantify heterogeneity and gaps in transplant patient education materials across U.S. centers, revealing significant variation and content omissions.
Contribution
It introduces a systematic method to measure institutional variation in medical guidance documents using language models and a consistency taxonomy.
Findings
20.8% of non-absent pairwise comparisons show clinically meaningful divergence
96.2% of question-handbook pairs miss relevant content
Heterogeneity reflects systematic institutional differences
Abstract
Patient education materials for solid-organ transplantation vary substantially across U.S. centers, yet no systematic method exists to quantify this heterogeneity at scale. We introduce a framework that grounds the same patient questions in different centers' handbooks using retrieval-augmented language models and compares the resulting answers using a five-label consistency taxonomy. Applied to 102 handbooks from 23 centers and 1,115 benchmark questions, the framework quantifies heterogeneity across four dimensions: question, topic, organ, and center. We find that 20.8% of non-absent pairwise comparisons exhibit clinically meaningful divergence, concentrated in condition monitoring and lifestyle topics. Coverage gaps are even more prominent: 96.2% of question-handbook pairs miss relevant content, with reproductive health at 95.1% absence. Center-level divergence profiles are stable and…
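The two headline rates in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the label names (`absent`, `divergent`, `consistent`), the `toy_judge` function, and the toy answers are all hypothetical stand-ins, since this summary does not enumerate the five taxonomy labels or describe how the model-based comparison is performed. The sketch also assumes a pair counts as absent when either handbook has no relevant content.

```python
from itertools import combinations

# Hypothetical label names; the paper's five-label taxonomy is not listed in this summary.
ABSENT_LABEL = "absent"
DIVERGENT_LABELS = {"divergent"}


def pairwise_labels(answers, judge):
    """Label every pair of handbooks' answers to one question.

    answers: dict mapping handbook id -> answer text, or None when the
             handbook has no relevant content for the question.
    judge:   callable (answer_a, answer_b) -> label; a stand-in for the
             model-based comparison step.
    """
    labels = []
    for a, b in combinations(sorted(answers), 2):
        if answers[a] is None or answers[b] is None:
            # Assumption: a pair is "absent" if either side lacks content.
            labels.append(ABSENT_LABEL)
        else:
            labels.append(judge(answers[a], answers[b]))
    return labels


def divergence_rate(labels):
    """Share of non-absent pairwise comparisons carrying a divergent label."""
    non_absent = [label for label in labels if label != ABSENT_LABEL]
    if not non_absent:
        return 0.0
    return sum(label in DIVERGENT_LABELS for label in non_absent) / len(non_absent)


def absence_rate(answers_by_question):
    """Share of question-handbook pairs with no relevant content."""
    cells = [ans for answers in answers_by_question.values() for ans in answers.values()]
    return sum(ans is None for ans in cells) / len(cells)


# Toy data (invented for illustration only).
answers_by_question = {
    "q1": {"A": "avoid grapefruit", "B": "avoid grapefruit", "C": None},
    "q2": {"A": "lift after 6 weeks", "B": "lift after 12 weeks", "C": "lift after 6 weeks"},
}


def toy_judge(x, y):
    # Exact string match stands in for a real semantic comparison.
    return "consistent" if x == y else "divergent"


all_labels = []
for answers in answers_by_question.values():
    all_labels.extend(pairwise_labels(answers, toy_judge))

print(divergence_rate(all_labels))
print(absence_rate(answers_by_question))
```

Separating the two rates mirrors the abstract's framing: divergence is measured only over comparisons where both handbooks say something, while coverage gaps are counted per question-handbook pair.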
Taxonomy
Topics: Topic Modeling · Health Literacy and Information Accessibility · Information Retrieval and Search Behavior
