A study linking patient EHR data to external death data at Stanford Medicine
Alvaro Andres Alvarez Peralta, Priya Desai, Somalee Datta

TL;DR
This study develops and evaluates methods for linking patient EHR data with external death records, revealing challenges in linkage accuracy and the need for multiple data sources for comprehensive death data in research repositories.
Contribution
The paper presents a generalizable framework for linking real-world patient data across organizations and evaluates linkage rules, highlighting their variability and limitations.
Findings
Strong linkages are often incomplete.
Weak linkages tend to be noisy.
Different datasets require different linkage rules.
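To make the first two findings concrete, here is a minimal sketch of deterministic record linkage under two rules. Everything in it is an illustrative assumption, not the paper's method: the `Person` fields, the `strong_rule` (SSN plus date of birth), the `weak_rule` (last name plus date of birth), and the synthetic records are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Set, Tuple


@dataclass(frozen=True)
class Person:
    """Hypothetical minimal record; real EHR and death-file schemas differ."""
    record_id: str
    first: str
    last: str
    dob: str             # ISO date string, e.g. "1950-02-01"
    ssn: Optional[str]   # often missing on the EHR side


def strong_rule(ehr: Person, death: Person) -> bool:
    # Assumed "strong" rule: SSN and date of birth must both agree.
    return ehr.ssn is not None and ehr.ssn == death.ssn and ehr.dob == death.dob


def weak_rule(ehr: Person, death: Person) -> bool:
    # Assumed "weak" rule: last name and date of birth agree; SSN is ignored.
    return ehr.last.lower() == death.last.lower() and ehr.dob == death.dob


def link(
    ehr_records: Iterable[Person],
    death_records: Iterable[Person],
    rule: Callable[[Person, Person], bool],
) -> Set[Tuple[str, str]]:
    # Naive all-pairs comparison; fine for a toy example, too slow at scale.
    deaths = list(death_records)
    return {(e.record_id, d.record_id) for e in ehr_records for d in deaths if rule(e, d)}


# Synthetic data only. E2 is the deceased patient D2 but has no SSN on file;
# E3 is a different, living patient who shares a last name and birth date with D2.
ehr = [
    Person("E1", "Ana", "Lopez", "1950-02-01", "111-11-1111"),
    Person("E2", "Ben", "Smith", "1948-07-15", None),
    Person("E3", "Carl", "Smith", "1948-07-15", "333-33-3333"),
]
death = [
    Person("D1", "Ana", "Lopez", "1950-02-01", "111-11-1111"),
    Person("D2", "Ben", "Smith", "1948-07-15", "222-22-2222"),
]

strong_links = link(ehr, death, strong_rule)  # misses the true pair (E2, D2)
weak_links = link(ehr, death, weak_rule)      # adds the false pair (E3, D2)
```

In this toy setup the strong rule finds only one of the two true pairs because the second patient has no SSN on file, while the weak rule recovers that pair but also links an unrelated patient. Which fields behave as "strong" or "weak" depends on how each source populates them, which is consistent with the third finding.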
Abstract
This manuscript explores linking real-world patient data with external death data in the context of research Clinical Data Warehouses (r-CDWs). We specifically present the linking of Electronic Health Records (EHR) data for Stanford Health Care (SHC) patients and data from the Social Security Administration (SSA) Limited Access Death Master File (LADMF) made available by the US Department of Commerce's National Technical Information Service (NTIS). The data analysis framework presented in this manuscript extends prior approaches and is generalizable to linking any two cross-organizational real-world patient data sources. Electronic Health Record (EHR) data and NTIS LADMF are heavily used resources at other medical centers and we expect that the methods and learnings presented here will be valuable to others. Our findings suggest that strong linkages are incomplete and weak linkages…
Taxonomy
Topics: Data Quality and Management · Machine Learning in Healthcare · Artificial Intelligence in Healthcare
