Robustly Extracting Medical Knowledge from EHRs: A Case Study of Learning a Health Knowledge Graph
Irene Y. Chen, Monica Agrawal, Steven Horng, David Sontag

TL;DR
This paper evaluates the robustness of health knowledge graphs derived from electronic health records, identifying key sources of error and proposing methods to improve their accuracy and generalizability for medical diagnosis.
Contribution
It introduces a framework for assessing the robustness of medical knowledge graphs from EHRs and proposes methods to enhance their reliability and applicability.
Findings
Sample size and unmeasured confounders are major error sources.
Building the causal graph with non-linear functions helps clarify existing model assumptions.
Model generalizability extends to larger patient datasets.
Abstract
Increasingly large electronic health records (EHRs) provide an opportunity to algorithmically learn medical knowledge. In one prominent example, a causal health knowledge graph could learn relationships between diseases and symptoms and then serve as a diagnostic tool to be refined with additional clinical input. Prior research has demonstrated the ability to construct such a graph from over 270,000 emergency department patient visits. In this work, we describe methods to evaluate a health knowledge graph for robustness. Moving beyond precision and recall, we analyze for which diseases and for which patients the graph is most accurate. We identify sample size and unmeasured confounders as major sources of error in the health knowledge graph. We introduce a method to leverage non-linear functions in building the causal graph to better understand existing model assumptions. Finally, to…
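To make "moving beyond precision and recall" concrete, the sketch below breaks the two metrics down by disease rather than reporting a single aggregate figure. It is a minimal illustration, not the authors' evaluation code: the function name `per_disease_precision_recall`, the representation of the graph as (disease, symptom) pairs, and the toy edges are all assumptions made for this example.

```python
from collections import defaultdict


def per_disease_precision_recall(learned_edges, reference_edges):
    """Return {disease: (precision, recall)} for sets of (disease, symptom) edges.

    Hypothetical helper for illustration; the paper's own procedure may differ.
    """
    learned = defaultdict(set)
    reference = defaultdict(set)
    for disease, symptom in learned_edges:
        learned[disease].add(symptom)
    for disease, symptom in reference_edges:
        reference[disease].add(symptom)

    scores = {}
    for disease in set(learned) | set(reference):
        proposed = learned.get(disease, set())
        expected = reference.get(disease, set())
        true_positives = len(proposed & expected)
        # Define the metric as 0.0 when its denominator is empty.
        precision = true_positives / len(proposed) if proposed else 0.0
        recall = true_positives / len(expected) if expected else 0.0
        scores[disease] = (precision, recall)
    return scores


# Toy, made-up edges for illustration only (not data from the paper).
learned_edges = {
    ("flu", "fever"), ("flu", "cough"), ("flu", "rash"),
    ("migraine", "headache"),
}
reference_edges = {
    ("flu", "fever"), ("flu", "cough"),
    ("migraine", "headache"), ("migraine", "nausea"),
}

scores = per_disease_precision_recall(learned_edges, reference_edges)
for disease, (precision, recall) in sorted(scores.items()):
    print(f"{disease}: precision={precision:.2f} recall={recall:.2f}")
```

A per-disease breakdown like this is one way to see which diseases the graph handles poorly, for example to check whether the weak ones are also those with few supporting visits.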
