Identifying Patient-Specific Root Causes of Disease
Eric V. Strobl, Thomas A. Lasko

TL;DR
This paper introduces a method for identifying patient-specific root causes of disease by analyzing exogenous shocks in a structural equation model, improving the accuracy of root-cause identification at the individual level.
Contribution
It develops a fast algorithm called Root Causal Inference that extracts error terms from a linear SEM and computes Shapley values to pinpoint patient-specific disease causes.
Findings
Reports significant accuracy improvements in identifying patient-specific root causes.
Uncovers causes with large individual effects but small group effects.
Provides an open-source R implementation for practical use.
Abstract
Complex diseases are caused by a multitude of factors that may differ between patients. As a result, hypothesis tests comparing all patients to all healthy controls can detect many significant variables with inconsequential effect sizes. A few highly predictive root causes may nevertheless generate disease within each patient. In this paper, we define patient-specific root causes as variables subject to exogenous "shocks" which go on to perturb an otherwise healthy system and induce disease. In other words, the variables are associated with the exogenous errors of a structural equation model (SEM), and these errors predict a downstream diagnostic label. We quantify predictivity using sample-specific Shapley values. This derivation allows us to develop a fast algorithm called Root Causal Inference for identifying patient-specific root causes by extracting the error terms of a linear SEM…
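The two steps named in the abstract, extracting the error terms of a linear SEM and scoring them with sample-specific Shapley values, can be illustrated with a minimal Python sketch. This is not the authors' Root Causal Inference algorithm or their R implementation. It assumes the causal parents of each variable are already known, replaces the binary diagnostic label with a continuous score fit by ordinary least squares, and uses the standard closed form for Shapley values of a linear model with independent features (coefficient times the feature's deviation from its mean). The function names `extract_errors` and `linear_shapley`, the three-variable chain, and all simulated numbers are hypothetical.

```python
import numpy as np

def extract_errors(X, parents):
    """Estimate SEM error terms as residuals of each variable on its parents.

    `parents[j]` lists the column indices of the (assumed known) parents of j.
    """
    Xc = X - X.mean(axis=0)              # center so no intercept is needed
    E = np.empty_like(Xc)
    for j, pa in enumerate(parents):
        if pa:
            coef, *_ = np.linalg.lstsq(Xc[:, pa], Xc[:, j], rcond=None)
            E[:, j] = Xc[:, j] - Xc[:, pa] @ coef
        else:
            E[:, j] = Xc[:, j]           # root variable: the error is the variable itself
    return E

def linear_shapley(E, y):
    """Per-sample Shapley values of a linear model y ~ E.

    Assumes independent features, so phi[i, j] = gamma_j * (E[i, j] - mean_j).
    """
    Ec = E - E.mean(axis=0)
    A = np.column_stack([np.ones(len(y)), Ec])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return Ec * beta[1:]

# Simulated chain X0 -> X1 -> X2 -> score (illustrative values only).
rng = np.random.default_rng(0)
n = 2000
E_true = rng.normal(size=(n, 3))
E_true[0] = [8.0, 0.0, 0.0]              # patient 0 receives a large shock at X0

X = np.empty((n, 3))
X[:, 0] = E_true[:, 0]
X[:, 1] = 0.8 * X[:, 0] + E_true[:, 1]
X[:, 2] = 0.5 * X[:, 1] + E_true[:, 2]
score = X[:, 2] + 0.1 * rng.normal(size=n)   # continuous stand-in for the diagnostic label

E_hat = extract_errors(X, parents=[[], [0], [1]])
phi = linear_shapley(E_hat, score)
root_cause_patient0 = int(np.argmax(phi[0]))  # variable with the largest Shapley value
```

In this simulation the largest Shapley value for patient 0 falls on X0, the variable that received the shock, even though X2 is the direct parent of the score. Ranking variables by their own values rather than by their error terms would not separate the shocked variable from its descendants in this way.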
Taxonomy
Topics: Advanced Causal Inference Techniques · Health Systems, Economic Evaluations, Quality of Life
