A novel method for Causal Structure Discovery from EHR data, a demonstration on type-2 diabetes mellitus
Xinpeng Shen, Sisi Ma, Prashanthi Vemuri, M. Regina Castro, Pedro J. Caraballo, Gyorgy J. Simon

TL;DR
This paper introduces a new data transformation and causal discovery algorithm tailored for EHR data, demonstrating improved accuracy and stability in uncovering disease mechanisms, specifically applied to type-2 diabetes mellitus.
Contribution
The paper presents a novel data transformation and causal structure discovery method that overcomes EHR data challenges, enhancing causal inference accuracy and robustness.
Findings
Improved causal graph correctness and edge orientation consistency.
Enhanced accuracy, stability, and completeness over existing methods.
Generalizability validated on an external dataset.
Abstract
Introduction: The discovery of causal mechanisms underlying diseases enables better diagnosis, prognosis and treatment selection. Clinical trials have been the gold standard for determining causality, but they are resource intensive, sometimes infeasible or unethical. Electronic Health Records (EHR) contain a wealth of real-world data that holds promise for the discovery of disease mechanisms, yet the existing causal structure discovery (CSD) methods fall short on leveraging them due to the special characteristics of the EHR data. We propose a new data transformation method and a novel CSD algorithm to overcome the challenges posed by these characteristics. Materials and methods: We demonstrated the proposed methods on an application to type-2 diabetes mellitus. We used a large EHR data set from Mayo Clinic to internally evaluate the proposed transformation and CSD methods and used…
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Machine Learning in Healthcare · Bioinformatics and Genomic Networks
