Correlation vs causation in Alzheimer's disease: an interpretability-driven study
Hamzah Dabool, Raghad Mustafa

TL;DR
This study combines correlation analysis, machine learning, and interpretability techniques to differentiate between causation and correlation in Alzheimer's disease features, aiming to improve diagnosis and understanding of disease mechanisms.
Contribution
It introduces an integrated approach using XGBoost and SHAP values to interpret feature importance, emphasizing the distinction between correlation and causation in AD research.
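To make the idea behind SHAP values concrete, the sketch below computes exact Shapley values for a tiny hand-written function. It is an illustration only, not the authors' pipeline: `toy_model`, the instance, and the all-zero baseline are hypothetical, and the study itself applies SHAP to an XGBoost classifier rather than enumerating feature subsets as done here.

```python
from itertools import combinations
from math import factorial


def toy_model(x):
    # Hypothetical stand-in for a trained classifier's score:
    # one additive term plus one interaction term.
    return 2.0 * x[0] + x[1] * x[2]


def shapley_values(model, instance, baseline):
    """Exact Shapley values by enumerating every feature subset.

    Features outside the subset are replaced by their baseline value.
    """
    n = len(instance)

    def value(subset):
        mixed = [instance[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for subsets of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


instance = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, instance, baseline)
print(phi)
```

The attributions sum to the difference between the model output at the instance and at the baseline, and the interaction term is split equally between the two features involved. An attribution of this kind describes what the model relies on, which is not the same as a causal effect in the underlying disease process.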
Findings
Identified key features influencing AD classification, including cognitive scores and genetic factors.
Revealed clusters of interrelated variables through correlation matrices.
Highlighted that correlation does not imply causation, guiding future causal inference studies.
Abstract
Understanding the distinction between causation and correlation is critical in Alzheimer's disease (AD) research, as it impacts diagnosis, treatment, and the identification of true disease drivers. This experiment investigates the relationships among clinical, cognitive, genetic, and biomarker features using a combination of correlation analysis, machine learning classification, and model interpretability techniques. Employing the XGBoost algorithm, we identified key features influencing AD classification, including cognitive scores and genetic risk factors. Correlation matrices revealed clusters of interrelated variables, while SHAP (SHapley Additive exPlanations) values provided detailed insights into feature contributions across disease stages. Our results highlight that strong correlations do not necessarily imply causation, emphasizing the need for careful interpretation of…
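A small synthetic example can show why a strong correlation need not reflect a causal link. In the sketch below, the variables `confounder`, `feature`, and `outcome` are simulated and hypothetical; they are not taken from the study's data. The feature has no effect on the outcome, yet the two correlate strongly because both depend on the same confounder, and the association largely disappears once the confounder is regressed out.

```python
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def residuals(y, z):
    """Residuals of y after a simple least-squares regression on z."""
    n = len(y)
    mean_y, mean_z = sum(y) / n, sum(z) / n
    slope = (
        sum((zi - mean_z) * (yi - mean_y) for zi, yi in zip(z, y))
        / sum((zi - mean_z) ** 2 for zi in z)
    )
    return [yi - mean_y - slope * (zi - mean_z) for yi, zi in zip(y, z)]


rng = random.Random(0)  # fixed seed so the run is repeatable
confounder = [rng.gauss(0, 1) for _ in range(5000)]
# Both variables depend on the confounder; neither depends on the other.
feature = [c + rng.gauss(0, 0.5) for c in confounder]
outcome = [c + rng.gauss(0, 0.5) for c in confounder]

raw_corr = pearson(feature, outcome)
partial_corr = pearson(
    residuals(feature, confounder), residuals(outcome, confounder)
)
print(f"raw correlation:     {raw_corr:.2f}")
print(f"partial correlation: {partial_corr:.2f}")
```

Adjusting in this way only helps for confounders that are actually measured, so a near-zero or non-zero partial correlation in observational data is still not proof of the absence or presence of causation.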
Taxonomy
Topics: Genetic Associations and Epidemiology · Dementia and Cognitive Impairment Research · Bioinformatics and Genomic Networks
Methods: Shapley Additive Explanations · Causal inference
