Clusters in Explanation Space: Inferring disease subtypes from model explanations
Marc-Andre Schulz, Matt Chapman-Rounds, Manisha Verma, Danilo Bzdok, Konstantinos Georgatzis

TL;DR
This paper presents a novel method that uses explanations from a diagnostic classifier to identify disease subtypes, outperforming traditional clustering on raw data in high-dimensional biomedical datasets.
Contribution
The paper introduces a new approach that leverages explanation space from classifiers to discover disease subtypes more effectively than classical methods.
Findings
Cluster analysis on explanation space outperforms classical clustering on original data.
Method successfully identifies known disease subtypes in brain imaging and transcriptome datasets.
Approach is applicable to various subtype discovery tasks beyond biomedical data.
Abstract
Identification of disease subtypes and corresponding biomarkers can substantially improve clinical diagnosis and treatment selection. Discovering these subtypes in noisy, high-dimensional biomedical data is often impossible for humans and challenging for machines. We introduce a new approach to facilitate the discovery of disease subtypes: instead of analyzing the original data, we train a diagnostic classifier (healthy vs. diseased) and extract instance-wise explanations for the classifier's decisions. The distribution of instances in the explanation space of our diagnostic classifier amplifies the different reasons for belonging to the same class, resulting in a representation that is uniquely useful for discovering latent subtypes. We compare our ability to recover subtypes via cluster analysis on model explanations to classical cluster analysis on the original data. In multiple…
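The pipeline described in the abstract (train a healthy-vs-diseased classifier, compute instance-wise explanations, cluster the explanations) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the synthetic data, the logistic-regression classifier, the attribution rule w_j * (x_j - mean_j) for a linear model, the simple two-cluster k-means, and all helper names (sample, train_logreg, kmeans2, purity). The excerpt does not say which classifier or explanation method the authors use.

```python
import math
import random

rng = random.Random(0)
N_NOISE = 8              # uninformative, high-variance features (assumed)
DIM = 2 + N_NOISE

def sample(shift_feature=None):
    """One synthetic instance; each disease subtype shifts one informative feature."""
    x = [rng.gauss(0, 1), rng.gauss(0, 1)] + [rng.gauss(0, 3) for _ in range(N_NOISE)]
    if shift_feature is not None:
        x[shift_feature] += 4.0
    return x

healthy = [sample() for _ in range(200)]
diseased = [sample(0) for _ in range(100)] + [sample(1) for _ in range(100)]
subtype = [0] * 100 + [1] * 100      # latent labels, never shown to the classifier
X = healthy + diseased
y = [0] * 200 + [1] * 200

# Center the features so w_j * x_j is an attribution relative to the data mean.
mean = [sum(col) / len(X) for col in zip(*X)]
Xc = [[v - m for v, m in zip(row, mean)] for row in X]

def train_logreg(X, y, lr=0.05, epochs=300):
    """Full-batch gradient descent on the logistic loss."""
    w, b = [0.0] * DIM, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * DIM, 0.0
        for row, t in zip(X, y):
            z = b + sum(wi * xi for wi, xi in zip(w, row))
            p = 1.0 / (1.0 + math.exp(-z))
            for j in range(DIM):
                gw[j] += (p - t) * row[j]
            gb += p - t
        w = [wi - lr * g / len(X) for wi, g in zip(w, gw)]
        b -= lr * gb / len(X)
    return w, b

def kmeans2(points, iters=20):
    """Two-cluster k-means with a deterministic farthest-point initialisation."""
    dist = lambda a, c: sum((ai - ci) ** 2 for ai, ci in zip(a, c))
    c0 = points[0]
    centers = [c0, max(points, key=lambda p: dist(p, c0))]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [0 if dist(p, centers[0]) <= dist(p, centers[1]) else 1 for p in points]
        for k in (0, 1):
            members = [p for p, l in zip(points, labels) if l == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def purity(labels, truth):
    """Fraction of instances assigned consistently with the latent subtype."""
    agree = sum(l == t for l, t in zip(labels, truth))
    return max(agree, len(truth) - agree) / len(truth)

w, b = train_logreg(Xc, y)
raw_diseased = Xc[200:]
# Explanation space: per-feature contribution to the diseased-class logit.
expl_diseased = [[wj * xj for wj, xj in zip(w, row)] for row in raw_diseased]

purity_raw = purity(kmeans2(raw_diseased), subtype)
purity_expl = purity(kmeans2(expl_diseased), subtype)
print("subtype purity, raw data:         ", round(purity_raw, 3))
print("subtype purity, explanation space:", round(purity_expl, 3))
```

In this toy setting the classifier assigns near-zero weight to the noise features, so the explanation vectors differ mainly along the two informative features, one per subtype. Whether a comparable gain appears on real data depends on the classifier and explanation method chosen, which this sketch does not attempt to reproduce.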
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Cell Image Analysis Techniques · Bioinformatics and Genomic Networks
