Clustering Aware Classification for Risk Prediction and Subtyping in Clinical Data
Shivin Srivastava, Siddharth Bhatia, Lingxiao Huang, Lim Jun Heng, Kenji Kawaguchi, Vaibhav Rajan

TL;DR
This paper introduces DeepCAC, a clustering-aware classification method that improves risk prediction and subtyping in heterogeneous clinical data by using deep learning to find clusters that benefit the classifiers trained on them.
Contribution
The paper proposes DeepCAC, a generic, deep learning-based clustering-aware classification algorithm that improves classifier performance on heterogeneous data by forming clusters with the downstream classifiers in mind.
Findings
DeepCAC outperforms previous methods on synthetic datasets.
DeepCAC demonstrates superior accuracy on real benchmark datasets.
Theoretical analysis identifies conditions under which clustering can potentially aid classification.
Abstract
In data containing heterogeneous subpopulations, classification performance benefits from incorporating the knowledge of cluster structure in the classifier. Previous methods for such combined clustering and classification either 1) are classifier-specific and not generic, or 2) independently perform clustering and classifier training, which may not form clusters that can potentially benefit classifier performance. The question of how to perform clustering to improve the performance of classifiers trained on the clusters has received scant attention in previous literature, despite its importance in several real-world applications. In this paper, first, we theoretically analyze the generalization performance of classifiers trained on clustered data and find conditions under which clustering can potentially aid classification. This motivates the design of a simple k-means-based…
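The decoupled baseline the abstract contrasts against (cluster first, then train one classifier per cluster) can be sketched as below. This is not DeepCAC or the paper's k-means-based method, only an illustration of the "independent" pipeline. The function names, the toy data, and the use of a nearest-class-mean rule as the per-cluster classifier are all assumptions made for this example.

```python
import random

def sq_dist(a, b):
    # Squared Euclidean distance between two points.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    # Coordinate-wise mean of a non-empty list of points.
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def assign(x, centroids):
    # Index of the centroid closest to x.
    return min(range(len(centroids)), key=lambda c: sq_dist(x, centroids[c]))

def kmeans(X, k, iters=20, seed=0):
    # Plain Lloyd's k-means; the labels y play no role here,
    # which is exactly the "independent clustering" setting.
    rng = random.Random(seed)
    centroids = rng.sample(X, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in X:
            groups[assign(x, centroids)].append(x)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [mean(g) if g else centroids[j] for j, g in enumerate(groups)]
    return centroids

def class_means(X, y):
    # Hypothetical stand-in classifier: one mean vector per class label.
    return {label: mean([x for x, t in zip(X, y) if t == label]) for label in set(y)}

def fit_cluster_then_classify(X, y, k):
    centroids = kmeans(X, k)
    membership = [assign(x, centroids) for x in X]
    fallback = class_means(X, y)  # used only if a cluster has no training points
    models = []
    for j in range(k):
        Xj = [x for x, m in zip(X, membership) if m == j]
        yj = [t for t, m in zip(y, membership) if m == j]
        models.append(class_means(Xj, yj) if Xj else fallback)
    return centroids, models

def predict(x, centroids, models):
    # Route x to its cluster, then apply that cluster's local classifier.
    local = models[assign(x, centroids)]
    return min(local, key=lambda label: sq_dist(x, local[label]))

# Toy data: two subpopulations whose label rule is flipped relative to each other.
X = [[-1.0, 0.0], [-1.2, 0.1], [1.0, 0.0], [1.2, -0.1],
     [9.0, 0.0], [8.8, 0.1], [11.0, 0.0], [11.2, -0.1]]
y = [0, 0, 1, 1, 1, 1, 0, 0]

centroids, models = fit_cluster_then_classify(X, y, k=2)
queries = [[-0.9, 0.0], [0.9, 0.0], [9.1, 0.0], [10.9, 0.0]]
predictions = [predict(q, centroids, models) for q in queries]
print(predictions)
```

In this toy setup the clusters happen to line up with the two subpopulations, so the local classifiers recover the flipped label rule. Nothing in the clustering step guarantees that in general, which is the gap a clustering-aware method is meant to close.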
Taxonomy
Topics: Artificial Intelligence in Healthcare · Machine Learning in Healthcare · AI in cancer detection
Methods: Attentive Walk-Aggregating Graph Neural Network
