TL;DR
TACCO is a novel self-supervised hypergraph co-clustering framework that jointly discovers disease subtypes by clustering clinical concepts and patient visits, improving disease risk prediction from EHR data.
Contribution
It introduces a hypergraph-based co-clustering method guided by disease prediction tasks, integrating textual embeddings and contrastive learning for better disease subtyping.
Findings
Achieved a 31.25% performance improvement over traditional ML methods.
Improved cardiovascular risk prediction by 5.26% over baseline hypergraph models.
Validated utility through clinical case studies and detailed analysis.
Abstract
The growing availability of well-organized Electronic Health Records (EHR) data has enabled the development of various machine learning models towards disease risk prediction. However, existing risk prediction methods overlook the heterogeneity of complex diseases, failing to model the potential disease subtypes regarding their corresponding patient visits and clinical concept subgroups. In this work, we introduce TACCO, a novel framework that jointly discovers clusters of clinical concepts and patient visits based on a hypergraph modeling of EHR data. Specifically, we develop a novel self-supervised co-clustering framework that can be guided by the risk prediction task of specific diseases. Furthermore, we enhance the hypergraph model of EHR data with textual embeddings and enforce the alignment between the clusters of clinical concepts and patient visits through a contrastive…
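To make the hypergraph view of EHR data concrete, here is a minimal Python sketch of an incidence matrix linking patient visits to clinical concepts. It is an illustration under stated assumptions, not the authors' implementation: the toy visits, the concept codes, the helper name `build_incidence`, and the choice to treat visits as hyperedges (rows) and concepts as nodes (columns) are all hypothetical and are not specified in the summary above.

```python
import numpy as np

# Hypothetical toy data: each visit lists the clinical concepts recorded in it.
visits = {
    "visit_0": ["I10", "E11", "metformin"],
    "visit_1": ["I10", "I25"],
    "visit_2": ["E11", "metformin"],
}


def build_incidence(visits):
    """Return a binary visit-by-concept incidence matrix with its row/column labels.

    Assumption: each visit is a hyperedge joining the concepts it contains.
    """
    concepts = sorted({c for codes in visits.values() for c in codes})
    visit_ids = list(visits)
    col = {c: j for j, c in enumerate(concepts)}

    H = np.zeros((len(visit_ids), len(concepts)), dtype=np.int8)
    for i, v in enumerate(visit_ids):
        for c in visits[v]:
            H[i, col[c]] = 1  # concept c occurs in visit v
    return H, visit_ids, concepts


H, visit_ids, concepts = build_incidence(visits)
```

A co-clustering method in this setting would assign cluster labels to both the rows (visits) and the columns (concepts) of `H`; how TACCO does so is described in the paper rather than in this sketch.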
