Mining Electronic Health Records to Investigate Effectiveness of Ensemble Deep Clustering

Manar D. Samad; Yina Hou; Shrabani Ghosh

arXiv:2604.07085·cs.LG·April 9, 2026

TL;DR

This study compares traditional, hybrid, and deep clustering methods on heart failure patient cohorts drawn from real EHR data, and introduces an ensemble deep clustering approach that improves performance by aggregating multiple embeddings and combining them with traditional methods.

Contribution

The paper proposes an ensemble-based deep clustering method that aggregates multiple embeddings and outperforms individual clustering techniques on real EHR data.
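
The summary does not say how the embeddings are aggregated, so the following is only a minimal sketch of one generic ensemble technique: co-association (evidence accumulation) consensus. It is an assumption for illustration, not the authors' method. Each inner list stands for the cluster labels that one hypothetical embedding assigned to the same patients; the function names `co_association` and `consensus_labels` and the `threshold` parameter are illustrative.

```python
from itertools import combinations


def co_association(partitions):
    """Fraction of base clusterings that put each pair of samples together."""
    n = len(partitions[0])
    m = len(partitions)
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        if len(labels) != n:
            raise ValueError("all partitions must label the same samples")
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / m for c in row] for row in counts]


def consensus_labels(partitions, threshold=0.5):
    """Link samples co-clustered in more than `threshold` of the partitions
    and return the connected components as consensus clusters.

    Assumption: thresholded linking is one simple way to cut the
    co-association matrix; other consensus functions exist.
    """
    co = co_association(partitions)
    n = len(co)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if co[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Hypothetical labels for six patients from three different embeddings.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
]
labels = consensus_labels(partitions)
```

Because the co-association matrix only records whether two samples share a cluster, the arbitrary label numbering of each base clustering does not matter, which is why the second partition can use swapped label ids without affecting the consensus.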

Findings

01. Traditional methods perform robustly on EHR data.

02. Deep clustering benefits from ensemble aggregation of multiple embeddings.

03. Combining traditional and deep clustering improves overall performance.

Abstract

In electronic health records (EHRs), clustering patients and distinguishing disease subtypes are key tasks to elucidate pathophysiology and aid clinical decision-making. However, clustering in healthcare informatics is still based on traditional methods, especially K-means, and has achieved limited success when applied to embedding representations learned by autoencoders as hybrid methods. This paper investigates the effectiveness of traditional, hybrid, and deep learning methods in heart failure patient cohorts using real EHR data from the All of Us Research Program. Traditional clustering methods perform robustly because deep learning approaches are specifically designed for image clustering, a task that differs substantially from the tabular EHR data setting. To address the shortcomings of deep clustering, we introduce an ensemble-based deep clustering approach that aggregates…
