DeepMCAT: Large-Scale Deep Clustering for Medical Image Categorization
Turkay Kart, Wenjia Bai, Ben Glocker, Daniel Rueckert

TL;DR
DeepMCAT introduces an unsupervised clustering method for large-scale medical image datasets, achieving cluster purity above 0.99 and enabling medical image databases to be organized without labeled data.
Contribution
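The paper presents an unsupervised clustering approach for medical images, focused on cardiac MR, that sidesteps two problems with labels: manual annotation is scarce and expensive, and labels derived from patient metadata carry user-originated errors and bias.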
Findings
Achieved over 0.99 cluster purity on large-scale datasets
Effective for both class-balanced and imbalanced datasets
Potential to organize large medical image repositories
Abstract
In recent years, the research landscape of machine learning in medical imaging has changed drastically from supervised to semi-, weakly- or unsupervised methods. This is mainly due to the fact that ground-truth labels are time-consuming and expensive to obtain manually. Generating labels from patient metadata might be feasible but it suffers from user-originated errors which introduce biases. In this work, we propose an unsupervised approach for automatically clustering and categorizing large-scale medical image datasets, with a focus on cardiac MR images, and without using any labels. We investigated the end-to-end training using both class-balanced and imbalanced large-scale datasets. Our method was able to create clusters with high purity and achieved over 0.99 cluster purity on these datasets. The results demonstrate the potential of the proposed method for categorizing unstructured…
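The headline number above is cluster purity. As a minimal sketch, assuming the standard definition of purity (each cluster is credited with its most frequent ground-truth class, and the credited counts are summed and divided by the number of samples), it can be computed as follows. The function name `cluster_purity` and the toy labels are illustrative only and are not taken from the paper; the authors' exact evaluation protocol is not shown in this summary.

```python
from collections import Counter


def cluster_purity(cluster_ids, true_labels):
    """Standard cluster purity (illustrative helper, not the paper's code).

    cluster_ids: cluster assignment for each sample
    true_labels: ground-truth class for each sample (used only for evaluation)
    """
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster_ids and true_labels must have the same length")
    if len(cluster_ids) == 0:
        raise ValueError("at least one sample is required")

    # Count how often each true class appears inside each cluster.
    per_cluster = {}
    for cluster, label in zip(cluster_ids, true_labels):
        per_cluster.setdefault(cluster, Counter())[label] += 1

    # Credit each cluster with the size of its majority class.
    majority_total = sum(max(counts.values()) for counts in per_cluster.values())
    return majority_total / len(cluster_ids)


# Toy example: cluster 0 holds two "a" and one "b"; cluster 1 holds three "b".
purity = cluster_purity([0, 0, 0, 1, 1, 1], ["a", "a", "b", "b", "b", "b"])
print(purity)
```

Purity rewards clusters that are dominated by a single class, but it does not penalize using many clusters: assigning every sample to its own cluster yields a purity of 1.0. For that reason it is usually read together with the number of clusters.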
