Clustering via Self-Supervised Diffusion

Roy Uziel; Irit Chelly; Oren Freifeld; Ari Pakman

arXiv:2507.04283 · cs.AI · July 31, 2025

TL;DR

This paper introduces CLUDI, a novel self-supervised clustering framework that leverages diffusion models and pre-trained Vision Transformers to improve clustering accuracy and robustness in high-dimensional data.

Contribution

The paper presents a new diffusion-based clustering method combining generative diffusion models with Vision Transformer features, using a teacher-student paradigm for stable and accurate clustering.

Findings

01 Achieves state-of-the-art clustering performance on challenging datasets.
02 Demonstrates robustness and adaptability to complex data distributions.
03 Introduces a stochastic diffusion-based data augmentation strategy.

Abstract

Diffusion models, widely recognized for their success in generative tasks, have not yet been applied to clustering. We introduce Clustering via Diffusion (CLUDI), a self-supervised framework that combines the generative power of diffusion models with pre-trained Vision Transformer features to achieve robust and accurate clustering. CLUDI is trained via a teacher-student paradigm: the teacher uses stochastic diffusion-based sampling to produce diverse cluster assignments, which the student refines into stable predictions. This stochasticity acts as a novel data augmentation strategy, enabling CLUDI to uncover intricate structures in high-dimensional data. Extensive evaluations on challenging datasets demonstrate that CLUDI achieves state-of-the-art performance in unsupervised classification, setting new benchmarks in clustering robustness and adaptability to complex data distributions.…
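
The teacher-student idea in the abstract can be illustrated with a toy sketch. This is not the CLUDI implementation: Gaussian noise on a fixed set of logits stands in for the diffusion-based sampler, and the function names, logits, noise level, and sample count are all hypothetical. It only shows how several stochastic teacher assignments might be averaged into one soft target that a student prediction is scored against with cross-entropy.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def teacher_samples(logits, n_samples, noise_std, rng):
    """Hypothetical stand-in for stochastic teacher sampling.

    Gaussian noise on the logits replaces the diffusion-based sampler,
    which is NOT reproduced here.
    """
    return [
        softmax([x + rng.gauss(0.0, noise_std) for x in logits])
        for _ in range(n_samples)
    ]


def average_assignment(samples):
    """Average several soft assignments into one target distribution."""
    k = len(samples[0])
    return [sum(s[j] for s in samples) / len(samples) for j in range(k)]


def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy H(target, predicted), used here as the student's loss."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))


rng = random.Random(0)                # fixed seed so the run is repeatable
teacher_logits = [2.0, 0.5, -1.0]     # made-up scores for 3 clusters
samples = teacher_samples(teacher_logits, n_samples=8, noise_std=0.5, rng=rng)
target = average_assignment(samples)  # soft target from diverse samples
student = softmax([1.5, 0.2, -0.5])   # made-up student prediction
loss = cross_entropy(target, student)
```

In this sketch the variation across `samples` plays the role the abstract assigns to stochasticity as augmentation: the student is compared with an average over diverse assignments rather than a single one.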

Taxonomy

Topics: Advanced Clustering Algorithms Research