Self-Enhanced Image Clustering with Cross-Modal Semantic Consistency
Zihan Li, Wei Sun, Jing Hu, Jianhua Yin, Jianlong Wu, Liqiang Nie

TL;DR
This paper introduces a self-enhanced image clustering framework leveraging cross-modal semantic consistency to improve the alignment of pre-trained models with clustering tasks, significantly boosting performance across multiple datasets.
Contribution
It proposes a novel two-stage framework combining cross-modal semantic consistency and self-enhancement to improve image clustering accuracy using pre-trained models.
Findings
Outperforms existing deep clustering methods on six datasets
Achieves or surpasses state-of-the-art accuracy with smaller models
Effectively aligns pre-trained features with clustering objectives
Abstract
While large language-image pre-trained models like CLIP offer powerful generic features for image clustering, existing methods typically freeze the encoder. This creates a fundamental mismatch between the model's task-agnostic representations and the demands of a specific clustering task, imposing a ceiling on performance. To break this ceiling, we propose a self-enhanced framework based on cross-modal semantic consistency for efficient image clustering. Our framework first builds a strong foundation via Cross-Modal Semantic Consistency and then specializes the encoder through Self-Enhancement. In the first stage, we focus on Cross-Modal Semantic Consistency. By mining consistency between generated image-text pairs at the instance, cluster assignment, and cluster center levels, we train lightweight clustering heads to align with the rich semantics of the pre-trained model. This…
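To make the cluster-assignment level of consistency concrete, here is a minimal sketch in plain Python. The abstract does not give the paper's loss, so the symmetric cross-entropy used here is an assumed stand-in, and the names `softmax`, `cross_entropy`, and `assignment_consistency_loss` are hypothetical. It only illustrates the general idea: an image and its generated text should receive similar soft cluster assignments from the clustering head.

```python
import math


def softmax(logits):
    """Turn one row of cluster logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, pred, eps=1e-12):
    """Cross-entropy H(target, pred) between two distributions."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, pred))


def assignment_consistency_loss(image_logits, text_logits):
    """Mean symmetric cross-entropy between paired soft assignments.

    Assumption: this loss is an illustrative stand-in, not the paper's
    actual objective. Each argument is a list of per-sample logit rows,
    one logit per cluster, with image i paired with text i.
    """
    total = 0.0
    for img, txt in zip(image_logits, text_logits):
        p_img, p_txt = softmax(img), softmax(txt)
        total += 0.5 * (cross_entropy(p_img, p_txt) + cross_entropy(p_txt, p_img))
    return total / len(image_logits)


# A pair that agrees on the cluster scores lower than a pair that disagrees.
agree = assignment_consistency_loss([[4.0, 0.0, 0.0]], [[4.0, 0.0, 0.0]])
disagree = assignment_consistency_loss([[4.0, 0.0, 0.0]], [[0.0, 4.0, 0.0]])
print(agree < disagree)
```

In a real training setup the logits would come from a lightweight head on top of frozen encoder features, and a loss like this would be minimized with respect to the head's parameters only.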
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · COVID-19 diagnosis using AI
