Hierarchical Textual Knowledge for Enhanced Image Clustering
Yijie Zhong, Yunfan Gao, Weipeng Jiang, Haofen Wang

TL;DR
This paper introduces a hierarchical textual knowledge approach using large language models to improve image clustering by incorporating rich semantic information beyond visual features.
Contribution
It proposes a knowledge-enhanced clustering (KEC) method that constructs hierarchical concept-attribute knowledge with the help of large language models and uses it to guide image clustering, outperforming existing methods.
Findings
KEC improves clustering accuracy across 20 datasets.
KEC without training surpasses zero-shot CLIP on 14 datasets.
Using textual knowledge naively can harm clustering performance, whereas KEC remains robust.
Abstract
Image clustering aims to group images in an unsupervised fashion. Traditional methods focus on knowledge from visual space, making it difficult to distinguish between visually similar but semantically different classes. Recent advances in vision-language models enable the use of textual knowledge to enhance image clustering. However, most existing methods rely on coarse class labels or simple nouns, overlooking the rich conceptual and attribute-level semantics embedded in textual space. In this paper, we propose a knowledge-enhanced clustering (KEC) method that constructs a hierarchical concept-attribute structured knowledge with the help of large language models (LLMs) to guide clustering. Specifically, we first condense redundant textual labels into abstract concepts and then automatically extract discriminative attributes for each single concept and similar concept pairs, via…
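To make the "hierarchical concept-attribute" idea concrete, the following is a minimal sketch of how such knowledge could be stored, assuming a two-level layout: attributes for each single concept, plus attributes that separate a concept from a similar one. The class name `ConceptKnowledge`, its methods, and the toy husky/wolf entries are hypothetical illustrations and are not taken from the paper. The abstract above is truncated, so the sketch does not attempt to show how KEC actually uses the knowledge during clustering.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ConceptKnowledge:
    """Hypothetical container for concept-attribute knowledge (not the paper's code)."""

    # concept -> attributes describing that concept on its own
    concept_attributes: Dict[str, List[str]] = field(default_factory=dict)
    # (concept, similar concept) -> attributes of `concept` that tell it apart
    pair_attributes: Dict[Tuple[str, str], List[str]] = field(default_factory=dict)

    def add_concept(self, concept: str, attributes: List[str]) -> None:
        self.concept_attributes.setdefault(concept, []).extend(attributes)

    def add_pair(self, concept: str, other: str, attributes: List[str]) -> None:
        self.pair_attributes.setdefault((concept, other), []).extend(attributes)

    def descriptions_for(self, concept: str) -> List[str]:
        """Return text descriptions for a concept: its name, then its
        single-concept attributes, then its pair-level attributes."""
        texts = [concept]
        texts += [f"{concept}, {a}" for a in self.concept_attributes.get(concept, [])]
        for (name, _other), attrs in self.pair_attributes.items():
            if name == concept:
                texts += [f"{concept}, {a}" for a in attrs]
        return texts


# Toy entries, invented for illustration only.
kb = ConceptKnowledge()
kb.add_concept("husky", ["thick double coat"])
kb.add_pair("husky", "wolf", ["curled tail"])
husky_texts = kb.descriptions_for("husky")
```

In a vision-language pipeline, descriptions like these would typically be passed to a text encoder and compared with image embeddings. Whether KEC does this, and how, is not stated in the portion of the abstract shown here.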
