Interpretable Text-Guided Image Clustering via Iterative Search
Bingchen Zhao, Oisin Mac Aodha

TL;DR
This paper introduces ITGC, an iterative, text-guided image clustering method that generates interpretable visual concepts aligned with user instructions, improving clustering performance across various benchmarks.
Contribution
ITGC is a novel iterative approach that incorporates natural language instructions to produce interpretable, user-aligned image clusters, addressing ambiguity in traditional clustering.
Findings
Outperforms existing text-guided clustering methods on multiple benchmarks.
Generates interpretable visual concepts aligned with user instructions.
Effective in fine-grained classification tasks.
Abstract
Traditional clustering methods aim to group unlabeled data points based on their similarity to each other. However, clustering, in the absence of additional information, is an ill-posed problem as there may be many different, yet equally valid, ways to partition a dataset. Distinct users may want to use different criteria to form clusters in the same data, e.g., shape vs. color. Recently introduced text-guided image clustering methods aim to address this ambiguity by allowing users to specify the criteria of interest using natural language instructions. This instruction provides the necessary context and control needed to obtain clusters that are more aligned with the users' intent. We propose a new text-guided clustering approach named ITGC that uses an iterative discovery process, guided by an unsupervised clustering objective, to generate interpretable visual concepts that better…
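The abstract does not spell out the algorithm, so the following is only a toy sketch of the general idea of an iterative concept search scored by an unsupervised objective. It is not the authors' ITGC implementation. Everything in it is an assumption made for illustration: the synthetic "image" and "concept" vectors stand in for real image and text embeddings, the objective (mean similarity of each image to its best-matching concept) is a k-means-style stand-in, and the greedy selection loop and all function names are hypothetical.

```python
import math
import random

def normalize(v):
    """Scale a vector to unit length so dot products act as cosine similarity."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def assign(images, concepts):
    """For each image, return (index of most similar concept, that similarity)."""
    out = []
    for img in images:
        sims = [dot(img, c) for c in concepts]
        best = max(range(len(sims)), key=sims.__getitem__)
        out.append((best, sims[best]))
    return out

def objective(images, concepts):
    """Assumed unsupervised score: mean similarity to the assigned concept."""
    return sum(s for _, s in assign(images, concepts)) / len(images)

def greedy_concept_search(images, candidates, k):
    """Iteratively add the candidate concept that most improves the objective."""
    selected = []
    for _ in range(k):
        remaining = [j for j in range(len(candidates)) if j not in selected]
        best = max(
            remaining,
            key=lambda j: objective(images, [candidates[i] for i in selected + [j]]),
        )
        selected.append(best)
    chosen = [candidates[i] for i in selected]
    labels = [idx for idx, _ in assign(images, chosen)]
    return selected, labels

# Synthetic stand-ins: 6 candidate concepts (unit basis vectors in 8 dimensions)
# and 60 images drawn around candidates 0, 2 and 4.
rng = random.Random(0)
dim = 8
candidates = [[1.0 if d == j else 0.0 for d in range(dim)] for j in range(6)]
images = [
    normalize([candidates[c][d] + rng.gauss(0.0, 0.05) for d in range(dim)])
    for c in (0, 2, 4)
    for _ in range(20)
]

selected, labels = greedy_concept_search(images, candidates, k=3)
print(sorted(selected), len(set(labels)))
```

In this toy setup each selected index can be read as a named concept, so every cluster carries a human-readable label. In a real pipeline the candidate vectors would come from a text encoder applied to concept phrases proposed under the user's instruction, which this sketch does not attempt.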
Taxonomy
Topics: Image Retrieval and Classification Techniques
