Learning Concise and Descriptive Attributes for Visual Recognition
An Yan, Yu Wang, Yiwu Zhong, Chengyu Dong, Zexue He, Yujie Lu, William Wang, Jingbo Shang, Julian McAuley

TL;DR
This paper introduces a learning-to-search method to identify small, effective attribute sets for visual recognition, improving interpretability and efficiency over large attribute collections generated by language models.
Contribution
The paper proposes a novel approach to select concise attribute subsets that maintain high classification performance, reducing noise and enhancing interpretability in visual recognition tasks.
Findings
Achieves near state-of-the-art performance on the CUB dataset with only 32 attributes.
Large sets of LLM-generated attributes contain significant noise and redundancy.
Concise attribute sets improve interpretability and interactivity for human users.
Abstract
Recent advances in foundation models present new opportunities for interpretable visual recognition -- one can first query Large Language Models (LLMs) to obtain a set of attributes that describe each class, then apply vision-language models to classify images via these attributes. Pioneering work shows that querying thousands of attributes can achieve performance competitive with image features. However, our further investigation on 8 datasets reveals that LLM-generated attributes in a large quantity perform almost the same as random words. This surprising finding suggests that significant noise may be present in these attributes. We hypothesize that there exist subsets of attributes that can maintain the classification performance with much smaller sizes, and propose a novel learning-to-search method to discover those concise sets of attributes. As a result, on the CUB dataset, our…
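The two-stage pipeline described in the abstract (score each image against a set of attributes, then classify from those scores) can be illustrated with a minimal sketch. It is not the paper's learning-to-search method: the embeddings are random stand-ins for vision-language model outputs, the attribute subset is picked arbitrarily rather than learned, and every name here (`attribute_scores`, `classify_with_subset`, `class_weights`) is hypothetical.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    # Scale vectors to unit length so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def attribute_scores(image_emb, attr_emb):
    # Cosine similarity of every image with every attribute: (n_images, n_attrs).
    return l2_normalize(image_emb) @ l2_normalize(attr_emb).T

def classify_with_subset(image_emb, attr_emb, subset, class_weights):
    # Keep only the chosen attributes, score images on them,
    # then apply a linear head over the attribute scores.
    scores = attribute_scores(image_emb, attr_emb[subset])  # (n_images, k)
    logits = scores @ class_weights                         # (n_images, n_classes)
    return scores, logits.argmax(axis=1)

# Assumption: random vectors stand in for real image and attribute embeddings.
rng = np.random.default_rng(0)
image_emb = rng.normal(size=(5, 16))      # 5 images, 16-dim embeddings
attr_emb = rng.normal(size=(100, 16))     # pool of 100 candidate attributes
subset = np.arange(32)                    # arbitrary 32-attribute subset, not a learned one
class_weights = rng.normal(size=(32, 3))  # untrained linear head for 3 classes

scores, preds = classify_with_subset(image_emb, attr_emb, subset, class_weights)
```

Because the classifier only sees the `k` attribute scores, each prediction can be traced back to a short list of named attributes, which is the interpretability benefit the summary points to.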
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Explainable Artificial Intelligence (XAI)
