WARM-CAT: Warm-Started Test-Time Comprehensive Knowledge Accumulation for Compositional Zero-Shot Learning
Xudong Yan, Songhe Feng, Jiaxin Wang, Xin Su, Yi Jin

TL;DR
WARM-CAT is a test-time knowledge accumulation method for compositional zero-shot learning (CZSL). It updates textual and visual prototypes from unlabeled test data using an adaptive update weight, keeps a dynamic priority queue of high-confidence images, and is accompanied by a new benchmark dataset, C-Fashion.
Contribution
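The paper proposes a test-time approach that accumulates textual and visual knowledge from unlabeled test data, controls the degree of prototype adjustment with an adaptive update weight, and maintains a dynamic priority queue of high-confidence images from which visual prototypes are derived.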
Findings
Achieves state-of-the-art results on four benchmark datasets.
Effectively handles distribution shifts during testing.
Introduces the C-Fashion dataset for more reliable evaluation.
Abstract
Compositional Zero-Shot Learning (CZSL) aims to recognize novel attribute-object compositions based on the knowledge learned from seen ones. Existing methods suffer from performance degradation caused by the distribution shift of label space at test time, which stems from the inclusion of unseen compositions recombined from attributes and objects. To overcome the challenge, we propose a novel approach that accumulates comprehensive knowledge in both textual and visual modalities from unsupervised data to update multimodal prototypes at test time. Building on this, we further design an adaptive update weight to control the degree of prototype adjustment, enabling the model to flexibly adapt to distribution shift during testing. Moreover, a dynamic priority queue is introduced that stores high-confidence images to acquire visual prototypes from historical images for inference. Since the…
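The two mechanisms named in the abstract, a confidence-dependent prototype update and a queue of high-confidence images, can be illustrated with a minimal sketch. This is not the authors' implementation: the entropy-based confidence score, the `max_step` parameter, the fixed queue capacity, the averaging of stored features, and all function and class names below are assumptions made purely for illustration.

```python
import heapq
import math


def normalize(v):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)


def entropy_confidence(probs):
    """Assumed confidence score: 1 minus normalized entropy.

    Gives 1.0 for a one-hot prediction and 0.0 for a uniform one.
    """
    if len(probs) < 2:
        return 1.0
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return 1.0 - h / math.log(len(probs))


def update_prototype(proto, feature, confidence, max_step=0.1):
    """Move a prototype toward a test feature.

    The step size grows with prediction confidence (an assumed rule,
    standing in for the paper's adaptive update weight).
    """
    w = max_step * confidence
    mixed = [(1 - w) * p + w * f for p, f in zip(proto, feature)]
    return normalize(mixed)


class ConfidenceQueue:
    """Fixed-capacity store that keeps the most confident features for one class."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []   # min-heap of (confidence, insertion index, feature)
        self._count = 0   # unique tie-breaker so features are never compared

    def push(self, confidence, feature):
        item = (confidence, self._count, feature)
        self._count += 1
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, item)
        elif confidence > self._heap[0][0]:
            # Evict the least confident stored feature.
            heapq.heapreplace(self._heap, item)

    def visual_prototype(self):
        """Unit-norm mean of the stored features, or None if the queue is empty."""
        if not self._heap:
            return None
        dim = len(self._heap[0][2])
        size = len(self._heap)
        mean = [sum(f[i] for _, _, f in self._heap) / size for i in range(dim)]
        return normalize(mean)
```

In this sketch, a test image whose predicted distribution is nearly uniform leaves the prototype almost unchanged, while a confident prediction moves it by up to `max_step`. The queue evicts its least confident entry once full, so the visual prototype is always computed from the most confident images seen so far.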
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications
