TOMCAT: Test-time Comprehensive Knowledge Accumulation for Compositional Zero-Shot Learning
Xudong Yan, Songhe Feng

TL;DR
TOMCAT introduces a test-time knowledge accumulation method for compositional zero-shot learning. It accumulates textual and visual knowledge from unsupervised test-time data and adaptively updates multimodal prototypes to improve recognition of unseen attribute-object pairs.
Contribution
The paper proposes a novel test-time approach that accumulates multimodal knowledge and adaptively updates prototypes to handle distribution shifts in CZSL.
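To make the idea of an adaptive prototype update concrete, here is a minimal sketch. It is not the authors' formulation: the entropy-based weighting rule, the function names, and the `max_weight` parameter are all assumptions chosen for illustration. It only shows one plausible way an update weight could shrink when the model's prediction is uncertain.

```python
import math


def normalized_entropy(probs):
    """Shannon entropy of a probability vector, scaled to [0, 1]."""
    n = len(probs)
    if n < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(n)


def update_prototype(prototype, feature, probs, max_weight=0.1):
    """Blend a test-time feature into a prototype.

    Hypothetical rule (an assumption, not taken from the paper):
    the update weight is max_weight for a fully confident prediction
    and falls to 0 as the prediction approaches uniform.
    """
    weight = max_weight * (1.0 - normalized_entropy(probs))
    updated = [(1.0 - weight) * p + weight * f
               for p, f in zip(prototype, feature)]
    return updated, weight


# A confident prediction moves the prototype; a uniform one leaves it alone.
confident, w_conf = update_prototype([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])
uncertain, w_unc = update_prototype([1.0, 0.0], [0.0, 1.0], [0.5, 0.5])
```

Under this assumed rule, ambiguous test images barely change the prototypes, which is one way to limit drift when the label space shifts.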
Findings
Achieves state-of-the-art results on four benchmark datasets.
Effective in both closed-world and open-world settings.
Demonstrates robustness to distribution shifts during testing.
Abstract
Compositional Zero-Shot Learning (CZSL) aims to recognize novel attribute-object compositions based on the knowledge learned from seen ones. Existing methods suffer from performance degradation caused by the distribution shift of label space at test time, which stems from the inclusion of unseen compositions recombined from attributes and objects. To overcome the challenge, we propose a novel approach that accumulates comprehensive knowledge in both textual and visual modalities from unsupervised data to update multimodal prototypes at test time. Building on this, we further design an adaptive update weight to control the degree of prototype adjustment, enabling the model to flexibly adapt to distribution shift during testing. Moreover, a dynamic priority queue is introduced that stores high-confidence images to acquire visual knowledge from historical images for inference. Considering…
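The priority queue of high-confidence images mentioned in the abstract can be illustrated with a small sketch. The abstract does not specify the queue's details, so everything here is an assumption: the class name `ConfidenceQueue`, the fixed `capacity`, the eviction rule (drop the lowest-confidence entry), and the use of a mean feature as the stored visual knowledge.

```python
import heapq
import itertools


class ConfidenceQueue:
    """Bounded queue that keeps the `capacity` highest-confidence features.

    Hypothetical structure for illustration; not the paper's implementation.
    """

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._heap = []  # min-heap: lowest-confidence entry sits at index 0
        self._counter = itertools.count()  # tie-breaker for equal confidences

    def push(self, confidence, feature):
        """Insert a feature; return True if it was stored."""
        entry = (confidence, next(self._counter), list(feature))
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
            return True
        if confidence > self._heap[0][0]:
            # Evict the current lowest-confidence entry.
            heapq.heapreplace(self._heap, entry)
            return True
        return False

    def mean_feature(self):
        """Average of the stored features, or None if the queue is empty."""
        if not self._heap:
            return None
        dim = len(self._heap[0][2])
        sums = [0.0] * dim
        for _, _, feat in self._heap:
            for i, value in enumerate(feat):
                sums[i] += value
        return [s / len(self._heap) for s in sums]


queue = ConfidenceQueue(capacity=2)
results = [
    queue.push(0.5, [1.0, 0.0]),
    queue.push(0.9, [0.0, 1.0]),
    queue.push(0.3, [5.0, 5.0]),  # rejected: lower than everything stored
    queue.push(0.7, [1.0, 1.0]),  # accepted: evicts the 0.5 entry
]
visual_prototype = queue.mean_feature()
```

A min-heap keeps the weakest stored entry at the front, so each insertion costs O(log capacity). In practice one such queue per composition label would be a natural arrangement, though that too is an assumption.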
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · COVID-19 diagnosis using AI
