TransZero: Attribute-guided Transformer for Zero-Shot Learning
Shiming Chen, Ziming Hong, Yang Liu, Guo-Sen Xie, Baigui Sun, Hao Li, Qinmu Peng, Ke Lu, Xinge You

TL;DR
TransZero introduces an attribute-guided Transformer that enhances zero-shot learning by improving attribute localization and transferability of visual features, achieving state-of-the-art results on multiple benchmarks.
Contribution
The paper proposes TransZero, a novel Transformer-based model that refines visual features and localizes attributes to improve zero-shot learning performance.
Findings
Achieves state-of-the-art results on three ZSL benchmarks.
Effectively localizes attributes within images.
Improves transferability of visual features across datasets.
Abstract
Zero-shot learning (ZSL) aims to recognize novel classes by transferring semantic knowledge from seen classes to unseen ones. Semantic knowledge is learned from attribute descriptions shared between different classes, which act as strong priors for localizing object attributes that represent discriminative region features, enabling significant visual-semantic interaction. Although some attention-based models have attempted to learn such region features in a single image, the transferability and discriminative attribute localization of visual features are typically neglected. In this paper, we propose an attribute-guided Transformer network, termed TransZero, to refine visual features and learn attribute localization for discriminative visual embedding representations in ZSL. Specifically, TransZero takes a feature augmentation encoder to alleviate the cross-dataset bias between ImageNet…
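The idea described in the abstract, using shared attribute descriptions to localize image regions and then recognizing a class from its attributes, can be illustrated with a small sketch. This is not the TransZero architecture (the abstract above is truncated and gives no implementation details); it is a minimal, assumed illustration of attribute-guided attention followed by attribute-based classification. All function names, the toy feature vectors, and the two example classes are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attribute_attention(attr_queries, region_feats):
    """Each attribute embedding (query) attends over image region features.

    Returns one attention map per attribute (weights over regions) and one
    score per attribute (query dotted with the attended feature).
    """
    scale = math.sqrt(len(region_feats[0]))  # scaled dot-product attention
    maps, scores = [], []
    for q in attr_queries:
        weights = softmax([dot(q, r) / scale for r in region_feats])
        attended = [
            sum(w * r[d] for w, r in zip(weights, region_feats))
            for d in range(len(q))
        ]
        maps.append(weights)
        scores.append(dot(q, attended))
    return maps, scores

def cosine(u, v):
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    return dot(u, v) / (nu * nv) if nu and nv else 0.0

def predict_class(attr_scores, class_attributes):
    """Pick the class whose attribute vector best matches the image's scores."""
    return max(class_attributes,
               key=lambda name: cosine(attr_scores, class_attributes[name]))

# Toy data (assumed): 2 attributes, 3 image regions, feature dimension 2.
attr_queries = [[1.0, 0.0], [0.0, 1.0]]        # e.g. "striped", "winged"
region_feats = [[4.0, 0.0], [0.0, 0.0], [0.0, 0.5]]
class_attributes = {"zebra": [1.0, 0.0], "sparrow": [0.0, 1.0]}

maps, attr_scores = attribute_attention(attr_queries, region_feats)
predicted = predict_class(attr_scores, class_attributes)
```

Because the class attribute vectors are the only class-specific input to `predict_class`, a class with no training images can still be predicted as long as its attribute description is available, which is the zero-shot setting. The attention map `maps[0]` shows which region the first attribute attends to most strongly.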
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI · Multimodal Machine Learning Applications
Methods: Attention Is All You Need · Linear Layer · Adam · Softmax · Residual Connection · Dropout · Position-Wise Feed-Forward Layer · Layer Normalization · Dense Connections · Byte Pair Encoding
