TL;DR
This paper introduces implicit and explicit attention mechanisms in Zero-Shot Learning to mitigate bias towards seen classes, utilizing self-supervised tasks and Vision Transformers, achieving state-of-the-art results on multiple benchmarks.
Contribution
It presents novel implicit and explicit attention strategies for ZSL, combining self-supervised learning and Vision Transformers to improve bias mitigation and semantic mapping.
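To illustrate what mapping image features to semantic space makes possible at test time, the sketch below assigns an image to the class whose attribute vector is most similar to the attribute vector predicted for that image. The function names, the cosine scoring, and the toy attribute values are assumptions made for illustration; this summary does not state the paper's actual scoring function.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def predict_class(predicted_attributes, class_attributes):
    """Return the class whose attribute vector best matches the prediction.

    class_attributes maps class name -> attribute vector. It may include
    classes never seen during training, which is what makes the scheme
    zero-shot. All names and values here are hypothetical.
    """
    return max(
        class_attributes,
        key=lambda name: cosine(predicted_attributes, class_attributes[name]),
    )

# Toy attributes: [has_stripes, has_wings, lives_in_water]
class_attributes = {
    "zebra": [1.0, 0.0, 0.0],
    "eagle": [0.0, 1.0, 0.0],
    "dolphin": [0.0, 0.0, 1.0],
}

# Stand-in for the attribute vector a trained model would output for an image.
prediction = predict_class([0.8, 0.1, 0.2], class_attributes)
```

Because classification only needs an attribute vector per class, adding an unseen class amounts to adding one more entry to `class_attributes`.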
Findings
Achieves state-of-the-art harmonic mean on the AWA2, CUB, and SUN datasets.
Implicit attention via a self-supervised rotation task focuses the model on the image features that help solve the task.
Explicit attention via a Vision Transformer improves the mapping of image features to semantic space.
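The harmonic mean mentioned above is commonly computed in generalized ZSL from the accuracy on seen classes and the accuracy on unseen classes. A minimal sketch follows, assuming that definition; the accuracy values are made up to show why the metric penalizes bias towards seen classes and are not results from the paper.

```python
def harmonic_mean(seen_acc, unseen_acc):
    """Harmonic mean of seen-class and unseen-class accuracy."""
    if seen_acc + unseen_acc == 0:
        return 0.0  # avoid division by zero when both accuracies are 0
    return 2 * seen_acc * unseen_acc / (seen_acc + unseen_acc)

# Hypothetical accuracies, for illustration only.
biased = harmonic_mean(0.90, 0.10)    # model strongly biased towards seen classes
balanced = harmonic_mean(0.60, 0.60)  # model that treats both groups equally
```

The biased model has the same arithmetic-mean accuracy as a 0.50/0.50 model, yet its harmonic mean is far lower, so a method can only score well by doing well on unseen classes too.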
Abstract
Most existing Zero-Shot Learning (ZSL) methods focus on learning a compatibility function between the image representation and class attributes. A few others concentrate on learning an image representation that combines local and global features. However, existing approaches still fail to address the bias towards seen classes. In this paper, we propose implicit and explicit attention mechanisms to address this bias in ZSL models. We formulate the implicit attention mechanism as a self-supervised image rotation task, which focuses the model on the specific image features that help solve the task. The explicit attention mechanism is built on the multi-headed self-attention of a Vision Transformer model, which learns to map image features to semantic space during the training stage. We conduct comprehensive experiments on three popular…
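The self-supervised rotation task described in the abstract can be sketched as follows: each image is rotated by a known angle, and the angle index serves as a free training label. The use of four rotations in 90-degree steps, the helper names, and the list-of-rows image format are assumptions for illustration; the abstract does not specify the set of angles.

```python
def rotate90(image):
    """Rotate a 2D grid (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def make_rotation_batch(image, num_rotations=4):
    """Return (rotated_image, label) pairs; label k means k * 90 degrees.

    The labels come for free from the transformation itself, so no human
    annotation is needed (hypothetical helper, not the authors' code).
    """
    batch = []
    current = [list(row) for row in image]  # copy so the input is untouched
    for k in range(num_rotations):
        batch.append((current, k))
        current = rotate90(current)
    return batch

# A tiny 2x2 "image" standing in for real pixel data.
image = [[1, 2],
         [3, 4]]
batch = make_rotation_batch(image)
```

A network trained to predict the label from the rotated image has to attend to orientation-revealing parts of the object, which is the sense in which the task acts as implicit attention.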
Taxonomy
Methods
Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Adam · Residual Connection · Byte Pair Encoding · Dropout · Dense Connections · Label Smoothing · Softmax
