Hybrid Routing Transformer for Zero-Shot Learning

De Cheng; Gerong Wang; Bo Wang; Qiang Zhang; Jungong Han; Dingwen Zhang

arXiv:2203.15310 · cs.CV · March 30, 2022

TL;DR

This paper introduces a hybrid routing transformer (HRT) model for zero-shot learning that combines top-down and bottom-up attention with dynamic and static routing to better align visual features with semantic attributes, improving unseen class recognition.

Contribution

The paper proposes a novel transformer architecture with hybrid routing pathways that enhances semantic alignment in zero-shot learning tasks.

Findings

1. HRT outperforms existing methods on the CUB, SUN, and AWA2 datasets.

2. The hybrid routing approach improves attribute-visual feature correlation.

3. Experimental results demonstrate the effectiveness of the proposed model.

Abstract

Zero-shot learning (ZSL) aims to learn models that can recognize unseen image semantics after training on data with seen semantics. Recent studies either leverage global image features or mine discriminative local patch features to associate the extracted visual features with the semantic attributes. However, because they lack the top-down guidance and semantic alignment needed to ensure that the model attends to the regions truly correlated with each attribute, these methods still face a significant semantic gap between the visual modality and the attribute modality, which makes their predictions on unseen semantics unreliable. To solve this problem, this paper establishes a novel transformer encoder-decoder model, called the hybrid routing transformer (HRT). In the HRT encoder, we embed an active attention, which is constructed by both the bottom-up and the top-down dynamic routing…
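To make the idea of attribute-guided attention more concrete, the following is a minimal NumPy sketch, not the HRT architecture itself. It assumes each semantic attribute has an embedding that acts as a query over local patch features (a plain top-down cross-attention), and that an unseen class is chosen by comparing the resulting attribute scores with per-class attribute signatures. It omits the bottom-up pathway and both routing mechanisms. The function names, tensor shapes, and random inputs are all hypothetical and chosen only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attribute_attention(patches, attr_queries):
    """Attend over patch features once per attribute (hypothetical helper).

    patches:      (P, d) local visual features, one row per image patch
    attr_queries: (A, d) one embedding per semantic attribute
    Returns attribute scores of shape (A,) and attention weights of shape (A, P).
    """
    d = patches.shape[1]
    # Scaled dot-product attention: each attribute distributes weight over patches.
    attn = softmax(attr_queries @ patches.T / np.sqrt(d), axis=1)
    # Attribute-specific visual summary, one row per attribute.
    attended = attn @ patches
    # Score each attribute by how well its summary matches its own embedding.
    scores = np.sum(attended * attr_queries, axis=1)
    return scores, attn


def predict_class(attr_scores, class_attributes):
    """Pick the class whose attribute signature (C, A) best matches the scores."""
    return int(np.argmax(class_attributes @ attr_scores))


# Illustrative random data: 49 patches, 16-dim features, 5 attributes, 3 classes.
rng = np.random.default_rng(0)
patches = rng.normal(size=(49, 16))
attr_queries = rng.normal(size=(5, 16))
class_attributes = rng.random(size=(3, 5))

scores, attn = attribute_attention(patches, attr_queries)
pred = predict_class(scores, class_attributes)
```

Because classes are described only through their attribute signatures, the rows of `class_attributes` can belong to classes never seen during training, which is what makes this style of prediction zero-shot. In a trained model the attribute embeddings and patch features would be learned rather than sampled at random.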

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI · Dental Research and COVID-19

Methods: Attention Is All You Need · Linear Layer · Routing Attention · Residual Connection · Softmax · Dropout · Dense Connections · Multi-Head Attention · Layer Normalization · Adam