Progressive Semantic-Visual Mutual Adaption for Generalized Zero-Shot Learning

Man Liu; Feng Li; Chunjie Zhang; Yunchao Wei; Huihui Bai; Yao Zhao

arXiv:2303.15322 · cs.CV · March 28, 2023 · 1 citation

Open Access · 1 repo

TL;DR

This paper introduces the progressive semantic-visual mutual adaption (PSVMA) network, which uses dual semantic-visual transformer modules (DSVTM) to disambiguate shared attributes and improve knowledge transfer in generalized zero-shot learning (GZSL), addressing both semantic ambiguity and bias.

Contribution

The paper proposes the PSVMA network, built from DSVTM modules, to model semantic-visual interactions more accurately and mitigate bias in GZSL, and reports state-of-the-art performance.

Findings

1. Outperforms existing GZSL methods on benchmark datasets.

2. Effectively reduces semantic ambiguity in attribute-based recognition.

3. Improves generalization to unseen categories.

Abstract

Generalized Zero-Shot Learning (GZSL) identifies unseen categories by knowledge transferred from the seen domain, relying on the intrinsic interactions between visual and semantic information. Prior works mainly localize regions corresponding to the sharing attributes. When various visual appearances correspond to the same attribute, the sharing attributes inevitably introduce semantic ambiguity, hampering the exploration of accurate semantic-visual interactions. In this paper, we deploy the dual semantic-visual transformer module (DSVTM) to progressively model the correspondences between attribute prototypes and visual features, constituting a progressive semantic-visual mutual adaption (PSVMA) network for semantic disambiguation and knowledge transferability improvement. Specifically, DSVTM devises an instance-motivated semantic encoder that learns instance-centric prototypes to adapt…
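The idea of adapting shared attribute prototypes to a specific image can be illustrated with a generic cross-attention step. The following is a minimal sketch, not the authors' DSVTM: it assumes plain scaled dot-product attention with a residual update, and the names `adapt_prototypes`, `softmax`, and `dot` are hypothetical helpers introduced only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def adapt_prototypes(prototypes, patches):
    """Illustrative cross-attention (an assumption, not the paper's module).

    Each attribute prototype acts as a query over the patch features of
    one image and is updated with the attended visual content, so the
    same shared attribute yields a different, instance-specific
    prototype for each image.
    """
    dim = len(patches[0])
    scale = math.sqrt(dim)
    adapted = []
    for proto in prototypes:
        # Attention weights of this attribute over all image patches.
        weights = softmax([dot(proto, patch) / scale for patch in patches])
        # Weighted sum of patch features for this attribute.
        attended = [
            sum(w * patch[d] for w, patch in zip(weights, patches))
            for d in range(dim)
        ]
        # Residual update keeps the shared prototype and adds image evidence.
        adapted.append([p + a for p, a in zip(proto, attended)])
    return adapted
```

Calling `adapt_prototypes` with the same prototypes but patch features from two different images returns two different sets of prototypes, which is the sense in which a prototype becomes instance-centric in this sketch.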

Code & Models

Repositories

manliucoder/psvma (PyTorch, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Mycobacterium research and diagnosis