Visual-Augmented Dynamic Semantic Prototype for Generative Zero-Shot Learning

Wenjin Hou; Shiming Chen; Shuhuang Chen; Ziming Hong; Yan Wang; Xuetao Feng; Salman Khan; Fahad Shahbaz Khan; Xinge You

arXiv:2404.14808 · cs.CV · April 24, 2024

TL;DR

This paper introduces VADS, a novel method for generative zero-shot learning that leverages visual-augmented knowledge to improve semantic-visual mapping, resulting in better generalization to unseen classes.

Contribution

VADS integrates visual-aware domain knowledge learning and vision-oriented semantic updating to enhance zero-shot learning performance.

Findings

01

VADS outperforms state-of-the-art methods on SUN, CUB, and AWA2 datasets.

02

Achieves average improvements of 6.4%, 5.9%, and 4.2% on SUN, CUB, and AWA2, respectively.

03

Demonstrates superior results in both conventional ZSL (CZSL) and generalized ZSL (GZSL) settings.

Abstract

Generative zero-shot learning (ZSL) learns a generator to synthesize visual samples for unseen classes, which is an effective way to advance ZSL. However, existing generative methods condition the generator on Gaussian noise and a predefined semantic prototype, so the generator is optimized only on specific seen classes rather than characterizing each visual instance, which results in poor generalization (e.g., overfitting to seen classes). To address this issue, we propose a novel Visual-Augmented Dynamic Semantic prototype method (termed VADS) that helps the generator learn an accurate semantic-visual mapping by fully incorporating visual-augmented knowledge into the semantic conditions. In detail, VADS consists of two modules: (1) the Visual-aware Domain Knowledge Learning module (VDKL) learns the local bias and global prior of the visual features (referred to as domain visual…
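
The conventional conditioning scheme that the abstract criticizes can be sketched roughly as follows. This is a toy illustration, not the authors' implementation and not VADS itself: the dimensions, the untrained single-layer generator, and the names `build_condition` and `generate` are all assumptions made for the example. It shows only that every synthetic sample of a class shares the same static prototype, with Gaussian noise as the sole source of per-sample variation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes chosen for illustration only.
NOISE_DIM, ATTR_DIM, FEAT_DIM = 8, 5, 16

# Hypothetical, untrained generator weights (one linear layer + ReLU).
W = rng.normal(scale=0.1, size=(NOISE_DIM + ATTR_DIM, FEAT_DIM))


def build_condition(prototype, n_samples, rng):
    """Concatenate fresh Gaussian noise with a fixed class prototype."""
    noise = rng.normal(size=(n_samples, NOISE_DIM))
    tiled = np.tile(prototype, (n_samples, 1))  # same prototype on every row
    return np.concatenate([noise, tiled], axis=1)


def generate(condition):
    """Map [noise, prototype] conditions to synthetic visual features."""
    return np.maximum(condition @ W, 0.0)


# A predefined semantic prototype (e.g., an attribute vector) for one unseen class.
unseen_prototype = rng.uniform(size=ATTR_DIM)

cond = build_condition(unseen_prototype, n_samples=4, rng=rng)
fake_features = generate(cond)

# The semantic part of the condition is identical for all four samples.
print(np.allclose(cond[:, NOISE_DIM:], unseen_prototype))
print(fake_features.shape)
```

In a generative ZSL pipeline, features synthesized this way for unseen classes would then be used to train an ordinary classifier; the abstract's argument is that a static, instance-agnostic condition of this kind limits how well the generator generalizes.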

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning