Leveraging the Invariant Side of Generative Zero-Shot Learning
Jingjing Li, Mengmeng Jin, Ke Lu, Zhengming Ding, Lei Zhu, Zi Huang

TL;DR
This paper introduces LisGAN, a generative zero-shot learning method that synthesizes unseen features conditioned on semantic descriptions, utilizing invariant soul samples to improve recognition accuracy.
Contribution
The paper proposes a novel invariant side regularization using soul samples in GAN-based ZSL, enhancing the generation of semantically meaningful features for unseen classes.
Findings
Outperforms state-of-the-art methods on five benchmarks
Uses a cascade classifier for coarse-to-fine recognition
Regularizes generated samples to be close to soul samples
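To make the last finding concrete, here is a minimal sketch of a penalty that pulls generated features toward soul samples. It is an illustration under stated assumptions, not the paper's formulation. The sketch takes a soul sample to be the mean of a class's real features and penalizes the squared distance from each generated feature to its nearest soul sample. The names `mean_vector`, `squared_distance`, and `soul_regularizer` are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def mean_vector(vectors: Sequence[Vector]) -> List[float]:
    """Element-wise mean of equally sized vectors."""
    count = len(vectors)
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / count for d in range(dim)]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def soul_regularizer(generated: Sequence[Vector], souls: Sequence[Vector]) -> float:
    """Average squared distance from each generated feature to its nearest soul sample.

    Assumption: this nearest-soul penalty is one plausible reading of
    "close to soul samples"; the paper's exact regularizer may differ.
    """
    nearest = [min(squared_distance(g, s) for s in souls) for g in generated]
    return sum(nearest) / len(generated)


# Toy data for one class (all values are made up for illustration).
real_features = [[1.0, 0.0], [3.0, 0.0]]
souls = [mean_vector(real_features)]    # assumption: one soul sample = class mean
generated = [[2.0, 1.0], [2.0, -1.0]]   # stand-ins for generator outputs
penalty = soul_regularizer(generated, souls)
```

In a training loop, a term like `penalty` would be added to the generator loss with a weighting coefficient, so that synthesized features stay near the class-level anchor.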
Abstract
Conventional zero-shot learning (ZSL) methods generally learn an embedding, e.g., a visual-semantic mapping, to handle unseen visual samples in an indirect manner. In this paper, we take advantage of generative adversarial networks (GANs) and propose a novel method, named leveraging invariant side GAN (LisGAN), which can directly generate unseen features from random noise conditioned on the semantic descriptions. Specifically, we train a conditional Wasserstein GAN in which the generator synthesizes fake unseen features from noise and the discriminator distinguishes fake from real via a minimax game. Considering that one semantic description can correspond to various synthesized visual samples, and the semantic description, figuratively, is the soul of the generated features, we introduce soul samples as the invariant side of generative zero-shot learning in…
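The conditional Wasserstein setup described in the abstract can be sketched roughly as follows. This is a toy sketch, not the authors' architecture. It assumes a linear generator and a linear critic that each take the semantic description as an extra input. It also leaves out the Lipschitz constraint (for example, a gradient penalty or weight clipping) that a real Wasserstein GAN needs. The names `generate`, `critic`, and `wasserstein_losses` are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def generate(noise: Sequence[float], semantics: Sequence[float],
             weights: Sequence[Sequence[float]]) -> List[float]:
    """Toy linear generator: maps [noise; semantics] to a visual feature."""
    inputs = list(noise) + list(semantics)
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def critic(feature: Sequence[float], semantics: Sequence[float],
           weights: Sequence[float]) -> float:
    """Toy linear critic: scores a (feature, semantics) pair."""
    inputs = list(feature) + list(semantics)
    return sum(w * x for w, x in zip(weights, inputs))


def wasserstein_losses(real_scores: Sequence[float],
                       fake_scores: Sequence[float]) -> Tuple[float, float]:
    """Return (critic_loss, generator_loss) for the Wasserstein minimax game.

    The critic minimizes mean(fake) - mean(real); the generator minimizes
    -mean(fake). No Lipschitz constraint is applied here (assumption: toy only).
    """
    mean_real = sum(real_scores) / len(real_scores)
    mean_fake = sum(fake_scores) / len(fake_scores)
    return mean_fake - mean_real, -mean_fake


rng = random.Random(0)
semantics = [1.0, 0.0]        # made-up class description
noise_dim, feat_dim = 2, 3    # made-up sizes
gen_weights = [[rng.uniform(-1, 1) for _ in range(noise_dim + len(semantics))]
               for _ in range(feat_dim)]
critic_weights = [rng.uniform(-1, 1) for _ in range(feat_dim + len(semantics))]

real_batch = [[0.5, 0.1, 0.9], [0.4, 0.2, 1.0]]  # stand-ins for real features
fake_batch = [generate([rng.gauss(0, 1) for _ in range(noise_dim)],
                       semantics, gen_weights) for _ in real_batch]

real_scores = [critic(x, semantics, critic_weights) for x in real_batch]
fake_scores = [critic(x, semantics, critic_weights) for x in fake_batch]
critic_loss, generator_loss = wasserstein_losses(real_scores, fake_scores)
```

Because the noise vector changes while the semantic description stays fixed, one description yields many synthesized features. This is the one-to-many relationship that motivates the soul samples.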
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis
