Generative Model-driven Structure Aligning Discriminative Embeddings for Transductive Zero-shot Learning
Omkar Gune, Mainak Pal, Preeti Mukherjee, Biplab Banerjee, Subhasis Chaudhuri

TL;DR
This paper introduces a neural network-based model for zero-shot learning that aligns visual and semantic data in a discriminative latent space, utilizing transductive learning to mitigate domain shift and improve performance on standard benchmarks.
Contribution
A novel transductive approach using a conditional variational auto-encoder to generate semantic features for unseen classes, enhancing zero-shot learning performance.
Findings
Superior performance on benchmark datasets AWA1, AWA2, CUB, SUN, FLO, and APY.
Remains effective for ZSL in low-labeled-data regimes.
Reduces projection domain shift using unlabeled unseen class data.
Abstract
Zero-shot Learning (ZSL) is a transfer learning technique that aims to transfer knowledge from seen classes to unseen classes. This knowledge transfer is possible because of an underlying semantic space that is common to seen and unseen classes. Most existing approaches use labelled seen-class data to learn a projection function that maps visual data to semantic data. In this work, we propose a shallow but effective neural network-based model for learning such a projection function, which aligns the visual and semantic data in the latent space while simultaneously making the latent space embeddings discriminative. Because this projection function is learned using only seen-class data, the so-called projection domain shift exists. We propose a transductive approach to reduce the effect of domain shift, where we utilize unlabeled visual data from unseen classes to generate corresponding…
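The idea of aligning visual and semantic data in a shared latent space while keeping the embeddings discriminative can be illustrated with a minimal NumPy sketch. This is not the authors' model: the linear projections `Wv` and `Ws`, the squared-distance alignment term, the softmax cross-entropy term, the weight `lam`, and the function names are all assumptions chosen for illustration, and the transductive, generative part of the method is not covered.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def joint_loss(X, y, A, Wv, Ws, lam=1.0):
    """Hypothetical alignment + discriminative loss for one batch.

    X:  (n, dv) visual features
    y:  (n,)    seen-class indices
    A:  (c, ds) per-class semantic vectors (e.g. attributes)
    Wv: (dv, k) visual projection, Ws: (ds, k) semantic projection
    """
    Zv = X @ Wv                      # visual embeddings in the latent space
    Zs = A @ Ws                      # class prototypes in the latent space
    # Alignment: pull each visual embedding toward its own class prototype.
    align = np.mean(np.sum((Zv - Zs[y]) ** 2, axis=1))
    # Discrimination: cross-entropy over similarities to all prototypes.
    probs = softmax(Zv @ Zs.T)
    ce = -np.mean(np.log(probs[np.arange(len(y)), y] + 1e-12))
    return align + lam * ce

def predict(X, A_unseen, Wv, Ws):
    # Zero-shot inference: nearest unseen-class prototype in the latent space.
    Zv = X @ Wv
    Zs = A_unseen @ Ws
    dists = ((Zv[:, None, :] - Zs[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)
```

In such a sketch, `joint_loss` would be minimized over `Wv` and `Ws` on seen-class data, and `predict` would then be applied to unseen-class features using only the unseen classes' semantic vectors.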
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Human Pose and Action Recognition
