Generative Model-driven Structure Aligning Discriminative Embeddings for Transductive Zero-shot Learning

Omkar Gune; Mainak Pal; Preeti Mukherjee; Biplab Banerjee; Subhasis Chaudhuri

arXiv:2005.04492·cs.CV·May 12, 2020·1 cites

Open Access

TL;DR

This paper introduces a neural network-based model for zero-shot learning that aligns visual and semantic data in a discriminative latent space, utilizing transductive learning to mitigate domain shift and improve performance on standard benchmarks.

Contribution

A novel transductive approach using a conditional variational auto-encoder to generate semantic features for unseen classes, enhancing zero-shot learning performance.

Findings

1. Superior performance on benchmark datasets AWA1, AWA2, CUB, SUN, FLO, and APY.
2. Effective in low labeled data regimes for ZSL.
3. Reduces projection domain shift using unlabeled unseen class data.

Abstract

Zero-shot Learning (ZSL) is a transfer learning technique that aims to transfer knowledge from seen classes to unseen classes. This transfer is possible because of an underlying semantic space that is common to seen and unseen classes. Most existing approaches use labelled seen class data to learn a projection function that maps visual data to semantic data. In this work, we propose a shallow but effective neural network-based model for learning such a projection function, which aligns the visual and semantic data in the latent space while simultaneously making the latent space embeddings discriminative. Because this projection function is learned using the seen class data, the so-called projection domain shift exists. We propose a transductive approach to reduce the effect of domain shift, where we utilize unlabeled visual data from unseen classes to generate corresponding…
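As a rough illustration of the idea in the abstract, the sketch below projects visual features and class semantic vectors into a shared latent space and scores them with two terms: one that aligns each sample with its class embedding and one that makes the embeddings discriminative. Everything here is an assumption made for illustration: the linear maps `W_v` and `W_s`, the squared-distance alignment term, the softmax cross-entropy term, the `weight` parameter, and the random toy data. The abstract does not specify the paper's actual architecture or losses, and the transductive step that uses unseen class data is not covered.

```python
import numpy as np


def project(x, W):
    """Linear map into the latent space, followed by L2 normalisation."""
    z = x @ W
    return z / np.linalg.norm(z, axis=-1, keepdims=True)


def joint_loss(vis, labels, attrs, W_v, W_s, weight=1.0):
    """Illustrative objective: alignment term plus a discriminative term."""
    z_v = project(vis, W_v)    # (n, d) latent visual embeddings
    z_s = project(attrs, W_s)  # (c, d) latent class embeddings
    # Alignment: pull each sample toward the embedding of its own class.
    align = np.mean(np.sum((z_v - z_s[labels]) ** 2, axis=1))
    # Discrimination: cross-entropy over similarities to every class embedding.
    logits = z_v @ z_s.T
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    ce = -np.mean(log_prob[np.arange(len(labels)), labels])
    return align + weight * ce


def predict(vis, attrs, W_v, W_s):
    """Assign each sample to the class with the most similar latent embedding."""
    return np.argmax(project(vis, W_v) @ project(attrs, W_s).T, axis=1)


# Toy data (hypothetical sizes): 8 samples, 3 classes.
rng = np.random.default_rng(0)
n, c, dv, ds, d = 8, 3, 16, 5, 4
vis = rng.normal(size=(n, dv))      # visual features
attrs = rng.normal(size=(c, ds))    # one semantic vector per class
labels = rng.integers(0, c, size=n)
W_v = rng.normal(size=(dv, d))      # untrained visual projection
W_s = rng.normal(size=(ds, d))      # untrained semantic projection

loss = joint_loss(vis, labels, attrs, W_v, W_s)
preds = predict(vis, attrs, W_v, W_s)
```

In a zero-shot setting, `predict` would be called with the semantic vectors of unseen classes in `attrs`, which is where a projection learned from seen classes can suffer from the domain shift the abstract describes.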

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Human Pose and Action Recognition