Transductive Zero-Shot Learning using Cross-Modal CycleGAN
Patrick Bordes, Eloi Zablocki, Benjamin Piwowarski, Patrick Gallinari

TL;DR
This paper introduces a novel Cross-Modal CycleGAN model for transductive zero-shot learning that effectively aligns unseen class labels with visual data, achieving state-of-the-art results on large-scale image and language tasks.
Contribution
The paper proposes a scalable CycleGAN-based approach for transductive ZSL that handles high class counts and aligns unseen labels with images using adversarial and cycle-consistency objectives.
Findings
Achieves state-of-the-art results on ImageNet T-ZSL
Validates effectiveness on language grounding tasks
Introduces zero-shot sentence-to-image matching on MS COCO
Abstract
In Computer Vision, Zero-Shot Learning (ZSL) aims at classifying unseen classes, i.e. classes for which no matching training image exists. Most ZSL approaches learn a cross-modal mapping between images and class labels for seen classes. However, the data distributions of seen and unseen classes might differ, causing a domain shift problem. Following this observation, transductive ZSL (T-ZSL) assumes that unseen classes and their associated images are known during training, but not their correspondence. As current T-ZSL approaches do not scale efficiently when the number of seen classes is high, we tackle this problem with a new model for T-ZSL based upon CycleGAN. Our model jointly (i) projects images onto their seen class labels with a supervised objective and (ii) aligns unseen class labels and visual exemplars with adversarial and cycle-consistency objectives. We show the efficiency of our…
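To make the joint objective described in the abstract more concrete, here is a minimal sketch of how a supervised term on seen classes could be combined with adversarial and cycle-consistency terms on unseen data. It is an illustration under assumptions, not the authors' implementation: the linear maps F (image to label space) and G (label to image space), the squared-distance losses, the least-squares form of the adversarial term, the critic callables, and the weights lambda_cyc and lambda_adv are all hypothetical choices that the summary above does not specify.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def supervised_loss(F, seen_images, seen_labels):
    """(i) Seen classes: projected image features should match their paired label embeddings."""
    return mean(sq_dist(matvec(F, x), t) for x, t in zip(seen_images, seen_labels))

def cycle_loss(F, G, unseen_images, unseen_labels):
    """(ii) Unseen data: mapping to the other modality and back should recover the input."""
    img_cycle = mean(sq_dist(matvec(G, matvec(F, x)), x) for x in unseen_images)
    txt_cycle = mean(sq_dist(matvec(F, matvec(G, t)), t) for t in unseen_labels)
    return img_cycle + txt_cycle

def adversarial_loss(critic, fake_samples):
    """(ii) Least-squares generator loss (an assumed form): push critic scores on mapped samples toward 1."""
    return mean((critic(s) - 1.0) ** 2 for s in fake_samples)

def total_loss(F, G, critic_text, critic_image,
               seen_images, seen_labels, unseen_images, unseen_labels,
               lambda_cyc=1.0, lambda_adv=1.0):
    """Weighted sum of the three terms; the weights are hypothetical hyperparameters."""
    fake_text = [matvec(F, x) for x in unseen_images]     # unseen images mapped to label space
    fake_images = [matvec(G, t) for t in unseen_labels]   # unseen labels mapped to image space
    adv = (adversarial_loss(critic_text, fake_text)
           + adversarial_loss(critic_image, fake_images))
    return (supervised_loss(F, seen_images, seen_labels)
            + lambda_cyc * cycle_loss(F, G, unseen_images, unseen_labels)
            + lambda_adv * adv)

# Toy usage with 2-dimensional features and identity maps.
identity = [[1.0, 0.0], [0.0, 1.0]]
seen_x = [[1.0, 0.0], [0.0, 1.0]]
seen_t = [[1.0, 0.0], [0.0, 1.0]]
unseen_x = [[0.5, 0.5]]
unseen_t = [[0.2, 0.8]]
fooled_critic = lambda sample: 1.0  # a critic that scores every sample as real
loss = total_loss(identity, identity, fooled_critic, fooled_critic,
                  seen_x, seen_t, unseen_x, unseen_t)
```

Note that the unseen images and unseen labels are never paired in this sketch: only the cycle and adversarial terms touch them, which mirrors the transductive setting where the correspondence is unknown during training.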
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Interpreting and Communication in Healthcare
Methods: Residual Connection · Convolution · Tanh Activation · Sigmoid Activation · Instance Normalization · PatchGAN · Batch Normalization · GAN Least Squares Loss · Cycle Consistency Loss
