Disentangling Semantic-to-visual Confusion for Zero-shot Learning

Zihan Ye; Fuyuan Hu; Fan Lyu; Linyan Li; Kaizhu Huang

arXiv:2106.08605 · cs.CV · June 17, 2021

TL;DR

This paper introduces a novel multi-modal triplet loss and a disentangling class representation GAN to improve zero-shot learning by better disentangling and synthesizing visual features for both seen and unseen classes.

Contribution

It proposes a multi-modal triplet loss and a disentangling GAN framework to enhance feature disentanglement and synthesis in zero-shot learning.

Findings

01 Achieves superior performance on four benchmark datasets.

02 Effectively disentangles class representations for better generalization.

03 Outperforms state-of-the-art methods in ZSL tasks.

Abstract

Using generative models to synthesize visual features from a semantic distribution has been one of the most popular approaches to zero-shot learning (ZSL) image classification in recent years. The triplet loss (TL) is widely used to generate realistic visual distributions from semantics by automatically searching for discriminative representations. However, the traditional TL cannot find reliable disentangled representations for unseen classes, because unseen classes are unavailable during training in ZSL. To alleviate this drawback, we propose a multi-modal triplet loss (MMTL), which uses multi-modal information to search for a disentangled representation space. As a result, all classes can interact, which benefits learning disentangled class representations in the searched space. Furthermore, we develop a novel model called Disentangling Class Representation Generative Adversarial Network (DCR-GAN) focusing on exploiting the…
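For readers unfamiliar with the triplet loss mentioned in the abstract, the following is a minimal sketch of the standard single-modality formulation in plain Python. It is not the paper's MMTL and not code from the DCR-GAN repository; the function names `euclidean` and `triplet_loss`, the default margin, and the toy vectors are illustrative assumptions.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss for one (anchor, positive, negative) triple.

    The loss is zero once the negative is at least `margin` farther from the
    anchor than the positive is; otherwise it grows linearly with the violation.
    The margin value here is an assumption chosen for illustration.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D embeddings (hypothetical values, not taken from the paper).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]        # same class as the anchor, distance 1.0
easy_negative = [3.0, 4.0]   # different class, distance 5.0
hard_negative = [0.0, 1.5]   # different class, distance 1.5

easy = triplet_loss(anchor, positive, easy_negative)  # margin satisfied
hard = triplet_loss(anchor, positive, hard_negative)  # margin violated
```

In this toy setting the easy negative yields zero loss, while the hard negative yields a positive loss because it lies within the margin of the positive. Every element of a triple needs a sample from its class, which is why, as the abstract notes, a loss of this form built only from seen-class visual samples cannot directly shape representations for unseen classes.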

Code & Models

Repositories

FouriYe/DCRGAN-TMM (PyTorch, official)

Taxonomy

Methods: Triplet Loss