Group-based Learning of Disentangled Representations with Generalizability for Novel Contents

Haruo Hosoya

arXiv:1809.02383·cs.LG·January 26, 2021

TL;DR

This paper introduces a group-based variational autoencoder that learns disentangled representations without explicit labels, using only groups of instances that share the same content under different transformations. The learned content representation generalizes to unseen contents and is well separated from the transformation factor.

Contribution

The model learns disentangled representations without explicit attribute labels, relying only on the grouping of same-content instances. This allows generalization to novel contents and improved separation of content and transformation factors.

Findings

01. Learned content representations that are separate from transformations.

02. Achieved generalization to unseen contents across five datasets.

03. Provided insights into the model's invariance and disentanglement mechanisms.

Abstract

Sensory data often comprise independent content and transformation factors. For example, face images may have shapes as content and poses as transformation. To infer these factors separately from given data, various "disentangling" models have been proposed. However, many of these are supervised or semi-supervised, either requiring attribute labels that are often unavailable or disallowing generalization over new contents. In this study, we introduce a novel deep generative model, called group-based variational autoencoders. In this model, we assume no explicit labels, but a weaker form of structure that groups together data instances having the same content but transformed differently; we thereby separately estimate a group-common factor as content and an instance-specific factor as transformation. This approach allows for learning to represent a general continuous space of…
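The grouping idea in the abstract can be illustrated with a toy sketch. This is not the paper's architecture, which is a deep generative model trained variationally; the linear encoder, the weight matrices `w_content` and `w_transform`, and the choice to pool content estimates by simple averaging are all assumptions made here for illustration only. The sketch shows one content code shared by a whole group and one transformation code per instance.

```python
from statistics import fmean


def encode_instance(x, w_content, w_transform):
    """Hypothetical linear encoder: map one instance to a
    (content estimate, transformation code) pair."""
    content = [sum(w * xi for w, xi in zip(row, x)) for row in w_content]
    transform = [sum(w * xi for w, xi in zip(row, x)) for row in w_transform]
    return content, transform


def encode_group(group, w_content, w_transform):
    """Return one group-common content code and one
    instance-specific transformation code per instance."""
    per_instance = [encode_instance(x, w_content, w_transform) for x in group]
    content_estimates = [c for c, _ in per_instance]
    # Assumption: pool the per-instance content estimates by averaging
    # each coordinate across the group.
    group_content = [fmean(col) for col in zip(*content_estimates)]
    transforms = [t for _, t in per_instance]
    return group_content, transforms


# Toy group: the first coordinate plays the role of content (constant
# within the group), the second plays the role of transformation.
group = [[1.0, 0.0], [1.0, 2.0], [1.0, 4.0]]
w_content = [[1.0, 0.0]]    # hypothetical weights reading the content coordinate
w_transform = [[0.0, 1.0]]  # hypothetical weights reading the transformation coordinate

group_content, transforms = encode_group(group, w_content, w_transform)
print(group_content)  # one code for the whole group
print(transforms)     # one code per instance
```

In this toy setting the group code depends only on what the instances share, while the per-instance codes capture what differs between them, which is the separation the abstract describes.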

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Digital Media Forensic Detection · Music and Audio Processing