Learning disentangled representations with the Wasserstein Autoencoder

Benoit Gaujac; Ilya Feige; David Barber

arXiv:2010.03459·stat.ML·October 9, 2020

TL;DR

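This paper introduces TCWAE, a Wasserstein Autoencoder variant that learns disentangled representations by penalizing the total correlation of the latent variables. Separating out the total-correlation term gives explicit control over the trade-off between reconstruction quality and disentanglement, and the model is competitive with state-of-the-art methods on data sets with known generative factors.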

Contribution

The paper proposes TCWAE, a WAE-based model that separates out and explicitly penalizes the total correlation of the latent variables to control disentanglement, while allowing more flexibility in the choice of reconstruction cost.

Findings

01. TCWAE achieves competitive disentanglement scores.

02. Flexible reconstruction improves on complex datasets.

03. Explicit total correlation control enhances disentanglement quality.

Abstract

Disentangled representation learning has undoubtedly benefited from objective function surgery. However, a delicate balancing act of tuning is still required in order to trade off reconstruction fidelity versus disentanglement. Building on previous successes of penalizing the total correlation in the latent variables, we propose TCWAE (Total Correlation Wasserstein Autoencoder). Working in the WAE paradigm naturally enables the separation of the total-correlation term, thus providing disentanglement control over the learned representation, while offering more flexibility in the choice of reconstruction cost. We propose two variants using different KL estimators and perform extensive quantitative comparisons on data sets with known generative factors, showing competitive results relative to state-of-the-art techniques. We further study the trade-off between disentanglement and…
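
To make the penalized quantity concrete: the total correlation of a latent vector is the KL divergence between its joint distribution and the product of its marginals, so it is zero exactly when the latent dimensions are independent. The sketch below is only an illustration under a strong assumption, namely a two-dimensional Gaussian latent, for which the total correlation has a closed form. It is not one of the paper's KL estimators, and the function name `gaussian_total_correlation_2d` is hypothetical.

```python
import math


def gaussian_total_correlation_2d(var1, var2, cov12):
    """Total correlation of a 2-D Gaussian (illustrative assumption, not the paper's estimator).

    For a Gaussian with covariance Sigma, the KL divergence between the joint
    and the product of its marginals is
        0.5 * (log var1 + log var2 - log det Sigma).
    """
    det = var1 * var2 - cov12 ** 2
    if var1 <= 0 or var2 <= 0 or det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    return 0.5 * (math.log(var1) + math.log(var2) - math.log(det))


# Independent dimensions: the joint equals the product of marginals.
tc_independent = gaussian_total_correlation_2d(1.0, 2.0, 0.0)

# Correlated dimensions (unit variances, correlation 0.5): strictly positive.
tc_correlated = gaussian_total_correlation_2d(1.0, 1.0, 0.5)
```

With unit variances and correlation rho, the expression reduces to -0.5 * log(1 - rho^2), which grows without bound as the two dimensions become perfectly dependent. Penalizing this kind of term pushes the latent dimensions toward independence.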
