Wasserstein-Wasserstein Auto-Encoders

Shunkang Zhang; Yuan Gao; Yuling Jiao; Jin Liu; Yang Wang; Can Yang

arXiv:1902.09323 · cs.LG · February 26, 2019 · 5 cites

TL;DR

The paper introduces Wasserstein-Wasserstein auto-encoders (WWAE), a deep generative model that minimizes penalized optimal transport using closed-form Wasserstein-2 distances for Gaussians, leading to improved sample quality and latent structure learning.

Contribution

WWAE is a novel deep generative model that combines penalized optimal transport with Gaussian approximations of the latent prior $P_{Z}$ and the aggregated posterior $Q_{Z}$, which gives efficient training and improved generative performance.

Findings

1. WWAE outperforms VAEs and GANs in sample quality and FID scores.

2. WWAE learns better latent representations than VAEs.

3. The model is computationally efficient due to the closed-form Wasserstein-2 distance.
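
For reference, the closed form behind the efficiency claim is the standard expression for the squared Wasserstein-2 distance between two Gaussians. The symbols $m_1, m_2$ (means) and $\Sigma_1, \Sigma_2$ (covariances) are notation introduced here for illustration and do not appear on this page:

$$
W_2^2\big(\mathcal{N}(m_1, \Sigma_1), \mathcal{N}(m_2, \Sigma_2)\big)
= \lVert m_1 - m_2 \rVert_2^2
+ \operatorname{tr}\!\Big(\Sigma_1 + \Sigma_2 - 2\big(\Sigma_2^{1/2}\, \Sigma_1\, \Sigma_2^{1/2}\big)^{1/2}\Big)
$$

When the two covariances commute (for example, when both are diagonal), the trace term reduces to $\lVert \Sigma_1^{1/2} - \Sigma_2^{1/2} \rVert_F^2$. Evaluating the expression needs only means, covariances, and matrix square roots.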

Abstract

To address the challenges in learning deep generative models (e.g., the blurriness of variational auto-encoders and the instability of training generative adversarial networks), we propose a novel deep generative model, named Wasserstein-Wasserstein auto-encoders (WWAE). We formulate WWAE as minimization of the penalized optimal transport between the target distribution and the generated distribution. By noticing that both the prior $P_{Z}$ and the aggregated posterior $Q_{Z}$ of the latent code $Z$ can be well captured by Gaussians, the proposed WWAE utilizes the closed form of the squared Wasserstein-2 distance for two Gaussians in the optimization process. As a result, WWAE does not suffer from the sampling burden and it is computationally efficient by leveraging the reparameterization trick. Numerical results evaluated on multiple benchmark datasets including MNIST, Fashion-MNIST and CelebA…
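
The closed-form distance the abstract refers to can be sketched in a few lines of NumPy. This is a minimal illustration of the standard Gaussian formula, not the authors' implementation: the function names `psd_sqrt`, `gaussian_w2_squared`, and `latent_penalty` are hypothetical, and fitting a Gaussian to a batch of latent codes and comparing it with a standard normal prior is an assumption about how such a penalty might be formed.

```python
import numpy as np


def psd_sqrt(mat):
    """Symmetric square root of a symmetric positive semi-definite matrix."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negative round-off
    return (eigvecs * np.sqrt(eigvals)) @ eigvecs.T


def gaussian_w2_squared(mean1, cov1, mean2, cov2):
    """Squared Wasserstein-2 distance between N(mean1, cov1) and N(mean2, cov2)."""
    mean1, mean2 = np.asarray(mean1, float), np.asarray(mean2, float)
    cov1, cov2 = np.asarray(cov1, float), np.asarray(cov2, float)
    root2 = psd_sqrt(cov2)
    cross = psd_sqrt(root2 @ cov1 @ root2)  # (cov2^{1/2} cov1 cov2^{1/2})^{1/2}
    mean_term = np.sum((mean1 - mean2) ** 2)
    cov_term = np.trace(cov1 + cov2 - 2.0 * cross)
    return float(mean_term + cov_term)


def latent_penalty(codes):
    """Hypothetical penalty (an assumption, not the paper's code):
    W2^2 between a Gaussian fitted to a batch of codes and N(0, I)."""
    codes = np.asarray(codes, float)
    dim = codes.shape[1]
    mean = codes.mean(axis=0)
    cov = np.cov(codes, rowvar=False).reshape(dim, dim)
    return gaussian_w2_squared(mean, cov, np.zeros(dim), np.eye(dim))
```

Calling `latent_penalty(codes)` on an array of shape `(batch, dim)` returns a single non-negative number that shrinks as the batch's mean and covariance approach those of a standard normal. Because the value depends only on first and second moments, no samples from the prior are needed to evaluate it.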

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Human Pose and Action Recognition