Wasserstein-Wasserstein Auto-Encoders
Shunkang Zhang, Yuan Gao, Yuling Jiao, Jin Liu, Yang Wang, Can Yang

TL;DR
The paper introduces Wasserstein-Wasserstein auto-encoders (WWAE), a deep generative model that minimizes penalized optimal transport using closed-form Wasserstein-2 distances for Gaussians, leading to improved sample quality and latent structure learning.
Contribution
WWAE is a novel deep generative model that combines optimal transport with Gaussian assumptions for efficient training and improved generative performance.
Findings
WWAE outperforms VAEs and GANs in sample quality and FID scores.
WWAE learns better latent representations than VAEs.
The model is computationally efficient due to the closed-form Wasserstein-2 distance.
Abstract
To address the challenges in learning deep generative models (e.g., the blurriness of variational auto-encoders and the instability of training generative adversarial networks), we propose a novel deep generative model, named Wasserstein-Wasserstein auto-encoders (WWAE). We formulate WWAE as minimization of the penalized optimal transport between the target distribution and the generated distribution. By noticing that both the prior and the aggregated posterior of the latent code Z can be well captured by Gaussians, the proposed WWAE utilizes the closed form of the squared Wasserstein-2 distance for two Gaussians in the optimization process. As a result, WWAE does not suffer from the sampling burden and it is computationally efficient by leveraging the reparameterization trick. Numerical results evaluated on multiple benchmark datasets including MNIST, fashion-MNIST and CelebA…
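The closed form mentioned in the abstract is the standard result for two Gaussians: W2^2(N(m1, S1), N(m2, S2)) = ||m1 - m2||^2 + tr(S1 + S2 - 2 (S1^{1/2} S2 S1^{1/2})^{1/2}). The NumPy sketch below evaluates that formula only; it is not the authors' implementation, and the function names `psd_sqrt` and `gaussian_w2_squared` are hypothetical names chosen for illustration.

```python
import numpy as np


def psd_sqrt(a):
    """Square root of a symmetric positive semi-definite matrix (hypothetical helper)."""
    eigvals, eigvecs = np.linalg.eigh(a)
    # Clip tiny negative eigenvalues that can appear from floating-point round-off.
    eigvals = np.clip(eigvals, 0.0, None)
    return (eigvecs * np.sqrt(eigvals)) @ eigvecs.T


def gaussian_w2_squared(mean1, cov1, mean2, cov2):
    """Squared Wasserstein-2 distance between N(mean1, cov1) and N(mean2, cov2)."""
    mean1 = np.asarray(mean1, dtype=float)
    mean2 = np.asarray(mean2, dtype=float)
    cov1 = np.asarray(cov1, dtype=float)
    cov2 = np.asarray(cov2, dtype=float)

    root1 = psd_sqrt(cov1)
    cross = psd_sqrt(root1 @ cov2 @ root1)  # (S1^{1/2} S2 S1^{1/2})^{1/2}

    mean_term = np.sum((mean1 - mean2) ** 2)
    cov_term = np.trace(cov1 + cov2 - 2.0 * cross)
    return float(mean_term + cov_term)


# Illustrative use (an assumption, not the paper's training loop): compare the
# sample moments of a batch of latent codes against a standard normal prior.
rng = np.random.default_rng(0)
codes = rng.normal(size=(512, 4))  # stand-in for encoder outputs
penalty = gaussian_w2_squared(
    codes.mean(axis=0), np.cov(codes, rowvar=False), np.zeros(4), np.eye(4)
)
```

Because the distance depends only on means and covariances, a penalty of this kind can be computed from batch moments without any sample-based estimate of the divergence between the prior and the aggregated posterior, which is the efficiency argument the abstract makes.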
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Human Pose and Action Recognition
