Diagnosing and Fixing Manifold Overfitting in Deep Generative Models
Gabriel Loaiza-Ganem, Brendan Leigh Ross, Jesse C. Cresswell, Anthony L. Caterini

TL;DR
This paper investigates the problem of manifold overfitting in deep generative models, proves its occurrence, and proposes a two-step approach to accurately learn data distributions on low-dimensional manifolds, improving density estimation.
Contribution
The paper formally defines manifold overfitting, proves that it occurs under maximum-likelihood training, and introduces a class of two-step methods that avoid it and enable density estimation on learned manifolds.
Findings
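Manifold overfitting occurs when likelihood-based deep generative models are trained on data lying on a low-dimensional manifold: the model learns the manifold itself but not the distribution on it.
Two-step procedures (dimensionality reduction followed by maximum-likelihood density estimation) recover the data-generating distribution in the nonparametric regime and so avoid manifold overfitting.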
The two-step procedures also enable density estimation on the manifolds learned by implicit models such as GANs.
Abstract
Likelihood-based, or explicit, deep generative models use neural networks to construct flexible high-dimensional densities. This formulation directly contradicts the manifold hypothesis, which states that observed data lies on a low-dimensional manifold embedded in high-dimensional ambient space. In this paper we investigate the pathologies of maximum-likelihood training in the presence of this dimensionality mismatch. We formally prove that degenerate optima are achieved wherein the manifold itself is learned but not the distribution on it, a phenomenon we call manifold overfitting. We propose a class of two-step procedures consisting of a dimensionality reduction step followed by maximum-likelihood density estimation, and prove that they recover the data-generating distribution in the nonparametric regime, thus avoiding manifold overfitting. We also show that these procedures enable…
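The two-step idea described in the abstract can be illustrated with a deliberately simple toy sketch. It is not the authors' implementation: PCA stands in for the dimensionality reduction step and a one-dimensional Gaussian stands in for the latent density model, and all names (`direction`, `component`, `latent_log_density`) are hypothetical. The data lies on a line in 2-D, so the ambient-space Gaussian maximum-likelihood fit has a singular covariance (the degenerate optimum), while the latent-space fit stays well defined.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: a 1-D latent variable embedded as a straight line in 2-D ambient space.
z_true = rng.normal(loc=1.0, scale=0.5, size=2000)
direction = np.array([3.0, 4.0]) / 5.0      # unit vector spanning the manifold
X = z_true[:, None] * direction[None, :]    # shape (2000, 2), rank 1

# Ambient-space maximum likelihood with a full 2-D Gaussian:
# the MLE covariance is singular, so the fitted "density" is degenerate.
cov_ambient = np.cov(X, rowvar=False, bias=True)
smallest_eig = np.linalg.eigvalsh(cov_ambient)[0]

# Step 1: dimensionality reduction.
# Assumption: PCA via SVD is used here as a stand-in for a learned encoder.
mean = X.mean(axis=0)
_, _, vt = np.linalg.svd(X - mean, full_matrices=False)
component = vt[0]                           # leading principal direction
Z = (X - mean) @ component                  # 1-D latent codes, shape (2000,)

# Step 2: maximum-likelihood density estimation in the latent space.
# Assumption: a 1-D Gaussian is used as a stand-in for a deep density model.
mu_hat = Z.mean()
sigma_hat = Z.std()

def latent_log_density(x):
    """Log-density of points x (shape (n, 2)) under the latent Gaussian."""
    z = (np.atleast_2d(x) - mean) @ component
    return -0.5 * np.log(2 * np.pi * sigma_hat**2) - 0.5 * ((z - mu_hat) / sigma_hat) ** 2

print("smallest ambient covariance eigenvalue:", smallest_eig)
print("latent Gaussian std:", sigma_hat)
```

Because the PCA projection is an isometry along the line in this toy setting, the latent density can be read directly as a density along the manifold; with a nonlinear decoder a change-of-variables term would be needed.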
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Music and Audio Processing · Human Pose and Action Recognition
