GANs as Gradient Flows that Converge
Yu-Jui Huang, Yuchong Zhang

TL;DR
This paper presents a novel perspective on GANs by modeling them as gradient flows in the space of probability distributions, revealing their convergence properties and divergence causes through a connection with distribution-dependent ODEs and Fokker-Planck equations.
Contribution
It establishes a theoretical framework linking GAN training to gradient flows of probability distributions, providing insights into convergence and divergence phenomena.
Findings
GANs can be viewed as gradient flows of probability distributions.
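GAN divergence is traced to the algorithm's implicit minimization of the mean squared error (MSE) between two sets of samples; this MSE fitting alone can cause GANs to diverge.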
A unique solution to the distribution-dependent ODE is constructed and analyzed.
Abstract
This paper approaches the unsupervised learning problem by gradient descent in the space of probability density functions. A main result shows that along the gradient flow induced by a distribution-dependent ordinary differential equation (ODE), the unknown data distribution emerges as the long-time limit. That is, one can uncover the data distribution by simulating the distribution-dependent ODE. Intriguingly, the simulation of the ODE is shown equivalent to the training of generative adversarial networks (GANs). This equivalence provides a new "cooperative" view of GANs and, more importantly, sheds new light on the divergence of GANs. In particular, it reveals that the GAN algorithm implicitly minimizes the mean squared error (MSE) between two sets of samples, and this MSE fitting alone can cause GANs to diverge. To construct a solution to the distribution-dependent ODE, we first show…
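To make "simulating a distribution-dependent ODE" concrete, here is a minimal one-dimensional sketch. It is not the paper's construction: the abstract above does not state the form of the ODE, so the sketch swaps in the well-known gradient flow of the KL divergence, whose particle velocity is the score of the target density minus the score of the particles' current density. The Gaussian target, the kernel density estimate of the current density, and every name and parameter (`kde_score`, `simulate_flow`, `h`, `dt`, and so on) are illustrative assumptions.

```python
import numpy as np


def kde_score(y, h):
    """Estimate d/dy log rho(y) at each particle using a Gaussian KDE of bandwidth h."""
    diff = y[:, None] - y[None, :]            # diff[i, j] = y_i - y_j
    w = np.exp(-0.5 * (diff / h) ** 2)        # unnormalized kernel weights
    grad = (w * (-diff / h**2)).sum(axis=1)   # proportional to rho'(y_i)
    return grad / w.sum(axis=1)               # rho'(y_i) / rho(y_i); constants cancel


def simulate_flow(n=200, steps=1000, dt=0.01, h=0.3, mu=2.0, sigma=1.0, seed=0):
    """Euler-discretize dY/dt = score_target(Y) - score_current(Y).

    ASSUMPTION: this is the KL gradient flow toward a Gaussian target N(mu, sigma^2),
    used only as a stand-in for a distribution-dependent ODE. The velocity of each
    particle depends on the distribution of all particles through kde_score.
    """
    rng = np.random.default_rng(seed)
    y = rng.normal(-3.0, 0.5, size=n)         # initial particles, far from the target
    for _ in range(steps):
        target_score = -(y - mu) / sigma**2   # d/dy log of the Gaussian target density
        y = y + dt * (target_score - kde_score(y, h))
    return y
```

Calling `simulate_flow()` returns the particle positions at the final time; comparing their sample mean and standard deviation with `mu` and `sigma` gives a rough check that the particle cloud has drifted toward the target. The kernel smoothing biases the spread slightly, so only approximate agreement should be expected.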
Taxonomy
Topics: Model Reduction and Neural Networks · Generative Adversarial Networks and Image Synthesis
