EBMs Trained with Maximum Likelihood are Generator Models Trained with a Self-adverserial Loss
Zhisheng Xiao, Qing Yan, Yali Amit

TL;DR
This paper reveals that training Energy-based models with maximum likelihood is effectively a self-adversarial process akin to GANs, challenging traditional views on the role of MCMC sampling in EBM training.
Contribution
The study introduces a deterministic perspective on EBM training, connecting it with generator models and GANs, and clarifies the impact of noise in the dynamics.
Findings
Reintroducing noise into the dynamics reduces the quality of the generator.
EBM training is a self-adversarial process, not pure maximum likelihood.
Deterministic solutions of the gradient flow provide new insights into EBM training.
Abstract
Maximum likelihood estimation is widely used in training Energy-based models (EBMs). Training requires samples from an unnormalized distribution, which is usually intractable, and in practice, these are obtained by MCMC algorithms such as Langevin dynamics. However, since MCMC in high-dimensional space converges extremely slowly, the current understanding of maximum likelihood training, which assumes approximate samples from the model can be drawn, is problematic. In this paper, we try to understand this training procedure by replacing Langevin dynamics with deterministic solutions of the associated gradient descent ODE. Doing so allows us to study the density induced by the dynamics (if the dynamics are invertible), and connect with GANs by treating the dynamics as generator models, the initial values as latent variables and the loss as optimizing a critic defined by the very same…
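The substitution described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the quadratic energy, the function names `grad_energy` and `run_dynamics`, and the step size, step count, and `noise_scale` parameter are all assumptions chosen for illustration. With `noise_scale=0` the update is a plain Euler discretization of the gradient descent ODE, so the final state is a deterministic function of the initial value, which plays the role of a latent variable. With `noise_scale=1` the update is the usual unadjusted Langevin step.

```python
import numpy as np

def grad_energy(x, theta):
    # Gradient of a toy energy E(x) = 0.5 * theta * ||x||^2.
    # This energy is a hypothetical stand-in for a learned network.
    return theta * x

def run_dynamics(z, theta, step=0.1, n_steps=50, noise_scale=0.0, rng=None):
    # Map initial values z to final states x_T.
    # noise_scale = 0.0: deterministic gradient descent (Euler steps of dx/dt = -grad E).
    # noise_scale = 1.0: unadjusted Langevin dynamics.
    if rng is None:
        rng = np.random.default_rng()
    x = np.array(z, dtype=float)
    for _ in range(n_steps):
        x = x - step * grad_energy(x, theta)
        if noise_scale > 0.0:
            x = x + noise_scale * np.sqrt(2.0 * step) * rng.standard_normal(x.shape)
    return x

rng = np.random.default_rng(0)
z = rng.standard_normal((4, 2))      # initial values, treated as latent variables

x_det = run_dynamics(z, theta=1.0)   # deterministic "generator" output G(z)
x_det_again = run_dynamics(z, theta=1.0)
x_noisy = run_dynamics(z, theta=1.0, noise_scale=1.0, rng=rng)
```

For this toy energy each deterministic step multiplies the state by `1 - step * theta`, so the map from `z` to `x_det` is linear and invertible, which is the condition under which the abstract says the induced density can be studied. A learned energy would not generally have a closed-form map like this.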
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks · Gaussian Processes and Bayesian Inference
Methods: energy-based model
