Clarifying MCMC-based training of modern EBMs: Contrastive Divergence versus Maximum Likelihood

Léo Gagnon; Guillaume Lajoie

arXiv:2202.12176·cs.LG·February 25, 2022

TL;DR

This paper clarifies the theoretical foundations of MCMC-based training for Energy-Based Models, contrasting Contrastive Divergence with Maximum Likelihood, and offers new interpretations and experimental insights.

Contribution

It provides a first-principles explanation of MCMC training, critiques existing methods, and introduces a new interpretation of popular algorithms in EBMs.

Findings

01  Existing algorithms are not true Contrastive Divergence.
02  New interpretation clarifies theoretical misunderstandings.
03  Experimental results support the proposed reinterpretation.

Abstract

The Energy-Based Model (EBM) framework is a very general approach to generative modeling that tries to learn and exploit probability distributions only defined through unnormalized scores. It has risen in popularity recently thanks to the impressive results obtained in image generation by parameterizing the distribution with Convolutional Neural Networks (CNN). However, the motivation and theoretical foundations behind modern EBMs are often absent from recent papers and this sometimes results in some confusion. In particular, the theoretical justifications behind the popular MCMC-based learning algorithm Contrastive Divergence (CD) are often glossed over and we find that this leads to theoretical errors in recent influential papers (Du & Mordatch, 2019; Du et al., 2020). After offering a first-principles introduction of MCMC-based training, we argue that the learning algorithm they use…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Energy Load and Power Forecasting · Gaussian Processes and Bayesian Inference