Training Energy-Based Models with Diffusion Contrastive Divergences
Weijian Luo, Hao Jiang, Tianyang Hu, Jiacheng Sun, Zhenguo Li, Zhihua Zhang

TL;DR
This paper introduces Diffusion Contrastive Divergence (DCD), a family of training objectives for Energy-Based Models (EBMs). DCD replaces the Langevin dynamics used in Contrastive Divergence (CD) with diffusion processes that do not depend on the EBM parameters, which the authors report improves both efficiency and performance over CD.
Contribution
The paper proposes DCD, a family of divergences for training EBMs that contains CD as a special instance. DCD is more computationally efficient than CD and avoids the hard-to-handle parameter gradient term that short-run MCMC introduces.
Findings
DCD outperforms CD on synthetic data and image denoising tasks.
DCD enables effective training of EBMs for image generation on the CelebA dataset.
DCD achieves comparable results to existing EBMs in image generation.
Abstract
Energy-Based Models (EBMs) have been widely used for generative modeling. Contrastive Divergence (CD), a prevailing training objective for EBMs, requires sampling from the EBM with Markov Chain Monte Carlo methods (MCMCs), which leads to an irreconcilable trade-off between the computational burden and the validity of the CD. Running MCMCs until convergence is computationally intensive. On the other hand, short-run MCMC brings in an extra non-negligible parameter gradient term that is difficult to handle. In this paper, we provide a general interpretation of CD, viewing it as a special instance of our proposed Diffusion Contrastive Divergence (DCD) family. By replacing the Langevin dynamics used in CD with other EBM-parameter-free diffusion processes, we propose a more efficient divergence. We show that the proposed DCDs are both more computationally efficient than the CD and are not…
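To make the trade-off described in the abstract concrete, here is a minimal sketch of the standard CD baseline with short-run Langevin sampling. It is not DCD and not the authors' code. The toy energy E_theta(x) = 0.5 * theta * x^2 and all names (`short_run_langevin`, `cd_gradient`, `theta_hat`, the step sizes) are assumptions chosen for illustration. The negative samples are treated as constants with respect to theta, which is precisely where the extra parameter gradient term mentioned above is dropped.

```python
import numpy as np

def energy_grad_x(x, theta):
    # Hypothetical toy energy E_theta(x) = 0.5 * theta * x**2, so dE/dx = theta * x.
    return theta * x

def energy_grad_theta(x, theta):
    # dE/dtheta for the same toy energy.
    return 0.5 * x ** 2

def short_run_langevin(x0, theta, n_steps, step_size, rng):
    # A few unadjusted Langevin steps started from the data, as in CD.
    x = x0.copy()
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - step_size * energy_grad_x(x, theta) + np.sqrt(2.0 * step_size) * noise
    return x

def cd_gradient(data, theta, n_steps, step_size, rng):
    # CD-style estimate: E_data[dE/dtheta] - E_samples[dE/dtheta].
    # The dependence of the samples on theta is ignored here.
    samples = short_run_langevin(data, theta, n_steps, step_size, rng)
    return energy_grad_theta(data, theta).mean() - energy_grad_theta(samples, theta).mean()

rng = np.random.default_rng(0)
data = rng.normal(0.0, 0.5, size=5000)  # data variance 0.25, i.e. precision 4

theta_hat = 1.0
for _ in range(300):
    grad = cd_gradient(data, theta_hat, n_steps=10, step_size=0.01, rng=rng)
    theta_hat -= 2.0 * grad  # plain gradient descent on the CD objective

print("estimated precision:", theta_hat)
```

In this toy setting the estimate should settle close to the data precision of 4, up to discretization and Monte Carlo error. Note that every training step differentiates the energy inside the sampler, so the sampling cost grows with the number of Langevin steps; the abstract's proposal is to swap this sampler for a diffusion process that does not depend on the EBM parameters.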
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Neuroimaging Techniques and Applications · Model Reduction and Neural Networks
Methods: energy-based model · Diffusion
