Annealing in variational inference mitigates mode collapse: A theoretical study on Gaussian mixtures
Luigi Fogliani, Bruno Loureiro, Marylou Gabrié

TL;DR
This paper provides a mathematical analysis of annealing strategies in variational inference for Gaussian mixtures, showing how proper annealing can prevent mode collapse and offering insights applicable to neural network models.
Contribution
It offers a theoretical characterization of annealing effects on mode collapse in Gaussian mixtures and extends insights to neural network models for practical variational inference.
Findings
Proper annealing prevents mode collapse in Gaussian mixtures.
A sharp formula for mode collapse probability is derived.
Numerical evidence supports extension to neural network models.
Abstract
Mode collapse, the failure to capture one or more modes when targeting a multimodal distribution, is a central challenge in modern variational inference. In this work, we provide a mathematical analysis of annealing-based strategies for mitigating mode collapse in a tractable setting: learning a Gaussian mixture, where mode collapse is known to arise. Leveraging a low-dimensional summary statistics description, we precisely characterize the interplay between the initial temperature and the annealing rate, and derive a sharp formula for the probability of mode collapse. Our analysis shows that an appropriately chosen annealing scheme can robustly prevent mode collapse. Finally, we present numerical evidence that these theoretical tradeoffs qualitatively extend to neural-network-based models, RealNVP normalizing flows, providing guidance for designing annealing strategies mitigating mode…
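To make the two quantities named in the abstract concrete (initial temperature and annealing rate), here is a minimal sketch of a tempered Gaussian-mixture target. It is an illustration under stated assumptions, not the paper's method: the linear schedule, the 1D two-mode mixture, and the names `inverse_temperature`, `beta0`, and `rate` are all hypothetical choices made for this example.

```python
import numpy as np

def log_mixture_density(x, means, weights, sigma=1.0):
    """Log-density of a 1D Gaussian mixture with a shared standard deviation."""
    x = np.asarray(x, dtype=float)[..., None]
    log_comp = (
        np.log(weights)
        - 0.5 * np.log(2.0 * np.pi * sigma**2)
        - 0.5 * ((x - means) / sigma) ** 2
    )
    # Log-sum-exp over components for numerical stability.
    m = log_comp.max(axis=-1, keepdims=True)
    return (m + np.log(np.exp(log_comp - m).sum(axis=-1, keepdims=True)))[..., 0]

def inverse_temperature(step, beta0=0.1, rate=0.01):
    """Hypothetical linear schedule (an assumption): starts at beta0, capped at 1."""
    return min(1.0, beta0 + rate * step)

def tempered_log_target(x, step, means, weights):
    """Unnormalized annealed log-target: beta(step) * log p(x)."""
    return inverse_temperature(step) * log_mixture_density(x, means, weights)

# Example mixture (assumed for illustration): two equal-weight modes at -3 and +3.
means = np.array([-3.0, 3.0])
weights = np.array([0.5, 0.5])
```

In this sketch a small `beta0` flattens the barrier between the two modes early in training, and `rate` controls how quickly the true target is recovered; the abstract's tradeoff concerns how these two choices interact.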
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Neural Networks and Reservoir Computing · Gaussian Processes and Bayesian Inference
