Importance Weighted Variational Inference without the Reparameterization Trick
Kamélia Daudel, Minh-Ngoc Tran, Cheng Zhang

TL;DR
This paper analyzes importance weighted variational inference without reparameterization, identifies issues with existing estimators, and introduces a new VIMCO-star estimator that improves gradient signal-to-noise ratio and empirical performance.
Contribution
It provides the first theoretical analysis of REINFORCE-based estimators in importance weighted VI and proposes a novel VIMCO-star estimator to address SNR collapse.
Findings
VIMCO estimators have a signal-to-noise ratio (SNR) that vanishes as the number of Monte Carlo samples N increases
VIMCO-star achieves sqrt(N) SNR scaling
Empirical results show VIMCO-star outperforms existing methods
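To make the SNR quantity in these findings concrete, here is a minimal sketch that measures the empirical SNR of a plain score-function (REINFORCE) gradient of the importance weighted bound. It is not the paper's VIMCO or VIMCO-star estimator, whose definitions are not given in this summary. The toy setup (target N(0, 1) with normalizing constant 1, variational family N(mu, 1)) and the function names `reinforce_grad_mu` and `empirical_snr` are assumptions made for illustration only.

```python
import numpy as np

def reinforce_grad_mu(n_samples, mu=1.0, n_reps=200_000, seed=0):
    """Score-function gradient estimates of the importance weighted bound w.r.t. mu.

    Toy assumptions: target p = N(0, 1), variational q = N(mu, 1).
    Returns one scalar gradient estimate per replicate.
    """
    rng = np.random.default_rng(seed)
    z = mu + rng.standard_normal((n_reps, n_samples))
    # log w_i = log p(z_i) - log q(z_i); the Gaussian constants cancel.
    log_w = -0.5 * z**2 + 0.5 * (z - mu) ** 2
    m = log_w.max(axis=1, keepdims=True)               # for numerical stability
    w = np.exp(log_w - m)
    w_norm = w / w.sum(axis=1, keepdims=True)           # normalized weights
    bound = m + np.log(w.mean(axis=1, keepdims=True))   # log (1/N) sum_i w_i
    score = z - mu                                      # d/dmu log q(z_i)
    # REINFORCE term (bound * score) plus the direct dependence of w_i on mu
    # (which contributes -w_norm * score).
    return ((bound - w_norm) * score).sum(axis=1)

def empirical_snr(grads):
    """|mean| / standard deviation of scalar gradient estimates."""
    return np.abs(grads.mean()) / grads.std()

for n in (1, 4, 16):
    print(n, empirical_snr(reinforce_grad_mu(n, n_reps=50_000)))
```

For N = 1 the bound reduces to the standard ELBO, whose exact gradient in this toy problem is -mu, which gives a simple sanity check on the estimator's mean.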
Abstract
Importance weighted variational inference (VI) approximates densities known up to a normalizing constant by optimizing bounds that tighten with the number of Monte Carlo samples N. Standard optimization relies on reparameterized gradient estimators, which are well-studied theoretically yet restrict both the choice of the data-generating process and the variational approximation. While REINFORCE gradient estimators do not suffer from such restrictions, they lack rigorous theoretical justification. In this paper, we provide the first comprehensive analysis of REINFORCE gradient estimators in importance weighted VI, leveraging this theoretical foundation to diagnose and resolve fundamental deficiencies in current state-of-the-art estimators. Specifically, we introduce and examine a generalized family of variational inference for Monte Carlo objectives (VIMCO) gradient estimators. We…
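The statement that the bounds tighten with the number of Monte Carlo samples can be illustrated with a small numerical sketch. The example below assumes a toy unnormalized target (a standard normal density scaled by exp(1.5), so the true log normalizing constant is 1.5) and a deliberately mismatched Gaussian approximation. All names and parameter values are hypothetical and are not taken from the paper.

```python
import numpy as np

LOG_Z = 1.5  # true log normalizing constant of the toy target (assumption)

def log_unnormalized_target(z):
    # Standard normal density scaled by exp(LOG_Z).
    return LOG_Z - 0.5 * z**2 - 0.5 * np.log(2 * np.pi)

def log_q(z, mu, sigma):
    # Log density of the Gaussian variational approximation N(mu, sigma^2).
    return -0.5 * ((z - mu) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)

def iw_bound(n_samples, mu=1.0, sigma=1.5, n_reps=20_000, seed=0):
    """Monte Carlo estimate of E[log (1/N) sum_i w_i], with w_i = p~(z_i) / q(z_i)."""
    rng = np.random.default_rng(seed)
    z = mu + sigma * rng.standard_normal((n_reps, n_samples))
    log_w = log_unnormalized_target(z) - log_q(z, mu, sigma)
    m = log_w.max(axis=1, keepdims=True)  # log-mean-exp for numerical stability
    log_mean_w = m[:, 0] + np.log(np.mean(np.exp(log_w - m), axis=1))
    return log_mean_w.mean()

bounds = {n: iw_bound(n) for n in (1, 4, 16, 64)}
for n, value in bounds.items():
    print(f"N={n:3d}  bound={value:.3f}  (log Z = {LOG_Z})")
```

With N = 1 the estimate is the usual ELBO. Larger N should move the estimate up toward log Z, up to Monte Carlo error.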
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Stochastic Gradient Optimization Techniques · Gaussian Processes and Bayesian Inference
