Variance reduction properties of the reparameterization trick
Ming Xu, Matias Quiroz, Robert Kohn, Scott A. Sisson

TL;DR
This paper analyzes, under idealized assumptions, why the reparameterization trick yields lower-variance gradient estimates than the score function method in variational inference, and applies the analysis to real-world examples.
Contribution
It provides a theoretical explanation for the variance reduction of the reparameterization trick under specific assumptions and validates the findings with practical examples.
Findings
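Under the idealized assumptions, the marginal variances of the reparameterization gradient estimator are smaller than those of the score function gradient estimator.
The analysis assumes a mean-field Gaussian variational approximation and a log joint density that is a quadratic function depending on the variational mean.
The result of the idealized analysis is applied to real-world examples.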
Abstract
The reparameterization trick is widely used in variational inference as it yields more accurate estimates of the gradient of the variational objective than alternative approaches such as the score function method. Although there is overwhelming empirical evidence in the literature showing its success, there is relatively little research exploring why the reparameterization trick is so effective. We explore this under the idealized assumptions that the variational approximation is a mean-field Gaussian density and that the log of the joint density of the model parameters and the data is a quadratic function that depends on the variational mean. From this, we show that the marginal variances of the reparameterization gradient estimator are smaller than those of the score function gradient estimator. We apply the result of our idealized analysis to real-world examples.
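The comparison described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the paper's setup or code: it is a one-dimensional toy case that assumes a Gaussian variational density q(theta) = N(mu, sigma^2) and a hypothetical quadratic log joint h(theta) = -0.5 * a * (theta - b)^2, and it estimates only the gradient of E_q[h(theta)] with respect to mu. The names `gradient_estimates`, `a`, and `b` are illustrative assumptions.

```python
import random
import statistics


def gradient_estimates(mu, sigma, a, b, n_samples, seed=0):
    """Draw single-sample gradient estimates of d/dmu E_q[h(theta)].

    Toy assumptions (not from the paper): q = N(mu, sigma^2) and
    h(theta) = -0.5 * a * (theta - b)**2.
    """
    rng = random.Random(seed)
    score, reparam = [], []
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)
        theta = mu + sigma * eps  # reparameterized draw from q
        h = -0.5 * a * (theta - b) ** 2
        # Score function estimator: h(theta) * d/dmu log q(theta)
        score.append(h * (theta - mu) / sigma ** 2)
        # Reparameterization estimator: h'(theta) * d(theta)/d(mu), with d(theta)/d(mu) = 1
        reparam.append(-a * (theta - b))
    return score, reparam


score, reparam = gradient_estimates(mu=1.0, sigma=1.0, a=1.0, b=0.0, n_samples=20000)
print("exact gradient        :", -1.0 * (1.0 - 0.0))  # -a * (mu - b)
print("score function mean   :", statistics.fmean(score))
print("reparameterization mean:", statistics.fmean(reparam))
print("score function var    :", statistics.variance(score))
print("reparameterization var :", statistics.variance(reparam))
```

Both estimators target the same gradient, -a * (mu - b), since they are unbiased for it in this toy case. Their sample variances differ, with the reparameterization estimator giving the smaller value here, which is the kind of gap the paper's analysis sets out to explain.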
Taxonomy
Topics: Molecular Biology Techniques and Applications · Domain Adaptation and Few-Shot Learning · Statistical Methods and Inference
