Differentiable Annealed Importance Sampling and the Perils of Gradient Noise
Guodong Zhang, Kyle Hsu, Jianing Li, Chelsea Finn, Roger Grosse

TL;DR
This paper introduces Differentiable AIS (DAIS), a variant of annealed importance sampling that is fully differentiable and compatible with mini-batch gradients, enabling gradient-based optimization of marginal likelihoods.
Contribution
The paper proposes DAIS, which removes Metropolis-Hastings corrections for differentiability, provides convergence analysis for Bayesian linear regression, and reveals limitations of stochastic DAIS with mini-batch gradients.
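The idea of dropping the Metropolis-Hastings correction can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that do not come from the summary above: a hypothetical one-dimensional Gaussian model with a single observation, a linear annealing schedule, one leapfrog step per temperature, a full momentum refresh at every step, and a hand-written gradient in place of automatic differentiation. All function names are illustrative. Because the leapfrog map is deterministic and volume preserving, the importance weight needs only the momentum densities before and after each step, and no accept/reject decision appears anywhere in the chain.

```python
import math
import random

X_OBS = 1.0  # hypothetical single observation (assumption for this toy model)
LOG_2PI = math.log(2.0 * math.pi)


def log_prior(z):
    # Initial distribution q0(z) = N(z; 0, 1)
    return -0.5 * z * z - 0.5 * LOG_2PI


def log_lik(z):
    # Likelihood p(x | z) = N(x; z, 1)
    return -0.5 * (X_OBS - z) ** 2 - 0.5 * LOG_2PI


def grad_log_annealed(z, beta):
    # d/dz [ log q0(z) + beta * log p(x | z) ], written by hand for this model
    return -z + beta * (X_OBS - z)


def log_std_normal(v):
    return -0.5 * v * v - 0.5 * LOG_2PI


def dais_log_weight(rng, num_steps=20, step_size=0.1):
    """Log importance weight from one chain with no accept/reject step."""
    z = rng.gauss(0.0, 1.0)          # z_0 ~ q0
    log_w = -log_prior(z)
    for k in range(1, num_steps + 1):
        beta = k / num_steps         # linear annealing schedule (assumption)
        v = rng.gauss(0.0, 1.0)      # full momentum refresh (assumption)
        log_w -= log_std_normal(v)
        # One leapfrog step targeting the annealed density at temperature beta
        v += 0.5 * step_size * grad_log_annealed(z, beta)
        z += step_size * v
        v += 0.5 * step_size * grad_log_annealed(z, beta)
        log_w += log_std_normal(v)   # momentum density after the step
    # Unnormalized target log p(x, z_K) at the final state
    log_w += log_prior(z) + log_lik(z)
    return log_w


def log_mean_exp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


if __name__ == "__main__":
    rng = random.Random(0)
    log_ws = [dais_log_weight(rng) for _ in range(2000)]
    # For this toy model the exact marginal likelihood is N(x; 0, 2)
    exact = -0.25 * X_OBS ** 2 - 0.5 * math.log(4.0 * math.pi)
    print("estimate:", log_mean_exp(log_ws), "exact:", exact)
```

Every operation in the chain is a smooth function of the step size, the schedule, and the model parameters, so in an automatic differentiation framework the log weight could be differentiated end to end. That is the property that an accept/reject step would break.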
Findings
For Bayesian linear regression, DAIS is consistent in the full-batch setting, with a sublinear convergence rate.
Stochastic DAIS can perform arbitrarily poorly due to gradient noise.
The effect of gradient noise in stochastic DAIS cannot be removed by taking smaller steps, in contrast with settings such as gradient-based optimization and Langevin dynamics.
Abstract
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation, but are not fully differentiable due to the use of Metropolis-Hastings correction steps. Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective using gradient-based methods. To this end, we propose Differentiable AIS (DAIS), a variant of AIS which ensures differentiability by abandoning the Metropolis-Hastings corrections. As a further advantage, DAIS allows for mini-batch gradients. We provide a detailed convergence analysis for Bayesian linear regression which goes beyond previous analyses by explicitly accounting for the sampler not having reached equilibrium. Using this analysis, we prove that DAIS is consistent in the full-batch setting and provide a sublinear convergence rate. Furthermore,…
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Bayesian Methods and Mixture Models · Gaussian Processes and Bayesian Inference
Methods: Linear Regression
