Yes, but Did It Work?: Evaluating Variational Inference
Yuling Yao, Aki Vehtari, Daniel Simpson, Andrew Gelman

TL;DR
This paper introduces two diagnostics, Pareto-smoothed importance sampling (PSIS) and variational simulation-based calibration (VSBC), for evaluating the quality of variational inference approximations, whose failures are otherwise hard to detect.
Contribution
It presents two diagnostic tools for checking whether a variational approximation is accurate enough to trust: PSIS for the joint approximation and VSBC for point estimates.
Findings
PSIS provides a goodness-of-fit measure for joint distributions.
VSBC evaluates the average performance of point estimates.
Both diagnostics help identify issues in variational approximations.
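As a rough illustration of the PSIS idea, the sketch below measures how heavy the right tail of the importance ratios p/q is, which is the quantity the diagnostic uses to judge fit. Everything in it is an assumption made for illustration: the target is a standard normal, the "variational" approximation is a normal with standard deviation sigma_q, and the tail shape is estimated with the simple Hill estimator rather than the generalized Pareto fit that PSIS itself uses. The tail-size rule min(S/5, 3*sqrt(S)) is likewise an assumed default, and the function names are hypothetical.

```python
import math
import random


def log_importance_ratios(sigma_q, n_draws, seed=0):
    """Draw from q = N(0, sigma_q^2) and return log p(x) - log q(x) for p = N(0, 1).

    Shared normalizing constants are dropped; they cancel in the tail estimate.
    """
    rng = random.Random(seed)
    log_r = []
    for _ in range(n_draws):
        x = rng.gauss(0.0, sigma_q)
        log_p = -0.5 * x * x
        log_q = -0.5 * (x / sigma_q) ** 2 - math.log(sigma_q)
        log_r.append(log_p - log_q)
    return log_r


def hill_tail_index(log_r, n_tail=None):
    """Hill estimate of the tail shape of the ratios (a stand-in for the PSIS k-hat).

    Assumption: the tail size defaults to min(S/5, 3*sqrt(S)) of the largest ratios.
    """
    s = len(log_r)
    if n_tail is None:
        n_tail = int(min(s / 5, 3 * math.sqrt(s)))
    ordered = sorted(log_r, reverse=True)
    threshold = ordered[n_tail]  # log of the (n_tail + 1)-th largest ratio
    return sum(lr - threshold for lr in ordered[:n_tail]) / n_tail


# An approximation that is too narrow gives heavy-tailed ratios (large estimate);
# one that is too wide gives bounded ratios (estimate near zero).
k_narrow = hill_tail_index(log_importance_ratios(0.5, 20000, seed=1))
k_wide = hill_tail_index(log_importance_ratios(1.5, 20000, seed=1))
print(f"narrow q: {k_narrow:.2f}, wide q: {k_wide:.2f}")
```

A larger estimate means a few draws dominate the importance weights, which signals that the approximation misses mass the target has. The Hill estimator only makes sense for heavy tails, so treat a near-zero value as "no heavy tail detected" rather than as a precise shape estimate.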
Abstract
While it's always possible to compute a variational approximation to a posterior distribution, it can be difficult to discover problems with this approximation. We propose two diagnostic algorithms to alleviate this problem. The Pareto-smoothed importance sampling (PSIS) diagnostic gives a goodness of fit measurement for joint distributions, while simultaneously improving the error in the estimate. The variational simulation-based calibration (VSBC) assesses the average performance of point estimates.
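The VSBC idea can be sketched on a toy problem. The code below assumes a simulation-based calibration recipe: draw a parameter from the prior, simulate data, fit the approximation, and record the probability the approximation assigns to values below the true parameter. If the approximation is unbiased on average, these probabilities should be distributed symmetrically around 0.5. The model (normal prior, normal likelihood with unit variance) is an illustrative assumption, and the "variational" fit is a hypothetical stand-in: the exact conjugate posterior with an optional mean shift and scale distortion, not the output of a real VI algorithm.

```python
import random
from statistics import NormalDist, mean


def vsbc_probabilities(n_reps, n_obs, mean_shift=0.0, sd_scale=1.0, seed=0):
    """Return q(theta < theta0 | y) for n_reps simulated datasets.

    Model (assumed for illustration): theta ~ N(0, 1), y_i ~ N(theta, 1).
    The approximation q is the exact posterior, distorted by mean_shift and sd_scale.
    """
    rng = random.Random(seed)
    probs = []
    for _ in range(n_reps):
        theta0 = rng.gauss(0.0, 1.0)                       # draw from the prior
        ys = [rng.gauss(theta0, 1.0) for _ in range(n_obs)]  # simulate data
        post_mean = sum(ys) / (n_obs + 1)                  # conjugate posterior mean
        post_sd = (1.0 / (n_obs + 1)) ** 0.5               # conjugate posterior sd
        q = NormalDist(post_mean + mean_shift, post_sd * sd_scale)
        probs.append(q.cdf(theta0))
    return probs


exact = vsbc_probabilities(2000, 10)
biased = vsbc_probabilities(2000, 10, mean_shift=0.3)
print(f"mean probability, exact posterior: {mean(exact):.2f}")
print(f"mean probability, shifted approximation: {mean(biased):.2f}")
```

With the exact posterior the probabilities are centered on 0.5, while the shifted approximation pushes them toward 0, exposing a systematic overestimate of the parameter. A fuller check would compare the whole histogram against its mirror image rather than only the mean.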
Taxonomy
Topics: Computational and Text Analysis Methods · Evaluation and Performance Assessment
