A Stein variational Newton method

Gianluca Detommaso, Tiangang Cui, Alessio Spantini, Youssef Marzouk, and Robert Scheichl

arXiv:1806.03085·stat.ML·October 31, 2018·42 cites

TL;DR

This paper extends the Stein variational gradient descent (SVGD) algorithm with second-order information, which yields a Newton-like iteration in function space and more effective kernel choices. The authors report significant computational gains over the original SVGD algorithm in multiple test cases.

Contribution

It introduces a Newton-like method for SVGD that accelerates convergence and improves kernel choices by leveraging second-order information.

Findings

01. Significant computational gains over the original SVGD algorithm.

02. More effective kernel choices, guided by second-order information.

03. Accelerated convergence, demonstrated in multiple test cases.

Abstract

Stein variational gradient descent (SVGD) was recently proposed as a general purpose nonparametric variational inference algorithm [Liu & Wang, NIPS 2016]: it minimizes the Kullback-Leibler divergence between the target distribution and its approximation by implementing a form of functional gradient descent on a reproducing kernel Hilbert space. In this paper, we accelerate and generalize the SVGD algorithm by including second-order information, thereby approximating a Newton-like iteration in function space. We also show how second-order information can lead to more effective choices of kernel. We observe significant computational gains over the original SVGD algorithm in multiple test cases.
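As a point of reference for the first-order method that the abstract builds on, the following is a minimal sketch of one SVGD update as described by Liu & Wang, written with NumPy. It is not the Newton-like iteration proposed in this paper. The function names `rbf_kernel` and `svgd_step`, the fixed kernel bandwidth `h`, the step size, and the standard-normal target are all assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(x, h):
    """Gaussian kernel on particles x of shape (n, d).

    Returns K with K[j, i] = k(x_j, x_i) and grad_K with
    grad_K[j, i] = gradient of k(x_j, x_i) with respect to x_j.
    """
    diff = x[:, None, :] - x[None, :, :]        # diff[j, i] = x_j - x_i
    sq_dist = np.sum(diff ** 2, axis=-1)
    K = np.exp(-sq_dist / (2.0 * h ** 2))
    grad_K = -diff / h ** 2 * K[:, :, None]
    return K, grad_K

def svgd_step(x, grad_log_p, step=0.1, h=1.0):
    """One first-order SVGD update (hypothetical helper, fixed bandwidth h)."""
    n = x.shape[0]
    K, grad_K = rbf_kernel(x, h)
    score = grad_log_p(x)                       # shape (n, d)
    # phi[i] = (1/n) * sum_j [ k(x_j, x_i) * score[j] + grad_{x_j} k(x_j, x_i) ]
    phi = (K.T @ score + grad_K.sum(axis=0)) / n
    return x + step * phi

# Illustrative target (an assumption): standard normal, so grad log p(x) = -x.
rng = np.random.default_rng(0)
particles = rng.normal(loc=3.0, scale=0.5, size=(50, 1))
for _ in range(500):
    particles = svgd_step(particles, lambda x: -x)
```

The first term in `phi` moves particles toward regions of high target density, while the kernel-gradient term pushes them apart so they do not collapse onto a single mode. A fixed bandwidth keeps the sketch short; the original SVGD paper selects it adaptively from the particle distances.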

Taxonomy

Topics: Markov Chains and Monte Carlo Methods · Gaussian Processes and Bayesian Inference · Model Reduction and Neural Networks