Fast yet Simple Natural-Gradient Descent for Variational Inference in Complex Models

Mohammad Emtiyaz Khan; Didrik Nielsen

arXiv:1807.04489 · stat.ML · August 3, 2018

TL;DR

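This paper argues for natural-gradient descent in variational inference for complex models such as deep neural networks. It derives fast yet simple updates from a duality associated with exponential-family distributions; these updates can improve convergence and yield accurate local approximations for individual model components.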

Contribution

The paper shows how to derive natural-gradient updates that are fast and simple to compute by using a duality associated with exponential-family distributions, and it summarizes recent results for Bayesian deep learning.

Findings

01 Natural-gradient methods can converge faster than standard gradient methods because they exploit the information geometry of the solutions.

02 Natural-gradient updates extract accurate local approximations for individual model components.

03 Empirical results summarized in the paper demonstrate improved Bayesian deep learning performance.

Abstract

Bayesian inference plays an important role in advancing machine learning, but faces computational challenges when applied to complex models such as deep neural networks. Variational inference circumvents these challenges by formulating Bayesian inference as an optimization problem and solving it using gradient-based optimization. In this paper, we argue in favor of natural-gradient approaches which, unlike their gradient-based counterparts, can improve convergence by exploiting the information geometry of the solutions. We show how to derive fast yet simple natural-gradient updates by using a duality associated with exponential-family distributions. An attractive feature of these methods is that, by using natural-gradients, they are able to extract accurate local approximations for individual model components. We summarize recent results for Bayesian deep learning showing the…
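The duality mentioned in the abstract can be illustrated with a standard exponential-family identity: the natural gradient with respect to the natural parameter λ equals the ordinary gradient with respect to the mean parameter μ. The following is a minimal sketch of the resulting update, λ ← (1 − ρ)λ + ρ ∇_μ E_q[log p(y, z)], not code from the paper or its repository. The toy model (a one-dimensional Gaussian prior with a Gaussian likelihood), the data values, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def to_natural(mean, var):
    """Map (mean, variance) of a 1-D Gaussian to natural parameters.

    With sufficient statistics (z, z^2) the natural parameters are
    (mean / var, -1 / (2 * var)).
    """
    return np.array([mean / var, -0.5 / var])


def to_mean_var(lam):
    """Inverse map: natural parameters back to (mean, variance)."""
    var = -0.5 / lam[1]
    return lam[0] * var, var


def grad_expected_log_joint(y, prior_prec, noise_prec):
    """Gradient of E_q[log p(y, z)] w.r.t. the mean parameters (E[z], E[z^2]).

    Assumed toy model: z ~ N(0, 1 / prior_prec), y_i ~ N(z, 1 / noise_prec).
    The expected log joint is linear in (E[z], E[z^2]), so the gradient
    does not depend on the current q.
    """
    return np.array([noise_prec * np.sum(y),
                     -0.5 * (prior_prec + noise_prec * len(y))])


def natural_gradient_step(lam, y, prior_prec, noise_prec, step):
    """One natural-gradient update of the natural parameters of q.

    The entropy term of the variational objective contributes -lam to the
    gradient w.r.t. the mean parameters, which gives the (1 - step) factor.
    """
    g = grad_expected_log_joint(y, prior_prec, noise_prec)
    return (1.0 - step) * lam + step * g


# Hypothetical data and hyperparameters, for illustration only.
y = np.array([0.8, 1.1, 1.4, 0.9])
prior_prec, noise_prec = 1.0, 4.0

# Start from q = N(0, 1) and take a single step with step size 1.
lam = to_natural(mean=0.0, var=1.0)
lam = natural_gradient_step(lam, y, prior_prec, noise_prec, step=1.0)
mean, var = to_mean_var(lam)

# Closed-form posterior of the conjugate toy model, for comparison.
post_prec = prior_prec + noise_prec * len(y)
post_mean = noise_prec * y.sum() / post_prec

print(mean, var)
print(post_mean, 1.0 / post_prec)
```

In this conjugate toy model a single step with step size 1 lands on the exact posterior, because the gradient with respect to μ is itself the natural parameter of the posterior. For non-conjugate models the gradient depends on the current q and generally has to be estimated, so smaller step sizes and several iterations would be needed.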

Code & Models

Repositories

ssggreg/active_learning (PyTorch)
