L2M: Practical posterior Laplace approximation with optimization-driven second moment estimation

Christian S. Perone; Roberto Pereira Silveira; Thomas Paula

arXiv:2107.04695 · cs.LG · July 14, 2021 · 1 citation

Open Access

TL;DR

L2M introduces a practical method for posterior Laplace approximation in neural networks by leveraging the gradient second moment, estimated during standard optimization, to enable uncertainty quantification without additional computational cost.

Contribution

The paper proposes a simple, efficient approach to posterior Laplace approximation using gradient second moments from common optimizers, eliminating the need for curvature matrix computation.
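As a rough sketch of why a gradient second moment can stand in for curvature, the standard Laplace approximation and the Fisher information identity are written out below. This is textbook background rather than the paper's own derivation; the symbols (\theta^{*} for the trained weights, \mathcal{D} for the data, H for the curvature matrix) are notation chosen here, and the "regularity conditions" in the abstract are assumed to be those under which the identity holds.

```latex
% Laplace approximation: a Gaussian centered at the trained weights,
% with covariance given by the inverse curvature of the log posterior.
p(\theta \mid \mathcal{D}) \approx \mathcal{N}\!\left(\theta^{*},\, H^{-1}\right),
\qquad
H = -\nabla_{\theta}^{2} \log p(\theta \mid \mathcal{D}) \Big|_{\theta = \theta^{*}}

% Fisher information identity: the expected outer product of the
% log-likelihood gradient equals the expected negative Hessian.
\mathbb{E}_{y \sim p(y \mid x, \theta)}\!\left[
  \nabla_{\theta} \log p(y \mid x, \theta)\,
  \nabla_{\theta} \log p(y \mid x, \theta)^{\top}
\right]
= -\,\mathbb{E}_{y \sim p(y \mid x, \theta)}\!\left[
  \nabla_{\theta}^{2} \log p(y \mid x, \theta)
\right]
```

Optimizers such as Adam and RMSprop track an exponential moving average of elementwise squared gradients, so the quantity they provide corresponds at best to the diagonal of the left-hand side. With mini-batches larger than one, the square of the averaged gradient also differs from the average of per-example squared gradients.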

Findings

1. Method is easy to implement with minimal code changes.

2. No extra computational steps or hyperparameters are needed.

3. Provides reasonable uncertainty estimates in neural networks.

Abstract

Uncertainty quantification for deep neural networks has recently evolved through many techniques. In this work, we revisit Laplace approximation, a classical approach for posterior approximation that is computationally attractive. However, instead of computing the curvature matrix, we show that, under some regularity conditions, the Laplace approximation can be easily constructed using the gradient second moment. This quantity is already estimated by many exponential moving average variants of Adagrad such as Adam and RMSprop, but is traditionally discarded after training. We show that our method (L2M) does not require changes in models or optimization, can be implemented in a few lines of code to yield reasonable results, and it does not require any extra computational steps besides what is already being computed by optimizers, without introducing any new hyperparameter. We hope our…
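The idea in the abstract can be illustrated with a minimal, standard-library-only sketch. It trains a toy logistic regression with a hand-written Adam loop, keeps the second-moment estimate instead of discarding it, and uses it as the diagonal precision of a Gaussian posterior around the trained weights. Everything here is an assumption made for illustration: the toy data, the function names (`adam_keep_second_moment`, `diagonal_laplace`, `predictive`), the batch size of one, and in particular the scaling `n_data * v_hat + prior_precision`, which is a common diagonal-Fisher form and not necessarily the exact expression used in the paper.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def nll_grad(theta, example):
    # Gradient of the negative log-likelihood of one (x, y) pair
    # under logistic regression: (sigmoid(theta . x) - y) * x.
    x, y = example
    p = sigmoid(sum(t * xi for t, xi in zip(theta, x)))
    return [(p - y) * xi for xi in x]


def adam_keep_second_moment(data, dim, steps=2000, lr=0.01,
                            beta1=0.9, beta2=0.999, eps=1e-8, seed=0):
    # Plain Adam with batch size 1. Returns the trained weights AND the
    # bias-corrected second moment that is normally thrown away.
    rng = random.Random(seed)
    theta, m, v = [0.0] * dim, [0.0] * dim, [0.0] * dim
    for t in range(1, steps + 1):
        g = nll_grad(theta, rng.choice(data))
        for i in range(dim):
            m[i] = beta1 * m[i] + (1 - beta1) * g[i]
            v[i] = beta2 * v[i] + (1 - beta2) * g[i] ** 2
            m_hat = m[i] / (1 - beta1 ** t)
            v_hat = v[i] / (1 - beta2 ** t)
            theta[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, [vi / (1 - beta2 ** steps) for vi in v]


def diagonal_laplace(v_hat, n_data, prior_precision=1.0):
    # ASSUMPTION (not taken from the paper): treat v_hat as a per-example
    # diagonal Fisher estimate, scale it by the dataset size, and add a
    # hypothetical Gaussian prior precision. Returns per-weight variances.
    return [1.0 / (n_data * vi + prior_precision) for vi in v_hat]


def predictive(x, mean, variance, n_samples=200, seed=1):
    # Monte Carlo average of the prediction over sampled weight vectors.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        w = [rng.gauss(mu, math.sqrt(s2)) for mu, s2 in zip(mean, variance)]
        total += sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    return total / n_samples


# Toy data: a bias feature plus one input u; the label is a noisy sign of u.
data_rng = random.Random(42)
data = []
for _ in range(200):
    u = data_rng.uniform(-2.0, 2.0)
    y = 1.0 if u + data_rng.gauss(0.0, 0.5) > 0 else 0.0
    data.append(((1.0, u), y))

mean, v_hat = adam_keep_second_moment(data, dim=2)
variance = diagonal_laplace(v_hat, n_data=len(data))
p_pos = predictive((1.0, 1.5), mean, variance)
p_neg = predictive((1.0, -1.5), mean, variance)
print("predictive at u=+1.5:", p_pos, "at u=-1.5:", p_neg)
```

In a framework-based setup the same quantity would normally be read from the optimizer's state after training rather than recomputed; the loop above is written by hand only to keep the sketch dependency-free.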

Taxonomy

Topics: Model Reduction and Neural Networks · Gaussian Processes and Bayesian Inference · Sparse and Compressive Sensing Techniques

Methods: Adam · AdaGrad