Bayes without Underfitting: Fully Correlated Deep Learning Posteriors via Alternating Projections

Marco Miani, Hrittik Roy, Søren Hauberg

arXiv:2410.16901·cs.LG·October 23, 2024

Open Access

TL;DR

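This paper introduces a Bayesian deep learning method that avoids underfitting by building the approximate posterior in the null space of the generalized Gauss-Newton matrix. For linearized models, parameters in this null space preserve the training predictions of the point estimate, so uncertainty quantification no longer comes at the cost of predictive accuracy.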

Contribution

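The authors propose building Bayesian approximations in the null space of the generalized Gauss-Newton matrix, which guarantees that the Bayesian predictive does not underfit. A matrix-free projection algorithm, scaling linearly with the number of parameters and quadratically with the number of output dimensions, makes the approach applicable to large models.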

Findings

01 Scales to models with 28 million parameters.

02 Maintains predictive accuracy while providing uncertainty estimates.

03 Effective for large vision transformer models.

Abstract

Bayesian deep learning all too often underfits so that the Bayesian prediction is less accurate than a simple point estimate. Uncertainty quantification then comes at the cost of accuracy. For linearized models, the null space of the generalized Gauss-Newton matrix corresponds to parameters that preserve the training predictions of the point estimate. We propose to build Bayesian approximations in this null space, thereby guaranteeing that the Bayesian predictive does not underfit. We suggest a matrix-free algorithm for projecting onto this null space, which scales linearly with the number of parameters and quadratically with the number of output dimensions. We further propose an approximation that only scales linearly with parameters to make the method applicable to generative models. An extensive empirical evaluation shows that the approach scales to large models, including vision…
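
The null space idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: it uses small dense random matrices as stand-ins for per-batch Jacobians of a linearized model (the paper's algorithm is matrix-free), and it assumes the loss Hessian is positive definite, so that the null space of the generalized Gauss-Newton matrix coincides with the null space of the Jacobian. The names `project_null` and `alternating_projections` are hypothetical. The sketch cycles through per-batch null space projections and compares the result with the exact projection onto the null space of the stacked Jacobian.

```python
import numpy as np

def project_null(J, v):
    """Orthogonally project v onto the null space of J (assumes J has full row rank)."""
    # Solve (J J^T) a = J v, then subtract the row-space component J^T a.
    a = np.linalg.solve(J @ J.T, J @ v)
    return v - J.T @ a

def alternating_projections(batches, v, sweeps=500):
    """Cycle through per-batch null space projections.

    For subspaces, cyclic projections converge to the projection onto
    their intersection, here the null space of the stacked Jacobian.
    """
    for _ in range(sweeps):
        for J in batches:
            v = project_null(J, v)
    return v

rng = np.random.default_rng(0)
n_params, n_batches, rows_per_batch = 50, 4, 5

# Hypothetical stand-ins for per-batch Jacobians (rows = outputs, cols = parameters).
batches = [rng.standard_normal((rows_per_batch, n_params)) for _ in range(n_batches)]
J_full = np.vstack(batches)

v = rng.standard_normal(n_params)            # e.g. a sample from an isotropic prior
v_alt = alternating_projections(batches, v)  # batch-wise projections only
v_exact = project_null(J_full, v)            # direct projection, for comparison

# A perturbation in the null space leaves linearized training predictions unchanged:
# f(theta + v_alt) is approximately f(theta) + J_full @ v_alt = f(theta).
print("prediction change:", np.linalg.norm(J_full @ v_alt))
print("gap to exact projection:", np.linalg.norm(v_alt - v_exact))
```

Each batch-wise step only needs a linear system whose size is the number of outputs in that batch, which is why working batch by batch is attractive when the full Jacobian is too large to store.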

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Neural Networks and Applications · Generative Adversarial Networks and Image Synthesis