Accelerated Linearized Laplace Approximation for Bayesian Deep Learning
Zhijie Deng, Feng Zhou, Jun Zhu

TL;DR
This paper accelerates the linearized Laplace approximation (LLA) for Bayesian deep learning with a Nyström approximation to neural tangent kernels (NTKs), improving scalability while retaining theoretical guarantees.
Contribution
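The paper proposes a Nyström approximation to NTKs that speeds up LLA, so that pretrained networks can be turned into Bayesian neural networks at scale without relying on Kronecker-factored, diagonal, or last-layer GGN approximations that are likely to harm fidelity.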
Findings
The method scales to large architectures such as vision transformers.
It improves the computational efficiency of LLA.
It maintains high fidelity in Bayesian inference.
Abstract
Laplace approximation (LA) and its linearized variant (LLA) enable effortless adaptation of pretrained deep neural networks to Bayesian neural networks. The generalized Gauss-Newton (GGN) approximation is typically introduced to improve their tractability. However, LA and LLA are still confronted with non-trivial inefficiency issues and must rely on Kronecker-factored, diagonal, or even last-layer approximate GGN matrices in practical use. These approximations are likely to harm the fidelity of learning outcomes. To tackle this issue, inspired by the connections between LLA and neural tangent kernels (NTKs), we develop a Nyström approximation to NTKs to accelerate LLA. Our method benefits from the capability of popular deep learning libraries for forward-mode automatic differentiation, and enjoys reassuring theoretical guarantees. Extensive studies reflect the merits of the proposed…
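To make the core idea in the abstract concrete, here is a minimal sketch of a Nyström approximation to an empirical NTK. It is an illustration under stated assumptions, not the authors' implementation: the toy model f(x) = w2 · tanh(W1 x), the helper names (`jacobian`, `ntk`, `nystrom`, `relative_error`), and the choice of the first m inputs as landmarks are all hypothetical. The Jacobian is written out by hand with NumPy for this tiny model, whereas the abstract describes using forward-mode automatic differentiation.

```python
import numpy as np

def jacobian(X, W1, w2):
    """Per-example gradient of f(x) = w2 . tanh(W1 x) w.r.t. (W1, w2), flattened."""
    A = np.tanh(X @ W1.T)                                   # (n, h) hidden activations
    dW1 = (w2 * (1.0 - A**2))[:, :, None] * X[:, None, :]   # (n, h, d) grads w.r.t. W1
    return np.concatenate([dW1.reshape(len(X), -1), A], axis=1)  # (n, h*d + h)

def ntk(Xa, Xb, W1, w2):
    """Empirical NTK: inner products of per-example parameter gradients."""
    return jacobian(Xa, W1, w2) @ jacobian(Xb, W1, w2).T

def nystrom(K_nm, K_mm, rcond=1e-10):
    """Nystrom approximation K ~= K_nm K_mm^+ K_nm^T built from m landmark points."""
    evals, evecs = np.linalg.eigh(K_mm)
    keep = evals > rcond * evals.max()                 # drop numerically zero directions
    U = K_nm @ evecs[:, keep] / np.sqrt(evals[keep])   # (n, r) low-rank kernel features
    return U @ U.T

def relative_error(m, n=64, d=3, h=4, seed=0):
    """Frobenius-norm error of the Nystrom NTK using the first m inputs as landmarks."""
    rng = np.random.default_rng(seed)
    W1, w2 = rng.normal(size=(h, d)), rng.normal(size=h)   # hypothetical toy weights
    X = rng.normal(size=(n, d))
    K = ntk(X, X, W1, w2)                                   # exact (n, n) NTK for reference
    K_hat = nystrom(K[:, :m], K[:m, :m])
    return np.linalg.norm(K - K_hat) / np.linalg.norm(K)

err_small = relative_error(m=4)    # few landmarks: coarse approximation
err_large = relative_error(m=32)   # more landmarks than parameters (16)
print(f"relative error, m=4:  {err_small:.3e}")
print(f"relative error, m=32: {err_large:.3e}")
```

In this sketch the toy model has 16 parameters, so its NTK has rank at most 16. Four landmarks therefore cannot capture the full kernel, while 32 landmarks generically span it. In a real network the full kernel matrix would never be formed; only the landmark columns would be computed.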
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Bayesian Modeling and Causal Inference · Machine Learning and Data Classification
