Scalable Linearized Laplace Approximation via Surrogate Neural Kernel
Luis A. Ortega, Simón Rodríguez-Santana, Daniel Hernández-Lobato

TL;DR
This paper presents a scalable approach to approximate the Linearized Laplace Approximation kernel using a surrogate neural network that learns a compact feature space, enabling efficient uncertainty estimation on large-scale pre-trained neural networks.
Contribution
It introduces a novel surrogate neural kernel method that avoids large Jacobian computations, improving scalability and accuracy in uncertainty estimation for pre-trained DNNs.
Findings
Achieves similar or better uncertainty estimation and calibration than existing LLA approximations.
Enhances out-of-distribution detection by biasing the learned kernel.
Enables efficient computation of predictive uncertainty on large-scale models.
Abstract
We introduce a scalable method to approximate the kernel of the Linearized Laplace Approximation (LLA). For this, we use a surrogate deep neural network (DNN) that learns a compact feature representation whose inner product replicates the Neural Tangent Kernel (NTK). This avoids the need to compute large Jacobians. Training relies solely on efficient Jacobian-vector products, which makes it possible to compute predictive uncertainty on large-scale pre-trained DNNs. Experimental results show similar or improved uncertainty estimation and calibration compared to existing LLA approximations. Furthermore, biasing the learned kernel significantly enhances out-of-distribution detection. This highlights the benefit of the proposed method: it can find better kernels than the NTK in the context of LLA for computing predictive uncertainty given a pre-trained DNN.
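The reason Jacobian-vector products (JVPs) can stand in for full Jacobians can be illustrated on a toy model. For a random parameter direction v with E[v vᵀ] = I, the expectation of (J(x) v)(J(x') v) equals J(x) J(x')ᵀ, which is the NTK entry for x and x'. The sketch below checks this identity numerically. It is not the authors' training procedure: the toy model, the function names (`jvp`, `exact_ntk`, `jvp_ntk_estimate`), and the sample count are assumptions made for illustration only.

```python
import math
import random

# Hypothetical toy model: f(x) = sum_j a[j] * tanh(w[j] * x), with scalar input x.

def jvp(x, w, a, dw, da):
    """Directional derivative of f(x) w.r.t. (w, a) along (dw, da).

    Evaluated in forward mode, so the Jacobian row is never stored.
    """
    total = 0.0
    for wj, aj, dwj, daj in zip(w, a, dw, da):
        t = math.tanh(wj * x)
        total += daj * t + aj * (1.0 - t * t) * x * dwj
    return total

def exact_ntk(x1, x2, w, a):
    """Reference NTK entry J(x1) . J(x2) from explicit Jacobian rows."""
    def jac(x):
        ts = [math.tanh(wj * x) for wj in w]
        d_w = [aj * (1.0 - t * t) * x for aj, t in zip(a, ts)]
        return d_w + ts  # derivatives w.r.t. w, then w.r.t. a
    return sum(p * q for p, q in zip(jac(x1), jac(x2)))

def jvp_ntk_estimate(x1, x2, w, a, n_samples, rng):
    """Monte Carlo estimate of the same NTK entry using only JVPs."""
    acc = 0.0
    for _ in range(n_samples):
        # Standard normal direction, so E[v v^T] is the identity.
        dw = [rng.gauss(0.0, 1.0) for _ in w]
        da = [rng.gauss(0.0, 1.0) for _ in a]
        acc += jvp(x1, w, a, dw, da) * jvp(x2, w, a, dw, da)
    return acc / n_samples

rng = random.Random(0)
w = [rng.gauss(0.0, 1.0) for _ in range(8)]
a = [rng.gauss(0.0, 1.0) for _ in range(8)]
k_exact = exact_ntk(0.5, -1.2, w, a)
k_est = jvp_ntk_estimate(0.5, -1.2, w, a, 20000, rng)
print(k_exact, k_est)
```

In a real network the JVP would come from forward-mode automatic differentiation rather than a hand-written derivative, and the products of JVPs would serve as regression targets for the inner product of the surrogate's features. How the paper actually sets up that objective is not specified in the abstract.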
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Gaussian Processes and Bayesian Inference · Model Reduction and Neural Networks
