Optimality of Sub-network Laplace Approximations: New Results and Methods

Swarnali Raha; Kshitij Khare; Rohit K Patra

arXiv:2605.09075·stat.ML·May 12, 2026

TL;DR

This paper gives a theoretical analysis of sub-network Laplace approximations in neural networks, shows that they are biased toward underestimating predictive variance, and proposes two principled, optimal methods for subset selection that perform strongly in experiments.

Contribution

The paper presents a rigorous analysis of sub-network Laplace methods, proves that they underestimate predictive variance, and proposes Gradient-Laplace and Greedy-Laplace for optimal subset selection.

Findings

01 Sub-network Laplace methods systematically underestimate predictive variance.

02 The bias decreases as the sub-matrix size increases.

03 Proposed methods outperform existing heuristics in numerical experiments.
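
The first two findings can be checked numerically on a toy problem. The sketch below is not taken from the paper: it assumes the usual linearized form of the Laplace predictive variance, g^T H^{-1} g, and models a sub-network approximation as the same quadratic form restricted to a parameter subset S, that is g_S^T (H_SS)^{-1} g_S. The matrix `H`, the gradient `g`, the helper `subnetwork_variance`, and the random subset ordering are all hypothetical stand-ins. In particular, the ordering is random and is neither Gradient-Laplace nor Greedy-Laplace.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 8  # number of parameters (toy size, chosen for illustration)

# Hypothetical positive-definite posterior precision (e.g. Hessian plus prior precision).
A = rng.standard_normal((p, p))
H = A @ A.T + p * np.eye(p)

# Hypothetical gradient of the network output w.r.t. the parameters at one test input.
g = rng.standard_normal(p)


def subnetwork_variance(H, g, idx):
    """Linearized predictive variance using only the parameters in idx."""
    idx = np.asarray(idx)
    H_sub = H[np.ix_(idx, idx)]  # principal sub-matrix of the precision
    g_sub = g[idx]               # matching entries of the gradient
    return float(g_sub @ np.linalg.solve(H_sub, g_sub))


# Full Laplace variance: the subset is every parameter.
full_var = subnetwork_variance(H, g, np.arange(p))

# Nested subsets of growing size, built from a fixed random ordering.
order = rng.permutation(p)
nested_vars = [subnetwork_variance(H, g, order[:k]) for k in range(1, p + 1)]

for k, v in enumerate(nested_vars, start=1):
    print(f"k={k}: sub-network variance={v:.4f}, gap to full={full_var - v:.4f}")
```

Under these assumptions, every sub-network variance is at most the full variance, and the gap shrinks (weakly) as the nested subset grows, reaching zero when all parameters are included. This follows from the Schur complement of a positive-definite matrix and holds for any ordering. Which ordering closes the gap fastest is the subset-selection question the paper addresses.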

Abstract

Although the Laplace approximation offers a simple route to uncertainty quantification in deep neural networks, its reliance on inverting large Hessian matrices has motivated a range of computationally feasible low-dimensional or sparse approximations. A prominent class of such methods, sub-network Laplace approximations, constructs surrogates by restricting attention to a small subset of parameters. Existing approaches in this family typically rely on diagonal, layer-wise, or other architectural heuristics for subset selection, which ignore cross-parameter interactions and lack formal optimality guarantees. In this paper, we provide a rigorous theoretical analysis of the sub-network Laplace paradigm. We prove that all sub-network Laplace methods systematically underestimate the predictive variance of the full Laplace posterior, and that this bias decreases monotonically as the…
