Adapting the Linearised Laplace Model Evidence for Modern Deep Learning
Javier Antorán, David Janz, James Urquhart Allingham, Erik Daxberger, Riccardo Barbano, Eric Nalisnick, José Miguel Hernández-Lobato

TL;DR
This paper critically examines the linearised Laplace method for model uncertainty estimation in deep learning, identifying limitations with modern architectures and proposing adaptations supported by theory and experiments.
Contribution
It analyses the assumptions behind the linearised Laplace method, shows where they conflict with now-standard deep learning tools, and offers practical recommendations for adapting the method.
Findings
The assumptions behind the method interact poorly with stochastic approximation methods and normalisation layers.
Proposed adaptations improve model evidence estimation in modern architectures.
Empirical validation on MLPs, classic CNNs, residual networks with and without normalisation layers, generative autoencoders and transformers supports the recommendations.
Abstract
The linearised Laplace method for estimating model uncertainty has received renewed attention in the Bayesian deep learning community. The method provides reliable error bars and admits a closed-form expression for the model evidence, allowing for scalable selection of model hyperparameters. In this work, we examine the assumptions behind this method, particularly in conjunction with model selection. We show that these interact poorly with some now-standard tools of deep learning (stochastic approximation methods and normalisation layers) and make recommendations for how to better adapt this classic method to the modern setting. We provide theoretical support for our recommendations and validate them empirically on MLPs, classic CNNs, residual networks with and without normalisation layers, generative autoencoders and transformers.
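Illustration
To make the "closed-form expression for the model evidence" concrete, here is a minimal sketch that is not taken from the paper. It assumes a model that is already linear in its parameters, so the linearisation step is exact, together with a Gaussian likelihood and an isotropic Gaussian prior with precision alpha. The function name log_evidence, the toy data and the grid of alpha values are all hypothetical choices made for illustration.

```python
import numpy as np

def log_evidence(Phi, y, alpha, noise_var):
    """Laplace log evidence for y = Phi @ w + noise, with prior w ~ N(0, I / alpha).

    Illustrative assumption: the model is linear in w, so the Laplace
    approximation (and hence the evidence below) is exact.
    """
    n, d = Phi.shape
    H = Phi.T @ Phi / noise_var                 # Hessian of the negative log-likelihood
    precision = H + alpha * np.eye(d)           # posterior precision at the mode
    w_map = np.linalg.solve(precision, Phi.T @ y / noise_var)
    resid = y - Phi @ w_map
    log_lik = (-0.5 * n * np.log(2.0 * np.pi * noise_var)
               - 0.5 * (resid @ resid) / noise_var)
    # The (2*pi)^(d/2) factors of the prior normaliser and of the Gaussian
    # integral cancel, leaving only these two prior-dependent terms.
    log_prior = -0.5 * alpha * (w_map @ w_map) + 0.5 * d * np.log(alpha)
    _, logdet = np.linalg.slogdet(precision)
    return log_lik + log_prior - 0.5 * logdet

# Hypothetical toy data for the sketch.
rng = np.random.default_rng(0)
Phi = rng.normal(size=(20, 3))
noise_var = 0.1
y = Phi @ np.array([1.0, -2.0, 0.5]) + np.sqrt(noise_var) * rng.normal(size=20)

# Hyperparameter selection: keep the prior precision with the highest evidence.
alphas = [0.01, 0.1, 1.0, 10.0, 100.0]
scores = [log_evidence(Phi, y, a, noise_var) for a in alphas]
best_alpha = alphas[int(np.argmax(scores))]
print(best_alpha)
```

For a neural network, Phi would be replaced by the Jacobian of the network output with respect to the weights at the point of linearisation. The paper's recommendations concern how to make that step behave well with stochastic optimisation and normalisation layers, which this toy example does not attempt to reproduce.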
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Model Reduction and Neural Networks · Generative Adversarial Networks and Image Synthesis
