An Empirical Analysis of the Advantages of Finite- v.s. Infinite-Width Bayesian Neural Networks

Jiayu Yao; Yaniv Yacoby; Beau Coker; Weiwei Pan; Finale Doshi-Velez

arXiv:2211.09184 · stat.ML · November 29, 2022

TL;DR

This paper empirically compares finite- and infinite-width Bayesian neural networks (BNNs) and finds that, when the model is mis-specified, finite-width BNNs can generalize better, partly because properties of their frequency spectrum let them adapt to the mismatch.

Contribution

It provides an empirical comparison of finite- and infinite-width BNNs, with quantitative and qualitative explanations for how their performance and generalization differ, especially under model misspecification.

Findings

01 Finite-width BNNs can outperform infinite-width ones when models are mis-specified.

02 Model mismatch can negatively impact the performance of wider BNNs.

03 Finite-width BNNs' spectral properties enable better adaptation under certain conditions.

Abstract

Comparing Bayesian neural networks (BNNs) with different widths is challenging because, as the width increases, multiple model properties change simultaneously, and inference in the finite-width case is intractable. In this work, we empirically compare finite- and infinite-width BNNs, and provide quantitative and qualitative explanations for their performance difference. We find that when the model is mis-specified, increasing width can hurt BNN performance. In these cases, we provide evidence that finite-width BNNs generalize better partially due to the properties of their frequency spectrum that allow them to adapt under model mismatch.
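As a rough illustration of what changes with width, the sketch below samples prior function values from a one-hidden-layer ReLU BNN at several widths and compares them with the corresponding infinite-width Gaussian process. It is not the paper's experimental setup: the architecture, the 1/width output-weight scaling, the input point, and all function names are assumptions chosen for illustration. Under this scaling the prior variance is the same at every width and matches the analytic kernel, while the excess kurtosis of the output (zero for a Gaussian) shrinks as the width grows.

```python
import numpy as np

def relu_kernel(x1, x2):
    """Analytic infinite-width covariance E[relu(w.x1) * relu(w.x2)] for w ~ N(0, I)."""
    n1, n2 = np.linalg.norm(x1), np.linalg.norm(x2)
    cos_t = np.clip(x1 @ x2 / (n1 * n2), -1.0, 1.0)
    theta = np.arccos(cos_t)
    return n1 * n2 * (np.sin(theta) + (np.pi - theta) * cos_t) / (2 * np.pi)

def sample_prior_outputs(x, width, n_samples, rng):
    """Draw prior samples of f(x) = sum_j v_j * relu(w_j . x) for a finite-width BNN."""
    w = rng.standard_normal((n_samples, width, x.shape[0]))       # hidden weights ~ N(0, 1)
    v = rng.standard_normal((n_samples, width)) / np.sqrt(width)  # output weights ~ N(0, 1/width)
    hidden = np.maximum(w @ x, 0.0)                               # shape (n_samples, width)
    return np.sum(v * hidden, axis=1)

def excess_kurtosis(f):
    """Sample excess kurtosis; approximately 0 for Gaussian-distributed samples."""
    f = f - f.mean()
    return np.mean(f**4) / np.mean(f**2) ** 2 - 3.0

rng = np.random.default_rng(0)
x = np.array([0.5, 1.0])          # assumed input 0.5 with a constant bias feature of 1
gp_variance = relu_kernel(x, x)   # prior variance of the infinite-width limit

results = {}
for width in (1, 8, 128):
    f = sample_prior_outputs(x, width, 20000, rng)
    results[width] = (f.var(), excess_kurtosis(f))
    print(f"width={width:4d}  variance={results[width][0]:.3f}  "
          f"excess kurtosis={results[width][1]:.3f}")
print(f"infinite-width GP variance: {gp_variance:.3f}")
```

The point of the sketch is only that a finite-width prior is non-Gaussian even when its second moments agree with the infinite-width limit, which is one example of a property that shifts as width increases.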

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and Data Classification · Anomaly Detection Techniques and Applications