Occam's Razor is Only as Sharp as Your ELBO

Ethan Harvey; Michael C. Hughes

arXiv:2604.25984·stat.ML·April 30, 2026

TL;DR

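This paper examines how the evidence and the ELBO behave as model-selection criteria. It shows that ELBO-based hyperparameter learning can overfit in a simple over-parameterized regression model, and that whether it does depends on the assumed rank of the covariance matrix in the Gaussian approximate posterior.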

Contribution

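The paper demonstrates that ELBO-based hyperparameter learning can overfit, complementing prior work showing that a mean-field approximation can lead the ELBO to underfit. It also highlights the limitations of using the ELBO for model selection in large, over-parameterized models.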

Findings

01

ELBO-based hyperparameter learning can overfit, depending on the assumed rank of the covariance matrix in the Gaussian approximate posterior.

02

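When the choice is restricted to only the underfit and overfit options, Bayesian model selection via the evidence sometimes prefers the overfit version, while the ELBO does not.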

03

Restrictions on the approximate posterior, such as mean-field or reduced-rank covariance assumptions, change which models ELBO-based selection favors.

Abstract

The marginal likelihood, also known as the evidence, is regarded as a mathematical embodiment of Occam's razor, enabling model selection that avoids overfitting. The evidence lower bound (ELBO) objective from variational inference has also been used for similar purposes. Prior work has shown that restricting the approximate posterior family via a mean-field approximation can lead the ELBO to underfit. In this paper, we show how ELBO-based hyperparameter learning in a simple over-parameterized regression model can also produce overfitting, depending on the assumed rank of the covariance matrix in a Gaussian approximate posterior. Surprisingly, among only the underfit and overfit options, Bayesian model selection via the evidence itself sometimes prefers the overfit version, while the ELBO does not. Bayesian practitioners hoping to scale to large models should be cautious about how…
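To make the relationship between the evidence and the ELBO concrete, here is a minimal sketch in Python with NumPy. It is not the model or experiment from the paper: it assumes a plain Bayesian linear regression with prior w ~ N(0, I/alpha) and Gaussian noise of variance sigma2, and every function name, the synthetic data, and the hyperparameter values are illustrative assumptions. It shows only that the ELBO equals the log evidence when the Gaussian approximate posterior is the exact posterior, and falls below it when the covariance is restricted to be diagonal (the mean-field case the abstract attributes to prior work). It does not perform hyperparameter learning or reproduce the rank-dependent overfitting result.

```python
import numpy as np

def log_evidence(X, y, alpha, sigma2):
    """Log marginal likelihood: y ~ N(0, sigma2 * I + X X^T / alpha)."""
    n = X.shape[0]
    K = sigma2 * np.eye(n) + (X @ X.T) / alpha
    _, logdet = np.linalg.slogdet(K)
    quad = y @ np.linalg.solve(K, y)
    return -0.5 * (n * np.log(2 * np.pi) + logdet + quad)

def elbo(X, y, alpha, sigma2, m, S):
    """ELBO for q(w) = N(m, S): E_q[log p(y | w)] - KL(q || prior)."""
    n, d = X.shape
    resid = y - X @ m
    exp_loglik = (-0.5 * n * np.log(2 * np.pi * sigma2)
                  - 0.5 * (resid @ resid + np.trace(X @ S @ X.T)) / sigma2)
    _, logdet_S = np.linalg.slogdet(S)
    kl = 0.5 * (alpha * np.trace(S) + alpha * (m @ m)
                - d - d * np.log(alpha) - logdet_S)
    return exp_loglik - kl

def exact_posterior(X, y, alpha, sigma2):
    """Exact Gaussian posterior over weights, plus its precision matrix."""
    d = X.shape[1]
    precision = alpha * np.eye(d) + (X.T @ X) / sigma2
    S = np.linalg.inv(precision)
    m = S @ (X.T @ y) / sigma2
    return m, S, precision

# Synthetic over-parameterized problem (assumed values): more weights than data.
rng = np.random.default_rng(0)
n, d = 10, 30
alpha, sigma2 = 1.0, 0.1
X = rng.normal(size=(n, d))
w_true = 0.3 * rng.normal(size=d)
y = X @ w_true + np.sqrt(sigma2) * rng.normal(size=n)

m, S_full, precision = exact_posterior(X, y, alpha, sigma2)
# Best diagonal Gaussian under KL(q || posterior): same mean,
# variances equal to the reciprocal of the precision diagonal.
S_diag = np.diag(1.0 / np.diag(precision))

log_ev = log_evidence(X, y, alpha, sigma2)
elbo_full = elbo(X, y, alpha, sigma2, m, S_full)
elbo_diag = elbo(X, y, alpha, sigma2, m, S_diag)

print(f"log evidence:      {log_ev:.4f}")
print(f"ELBO, full cov:    {elbo_full:.4f}")
print(f"ELBO, diagonal:    {elbo_diag:.4f}")
```

The gap between the log evidence and each ELBO is the KL divergence from the approximate posterior to the exact posterior, so any restriction on the covariance (diagonal here, reduced-rank in the paper's setting) changes the objective that hyperparameters such as alpha and sigma2 would be tuned against.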
