Theoretical Limitations of Ensembles in the Age of Overparameterization
Niclas Dern, John P. Cunningham, Geoff Pleiss

TL;DR
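This paper theoretically analyzes overparameterized ensembles, using ensembles of random feature (RF) regressors as a tractable stand-in for neural network ensembles. It shows that such ensembles behave much like single larger models, which calls into question their assumed generalization advantage in modern deep learning.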
Contribution
It proves, under minimal assumptions, that infinite ensembles of overparameterized RF regressors are pointwise equivalent to single infinite-width RF regressors, challenging the traditional belief that ensembling inherently improves generalization.
Findings
Finite-width overparameterized ensembles rapidly converge to single models with the same parameter budget.
Ensemble variance reflects capacity effects, not uncertainty.
Ensembles offer no inherent generalization advantage in overparameterized regimes.
Abstract
Classic ensembles generalize better than any single component model. In contrast, recent empirical studies find that modern ensembles of (overparameterized) neural networks may not provide any inherent generalization advantage over single but larger neural networks. This paper clarifies how modern overparameterized ensembles differ from their classic underparameterized counterparts, using ensembles of random feature (RF) regressors as a basis for developing theory. In contrast to the underparameterized regime, where ensembling typically induces regularization and increases generalization, we prove with minimal assumptions that infinite ensembles of overparameterized RF regressors become pointwise equivalent to (single) infinite-width RF regressors, and finite width ensembles rapidly converge to single models with the same parameter budget. These results, which are exact for ridgeless…
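The equivalence described in the abstract can be probed numerically. The following is a minimal sketch, not the authors' experimental setup: it assumes ReLU random features with Gaussian weights, a synthetic sine target, and illustrative sizes (`N`, `d`, `D`, `M` are all hypothetical choices). It fits ridgeless (minimum-norm) RF regressors and compares an ensemble of `M` models of width `D` against a single model with the same total parameter budget `M * D`.

```python
import numpy as np

def relu_features(X, W):
    # Random ReLU feature map, scaled by 1/sqrt(D) where D = number of features.
    return np.maximum(X @ W.T, 0.0) / np.sqrt(W.shape[0])

def fit_predict_ridgeless(X_train, y_train, X_test, D, rng):
    # Draw D random Gaussian feature directions (an illustrative assumption).
    W = rng.standard_normal((D, X_train.shape[1]))
    Phi_train = relu_features(X_train, W)
    Phi_test = relu_features(X_test, W)
    # For D > N the system is underdetermined; lstsq returns the
    # minimum-norm interpolating solution, i.e. the ridgeless regressor.
    theta, *_ = np.linalg.lstsq(Phi_train, y_train, rcond=None)
    return Phi_test @ theta

rng = np.random.default_rng(0)
N, d, D, M = 20, 5, 200, 50  # D > N: each ensemble member is overparameterized

X_train = rng.standard_normal((N, d))
y_train = np.sin(X_train[:, 0]) + 0.1 * rng.standard_normal(N)
X_test = rng.standard_normal((200, d))

# Ensemble: average the predictions of M independently drawn width-D models.
ensemble_pred = np.mean(
    [fit_predict_ridgeless(X_train, y_train, X_test, D, rng) for _ in range(M)],
    axis=0,
)
# Baselines: one width-D model, and one model with the full budget M * D.
small_pred = fit_predict_ridgeless(X_train, y_train, X_test, D, rng)
large_pred = fit_predict_ridgeless(X_train, y_train, X_test, M * D, rng)

gap_ensemble = np.mean(np.abs(ensemble_pred - large_pred))
gap_small = np.mean(np.abs(small_pred - large_pred))
print(f"mean |ensemble - large model|     = {gap_ensemble:.4f}")
print(f"mean |single small - large model| = {gap_small:.4f}")
```

If the claimed convergence holds in this setting, the ensemble's predictions should sit noticeably closer to the large single model than an individual small member does. Increasing `M` and `D` would be expected to shrink the first gap further.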
Taxonomy
Topics: Opinion Dynamics and Social Influence · Theoretical and Computational Physics
Methods: Deep Ensembles
