Double Descent and Emergent Smoothing in Model Averaging Prediction

Ke Chen, Dandan Jiang, and Xinyu Zhang

arXiv:2605.13203·stat.ME·May 14, 2026

TL;DR

This paper shows that double descent arises in high-dimensional linear model averaging, identifies an emergent smoothing effect of weighted aggregation that stabilizes predictions near the interpolation boundary, and proposes the LaMA method for improved prediction accuracy.

Contribution

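The paper establishes double descent behavior in model averaging, characterizes the risk landscape by deriving the exact limiting out-of-sample risk under a nested model setting with random matrix theory, and introduces Large Model Averaging (LaMA), a new method that balances bias and variance for better predictions.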

Findings

1. Double descent occurs in model averaging near the interpolation boundary.
2. Weighted aggregation induces a smoothing effect that reduces risk divergence.
3. LaMA outperforms existing methods in high-dimensional prediction tasks.

Abstract

This paper investigates the predictive performance of model averaging in high-dimensional linear regression where the number of regressors is comparable to the sample size. We demonstrate that the double descent trajectory manifests within the model averaging framework, where the ensemble inherits the variance explosion of individual models near the interpolation boundary. However, we reveal that weighted aggregation simultaneously triggers an emergent smoothing effect that structurally suppresses the localized risk divergence, indicating that strategic weight choice serves as a vital stabilizing mechanism. Leveraging tools from random matrix theory, we derive the exact limiting out-of-sample risk under a nested model setting and provide a comprehensive characterization of the risk landscape. Building on these asymptotic results, we propose the Large Model Averaging (LaMA) method, which…
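The setting described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's LaMA method (the abstract is truncated before it is described). It is a minimal illustration under assumptions that are ours, not the authors': an isotropic Gaussian design, minimum-norm least squares fits on nested sets of regressors, and equal averaging weights. The function name `nested_risks` and all parameter values are hypothetical. Because the test features are assumed isotropic, the excess out-of-sample risk of a coefficient vector b equals the squared distance between b and the true coefficients, so no test sample is needed.

```python
import numpy as np


def nested_risks(n=50, p_max=150, step=10, sigma=1.0, n_rep=50, seed=0):
    """Monte Carlo excess risk of nested min-norm least squares fits
    and of their equal-weight average (illustrative assumptions only)."""
    rng = np.random.default_rng(seed)
    sizes = np.arange(step, p_max + 1, step)            # nested model sizes
    beta = rng.standard_normal(p_max) / np.sqrt(p_max)  # fixed true coefficients
    weights = np.full(len(sizes), 1.0 / len(sizes))     # assumed equal weights

    single = np.zeros(len(sizes))
    averaged = 0.0
    for _ in range(n_rep):
        X = rng.standard_normal((n, p_max))
        y = X @ beta + sigma * rng.standard_normal(n)

        # Row k holds the k-th nested fit, zero-padded to length p_max.
        coefs = np.zeros((len(sizes), p_max))
        for k, p in enumerate(sizes):
            # pinv gives ordinary least squares when p < n and the
            # minimum-norm interpolator when p > n.
            coefs[k, :p] = np.linalg.pinv(X[:, :p]) @ y

        # Under isotropic test features, excess risk is ||b - beta||^2.
        single += ((coefs - beta) ** 2).sum(axis=1)
        averaged += ((weights @ coefs - beta) ** 2).sum()

    return sizes, single / n_rep, averaged / n_rep


if __name__ == "__main__":
    sizes, single, averaged = nested_risks()
    for p, r in zip(sizes, single):
        print(f"p = {p:3d}  excess risk = {r:10.3f}")
    print(f"equal-weight average: {averaged:.3f}")
```

With these settings the single-model risk should spike around p = n = 50 and fall again for larger p, which is the double descent shape. For any weights that are nonnegative and sum to one, convexity of the squared norm guarantees that the averaged risk is no larger than the weighted mean of the individual risks; how much further a strategic weight choice suppresses the spike is what the paper's analysis addresses.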
