Precise Asymptotics of Bagging Regularized M-estimators

Takuya Koriyama; Pratik Patil; Jin-Hong Du; Kai Tan; Pierre C. Bellec

arXiv:2409.15252 · math.ST · September 30, 2025

Open Access

TL;DR

This paper provides a detailed asymptotic analysis of the prediction risk for ensemble regularized M-estimators obtained via subagging, revealing how ensemble size and subsample size influence regularization effects.

Contribution

The paper introduces a new asymptotic framework for analyzing the joint behavior of ensembles of regularized estimators whose subsample sizes and regularization may vary, extending previous results to more general settings.

Findings

01

Optimal subsample size tends to be in the overparameterized regime.

02

Ensemble size and subsample size jointly influence regularization effects.

03

Joint optimization can outperform regularization alone on full data.

Abstract

We characterize the squared prediction risk of ensemble estimators obtained through subagging (subsample bootstrap aggregating) regularized M-estimators and construct a consistent estimator for the risk. Specifically, we consider a heterogeneous collection of $M \geq 1$ regularized M-estimators, each trained with (possibly different) subsample sizes, convex differentiable losses, and convex regularizers. We operate under the proportional asymptotics regime, where the sample size $n$, feature size $p$, and subsample sizes $k_{m}$ for $m \in [M]$ all diverge with fixed limiting ratios $n / p$ and $k_{m} / n$. Key to our analysis is a new result on the joint asymptotic behavior of correlations between the estimator and residual errors on overlapping subsamples, governed through a (provably) contractive nonlinear system of equations. Of independent interest, we also establish convergence of trace…
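The subagging setup described in the abstract can be illustrated with a minimal sketch. It is not the paper's code or its risk estimator. It assumes the simplest instance of the setting, squared loss with a ridge penalty, and approximates the squared prediction risk on held-out simulated data. The function names `ridge_fit` and `subagged_ridge`, the Gaussian data model, and all numeric values are hypothetical choices made for illustration.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Ridge estimator: argmin_b ||y - X b||^2 / k + lam * ||b||^2 (illustrative loss/penalty pair)."""
    k, p = X.shape
    # Closed-form solution; lam > 0 keeps the system invertible even when k < p.
    return np.linalg.solve(X.T @ X / k + lam * np.eye(p), X.T @ y / k)


def subagged_ridge(X, y, subsample_sizes, lam, rng):
    """Average M ridge estimators, each fit on a subsample drawn without replacement."""
    n = X.shape[0]
    betas = []
    for k in subsample_sizes:  # one (possibly different) subsample size k_m per component
        idx = rng.choice(n, size=k, replace=False)
        betas.append(ridge_fit(X[idx], y[idx], lam))
    return np.mean(betas, axis=0)


# Simulated data (assumed Gaussian design and noise, chosen only for the demo).
rng = np.random.default_rng(0)
n, p = 400, 200
beta_true = rng.normal(size=p) / np.sqrt(p)
X = rng.normal(size=(n, p))
y = X @ beta_true + rng.normal(size=n)

# Ensemble of M = 3 components; the first uses k_m < p (overparameterized subsample).
beta_ens = subagged_ridge(X, y, subsample_sizes=[150, 200, 250], lam=0.1, rng=rng)

# Monte Carlo approximation of the squared prediction risk on fresh test points.
X_test = rng.normal(size=(2000, p))
y_test = X_test @ beta_true + rng.normal(size=2000)
risk = np.mean((y_test - X_test @ beta_ens) ** 2)
print(f"held-out squared prediction risk: {risk:.3f}")
```

Varying `subsample_sizes`, the number of components, and `lam` in this sketch gives a rough empirical feel for the interplay between ensemble size, subsample size, and regularization that the paper analyzes asymptotically.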

Taxonomy

Topics: Statistical Methods and Inference