A Characterization of Mean Squared Error for Estimator with Bagging
Martin Mihelich, Charles Dognin, Yan Shu, Michael Blot

TL;DR
This paper analyzes theoretically how bagging affects the Mean Squared Error (MSE) of a statistical estimator, with a focus on the unbiased sample variance, and identifies the conditions under which bagging lowers the MSE and those under which it does not.
Contribution
It proves that, for any estimator, increasing the number of bagged estimators in the average can only reduce the MSE (it never increases it), and it derives an exact MSE expression for the bagged unbiased sample variance, highlighting the role of kurtosis.
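One way to see why the MSE cannot grow with the number of bagged estimators is the conditional decomposition below. It is a sketch, not the paper's own proof: it assumes the N resamples are drawn i.i.d. given the original sample X, and the symbols (\theta for the target, \hat\theta(X^{*i}) for the estimator on the i-th resample, \mu(X) and v(X) for its conditional mean and variance) are illustrative notation rather than the paper's.

```latex
\bar\theta_N = \frac{1}{N}\sum_{i=1}^{N} \hat\theta\bigl(X^{*i}\bigr),
\qquad
\mu(X) = \mathbb{E}\bigl[\hat\theta(X^{*1}) \mid X\bigr],
\qquad
v(X) = \operatorname{Var}\bigl(\hat\theta(X^{*1}) \mid X\bigr).

\begin{aligned}
\operatorname{MSE}(N)
  &= \mathbb{E}\bigl[(\bar\theta_N - \theta)^2\bigr] \\
  &= \mathbb{E}\Bigl[\mathbb{E}\bigl[(\bar\theta_N - \mu(X))^2 \mid X\bigr]\Bigr]
     + \mathbb{E}\bigl[(\mu(X) - \theta)^2\bigr] \\
  &= \frac{1}{N}\,\mathbb{E}\bigl[v(X)\bigr]
     + \mathbb{E}\bigl[(\mu(X) - \theta)^2\bigr].
\end{aligned}
```

The cross term vanishes because \mathbb{E}[\bar\theta_N - \mu(X) \mid X] = 0. Under these assumptions only the first term depends on N, and it shrinks as 1/N, so the MSE is non-increasing in N and levels off at \mathbb{E}[(\mu(X) - \theta)^2].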
Findings
Increasing the number N of bagged estimators can only reduce the MSE.
Bagging improves variance estimation only if the kurtosis of the distribution is greater than 1.5.
Proposes a new algorithm for high-precision variance estimation.
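The first finding can be checked numerically with a small Monte Carlo sketch. This is an illustration only, not the algorithm proposed in the paper: the function names `bagged_variance` and `mc_mse` are hypothetical, the data are assumed standard normal (true variance 1), and resamples are assumed to be drawn with replacement, which may differ from the paper's exact resampling scheme.

```python
import numpy as np


def bagged_variance(sample, n_bags, bag_size, rng):
    """Average the unbiased sample variance over n_bags resamples.

    Assumption: each resample has bag_size >= 2 points drawn with replacement.
    """
    idx = rng.integers(0, len(sample), size=(n_bags, bag_size))
    # One unbiased variance (ddof=1) per resample, then the bagged average.
    return float(sample[idx].var(axis=1, ddof=1).mean())


def mc_mse(n_bags, bag_size, n=20, trials=2000, seed=0):
    """Monte Carlo estimate of the MSE of the bagged variance estimator."""
    rng = np.random.default_rng(seed)
    true_var = 1.0  # variance of the standard normal used below
    sq_errors = np.empty(trials)
    for t in range(trials):
        sample = rng.standard_normal(n)
        estimate = bagged_variance(sample, n_bags, bag_size, rng)
        sq_errors[t] = (estimate - true_var) ** 2
    return float(sq_errors.mean())


if __name__ == "__main__":
    for n_bags in (1, 5, 50):
        print(n_bags, mc_mse(n_bags, bag_size=20))
```

Running the script prints the estimated MSE for N = 1, 5 and 50 bagged estimators; under these assumptions the values should decrease as N grows, up to Monte Carlo noise. Replacing `rng.standard_normal` with a sampler for another distribution (and updating `true_var` accordingly) gives a way to explore how the benefit of bagging depends on the distribution.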
Abstract
Bagging can significantly improve the generalization performance of unstable machine learning algorithms such as trees or neural networks. Though bagging is now widely used in practice and many empirical studies have explored its behavior, we still know little about the theoretical properties of bagged predictions. In this paper, we theoretically investigate how the bagging method can reduce the Mean Squared Error (MSE) when applied on a statistical estimator. First, we prove that for any estimator, increasing the number of bagged estimators in the average can only reduce the MSE. This intuitive result, observed empirically and discussed in the literature, has not yet been rigorously proved. Second, we focus on the standard estimator of variance called unbiased sample variance and we develop an exact analytical expression of the MSE for this estimator with bagging. This allows us…
Taxonomy
Topics: Statistical Methods and Inference · Machine Learning and Data Classification · Gaussian Processes and Bayesian Inference
