The Theory Behind Overfitting, Cross Validation, Regularization, Bagging, and Boosting: Tutorial

Benyamin Ghojogh; Mark Crowley

arXiv:1905.12787·stat.ML·May 23, 2023·73 cites

Open Access

TL;DR

This tutorial comprehensively explains the theoretical foundations of overfitting, cross validation, regularization, bagging, and boosting, including their mathematical formulations, error bounds, and practical examples across machine learning models.

Contribution

It provides a unified theoretical framework for understanding key ensemble and regularization techniques, linking them through bias-variance analysis and error bounds.

Findings

01

Overfitting is characterized by high variance and low bias.

02

Bagging reduces variance in estimators.

03

Boosting improves generalization by combining weak learners.
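The variance-reduction finding can be illustrated with a small simulation. The sketch below is not taken from the tutorial: the synthetic data (y = x plus Gaussian noise), the 1-nearest-neighbour base learner, and all function names are assumptions chosen only because 1-NN is a conveniently unstable estimator. It compares the variance of a single learner's prediction at one query point against the variance of a bagged average, across many independently drawn training sets.

```python
import random
import statistics

def one_nn_predict(train, x0):
    """Predict at x0 with the label of the nearest training point (a high-variance learner)."""
    return min(train, key=lambda p: abs(p[0] - x0))[1]

def bagged_predict(train, x0, n_models, rng):
    """Average 1-NN predictions over bootstrap resamples of the training set."""
    preds = []
    for _ in range(n_models):
        boot = [rng.choice(train) for _ in train]  # sample with replacement
        preds.append(one_nn_predict(boot, x0))
    return statistics.fmean(preds)

def sample_training_set(n, rng):
    """Draw n points from y = x + Gaussian noise (synthetic, illustrative only)."""
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    return [(x, x + rng.gauss(0.0, 1.0)) for x in xs]

def prediction_variances(n_trials=300, n_train=30, n_models=25, x0=0.5, seed=0):
    """Estimate the variance of each predictor at x0 over repeated training-set draws."""
    rng = random.Random(seed)
    single, bagged = [], []
    for _ in range(n_trials):
        train = sample_training_set(n_train, rng)
        single.append(one_nn_predict(train, x0))
        bagged.append(bagged_predict(train, x0, n_models, rng))
    return statistics.variance(single), statistics.variance(bagged)

var_single, var_bagged = prediction_variances()
print(f"single 1-NN prediction variance: {var_single:.3f}")
print(f"bagged 1-NN prediction variance: {var_bagged:.3f}")
```

Under these assumptions the bagged variance should come out noticeably smaller than the single-learner variance, because each bootstrap average spreads weight over several nearby training labels instead of relying on one noisy label.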

Abstract

In this tutorial paper, we first define mean squared error, variance, covariance, and bias of both random variables and classification/predictor models. Then, we formulate the true and generalization errors of the model for both training and validation/test instances, where we make use of Stein's Unbiased Risk Estimator (SURE). We define overfitting, underfitting, and generalization using the obtained true and generalization errors. We introduce cross validation and two well-known examples, $K$-fold and leave-one-out cross validation. We briefly introduce generalized cross validation and then move on to regularization, where we use SURE again. We work on both $\ell_2$ and $\ell_1$ norm regularization. Then, we show that bootstrap aggregating (bagging) reduces the variance of estimation. Boosting, specifically AdaBoost, is introduced and it is explained as both an…
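The $K$-fold and leave-one-out procedures named in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's formulation: the model (an ordinary least-squares line), the synthetic data, and the helper names `k_fold_indices`, `fit_line`, and `cross_val_mse` are all hypothetical. Leave-one-out is obtained by setting the number of folds equal to the number of samples.

```python
import random
import statistics

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_line(points):
    """Ordinary least squares for y = a*x + b on a list of (x, y) pairs."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in points) / sxx
    return a, my - a * mx

def cross_val_mse(data, k, seed=0):
    """Average validation MSE over k folds; each fold is held out exactly once."""
    fold_errors = []
    for held_out in k_fold_indices(len(data), k, seed):
        held = set(held_out)
        train = [p for i, p in enumerate(data) if i not in held]
        a, b = fit_line(train)  # fit on the other k-1 folds only
        errors = [(data[i][1] - (a * data[i][0] + b)) ** 2 for i in held_out]
        fold_errors.append(statistics.fmean(errors))
    return statistics.fmean(fold_errors)

# Synthetic data (an assumption for illustration): y = 2x + 1 plus Gaussian noise.
rng = random.Random(42)
data = [(x / 10.0, 2.0 * (x / 10.0) + 1.0 + rng.gauss(0.0, 0.3)) for x in range(40)]

print("5-fold CV MSE:      ", cross_val_mse(data, k=5))
print("leave-one-out MSE:  ", cross_val_mse(data, k=len(data)))
```

Both estimates average the error measured on points the model did not see during fitting, which is what makes them usable as proxies for generalization error.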

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Sparse and Compressive Sensing Techniques · Adversarial Robustness in Machine Learning

Methods: Early Stopping · Support Vector Machine