The generalization error of max-margin linear classifiers: Benign overfitting and high dimensional asymptotics in the overparametrized regime

Andrea Montanari; Feng Ruan; Youngtak Sohn; Jun Yan

arXiv:1911.01544 · math.ST · March 23, 2023 · 90 cites

Open Access

TL;DR

This paper analyzes the generalization error of max-margin linear classifiers in high-dimensional settings, revealing conditions for benign overfitting and providing exact error formulas, with implications for neural network features.

Contribution

The paper derives exact asymptotic expressions for the generalization error of max-margin classification in the high-dimensional, overparametrized regime, identifies conditions under which overfitting is benign, and argues via universality that the results reach beyond the stylized Gaussian model.

Findings

01

Exact formulas for generalization error in high-dimensional regimes

02

Conditions for benign overfitting in max-margin classifiers

03

Application to neural network feature representations

Abstract

Modern machine learning classifiers often exhibit vanishing classification error on the training set. They achieve this by learning nonlinear representations of the inputs that map the data into linearly separable classes. Motivated by these phenomena, we revisit high-dimensional maximum margin classification for linearly separable data. We consider a stylized setting in which data $(y_i, x_i)$, $i \leq n$, are i.i.d. with $x_i \sim N(0, \Sigma)$ a $p$-dimensional Gaussian feature vector, and $y_i \in \{+1, -1\}$ a label whose distribution depends on a linear combination of the covariates $\langle \theta_*, x_i \rangle$. While the Gaussian model might appear extremely simplistic, universality arguments can be used to show that the results derived in this setting also apply to the output of certain…
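
For orientation, the two objects the abstract refers to can be written out in standard notation. This is a sketch under assumptions: the symbols $\hat{\theta}$, $\mathrm{Err}$, and $(y_{\mathrm{new}}, x_{\mathrm{new}})$ do not appear in the excerpt above and are introduced here only for illustration, following the textbook hard-margin formulation rather than the paper's own statement. The max-margin classifier picks the unit-norm direction that maximizes the smallest signed margin over the training set, and the generalization error is the probability of misclassifying a fresh sample drawn from the same distribution:

$$
\hat{\theta} \in \arg\max_{\|\theta\|_2 \le 1} \; \min_{i \le n} \; y_i \langle \theta, x_i \rangle ,
\qquad
\mathrm{Err}(\hat{\theta}) = \mathbb{P}\big( y_{\mathrm{new}} \langle \hat{\theta}, x_{\mathrm{new}} \rangle \le 0 \big).
$$

In this formulation the data are linearly separable exactly when the maximum on the left is strictly positive, in which case $\hat{\theta}$ classifies every training point correctly.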

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and ELM · Stochastic Gradient Optimization Techniques