Universality of max-margin classifiers

Andrea Montanari; Feng Ruan; Basil Saeed; Youngtak Sohn

arXiv:2310.00176 · math.ST · October 3, 2023 · 2 cites

Open Access

TL;DR

The paper studies max-margin classifiers in the high-dimensional regime and shows that their asymptotic behavior depends only on feature covariances. Because of this universality, quantities such as the overparametrization threshold can be computed in a simpler Gaussian model.

Contribution

The paper establishes a universality result: the asymptotic behavior of the max-margin classifier depends only on feature covariances. The result covers both features with non-Gaussian independent entries and features produced by randomized featurization maps.

Findings

01. The overparametrization threshold can be computed via a Gaussian model.

02. The generalization error depends only on the feature covariance structure.

03. The support vector count scales proportionally with the sample size in high dimensions.
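
For reference, the quantities behind these findings can be written in the standard textbook form. The notation is an assumption made for illustration and may differ from the paper's: $x_{i}$ is the feature vector of sample $i$, $\theta$ is the weight vector, $\hat{\kappa}_{n}$ is the margin, and labels are taken as $y_{i} \in \{+1, -1\}$.

```latex
\hat{\kappa}_n \;=\; \max_{\|\theta\|_2 \le 1} \; \min_{1 \le i \le n} \; y_i \langle \theta, x_i \rangle ,
\qquad
\text{data linearly separable} \iff \hat{\kappa}_n > 0 .
```

In this notation, the overparametrization threshold is the value of $p / n$ beyond which $\hat{\kappa}_{n}$ is positive with high probability, and a support vector is a sample that attains the inner minimum.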

Abstract

Maximum margin binary classification is one of the most fundamental algorithms in machine learning, yet the role of featurization maps and the high-dimensional asymptotics of the misclassification error for non-Gaussian features are still poorly understood. We consider settings in which we observe binary labels $y_{i}$ and either $d$-dimensional covariates $z_{i}$ that are mapped to a $p$-dimensional space via a randomized featurization map $\phi : \mathbb{R}^{d} \to \mathbb{R}^{p}$, or $p$-dimensional features of non-Gaussian independent entries. In this context, we study two fundamental questions: (i) At what overparametrization ratio $p / n$ do the data become linearly separable? (ii) What is the generalization error of the max-margin classifier? Working in the high-dimensional regime in which the number of features $p$, the number of samples $n$ and the input…
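
Question (i) can be illustrated with a small simulation. The sketch below is not the paper's method: it uses a plain perceptron as a separability check, draws labels as pure noise independent of the features (a simplifying assumption not taken from the paper), and compares Gaussian entries with Rademacher (random sign) entries. For pure-noise labels and features in general position, Cover's classical counting argument places the separability threshold at $p / n = 1/2$, so the example tests one ratio well above it and one well below. The names `sample_features`, `perceptron_separates`, and `separable_fraction` are hypothetical helpers introduced here.

```python
import random


def sample_features(n, p, kind, rng):
    """Draw an n x p feature matrix with i.i.d. zero-mean, unit-variance entries."""
    if kind == "gaussian":
        return [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    if kind == "rademacher":
        return [[rng.choice((-1.0, 1.0)) for _ in range(p)] for _ in range(n)]
    raise ValueError(f"unknown feature kind: {kind}")


def perceptron_separates(X, y, max_epochs=200):
    """Return True if a perceptron finds theta with y_i * <theta, x_i> > 0 for all i.

    True certifies linear separability through the origin.
    False only means no separator was found within max_epochs passes.
    """
    theta = [0.0] * len(X[0])
    for _ in range(max_epochs):
        mistakes = 0
        for x_i, y_i in zip(X, y):
            score = sum(t * v for t, v in zip(theta, x_i))
            if y_i * score <= 0.0:
                # Standard perceptron update on a misclassified sample.
                theta = [t + y_i * v for t, v in zip(theta, x_i)]
                mistakes += 1
        if mistakes == 0:
            return True
    return False


def separable_fraction(n, p, kind, trials=3, seed=0):
    """Fraction of random datasets for which the perceptron finds a separator."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        X = sample_features(n, p, kind, rng)
        # Assumption: labels are pure noise, independent of the features.
        y = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        hits += int(perceptron_separates(X, y))
    return hits / trials


if __name__ == "__main__":
    for kind in ("gaussian", "rademacher"):
        for n, p in ((20, 80), (100, 10)):
            frac = separable_fraction(n, p, kind)
            print(f"{kind:10s} p/n={p / n:4.1f} separable fraction={frac:.2f}")
```

Both feature distributions should behave the same way on either side of the threshold, which is the qualitative pattern a universality result predicts. A perceptron with an epoch cap was chosen to keep the sketch dependency-free; a linear-programming feasibility check would give a definite answer near the threshold, where the perceptron may run out of passes.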

Taxonomy

Topics: Statistical Methods and Inference · Bayesian Methods and Mixture Models · Machine Learning and Data Classification