Statistical performance of support vector machines

Gilles Blanchard; Olivier Bousquet; Pascal Massart

arXiv:0804.0551 · math.ST · December 18, 2008

TL;DR

This paper analyzes the statistical properties of support vector machines (SVMs) using concentration theory and empirical processes, revealing conditions for optimal penalty selection and fast convergence rates.

Contribution

It interprets SVMs as a model selection procedure and derives oracle inequalities, providing new insights into penalty choices and convergence rates.

Findings

01

SVMs can be viewed as a regularization and model selection method.

02

Fast convergence rates for SVMs are achievable under certain conditions.

03

The study compares the actual penalty used in SVMs to the minimal penalty suggested by theory.
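The regularization and model selection readings in the findings above can be made concrete with the standard soft-margin SVM criterion. The display below is a sketch in assumed notation, not taken from this page: a sample $(X_i, Y_i)_{i=1}^n$ with labels $Y_i \in \{-1, +1\}$, a reproducing kernel Hilbert space $\mathcal{H}_k$ with norm $\|\cdot\|_k$, and a regularization parameter $\Lambda_n > 0$. The paper's own notation and normalization may differ.

```latex
% Regularization view: penalized empirical hinge loss (assumed notation).
\[
  \hat{f} \in \operatorname*{arg\,min}_{f \in \mathcal{H}_k}
  \left\{ \frac{1}{n} \sum_{i=1}^{n} \bigl(1 - Y_i f(X_i)\bigr)_{+}
          + \Lambda_n \|f\|_k^2 \right\},
  \qquad (t)_{+} = \max(t, 0).
\]

% Model selection view: the same minimum, rewritten over nested balls
% B(R) = \{ f \in \mathcal{H}_k : \|f\|_k \le R \}, with penalty \Lambda_n R^2.
\[
  \min_{f \in \mathcal{H}_k}
  \left\{ \frac{1}{n} \sum_{i=1}^{n} \bigl(1 - Y_i f(X_i)\bigr)_{+}
          + \Lambda_n \|f\|_k^2 \right\}
  =
  \min_{R \ge 0}
  \left\{ \min_{f \in B(R)} \frac{1}{n} \sum_{i=1}^{n} \bigl(1 - Y_i f(X_i)\bigr)_{+}
          + \Lambda_n R^2 \right\}.
\]
```

Written this way, each ball $B(R)$ plays the role of a model and $\Lambda_n R^2$ the role of its penalty. This is the sense in which the actual SVM penalty can be compared with a minimal penalty suggested by theory.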

Abstract

The support vector machine (SVM) algorithm is well known to the computer learning community for its very good practical results. The goal of the present paper is to study this algorithm from a statistical perspective, using tools of concentration theory and empirical processes. Our main result builds on the observation made by other authors that the SVM can be viewed as a statistical regularization procedure. From this point of view, it can also be interpreted as a model selection principle using a penalized criterion. It is then possible to adapt general methods related to model selection in this framework to study two important points: (1) what is the minimum penalty and how does it compare to the penalty actually used in the SVM algorithm; (2) is it possible to obtain "oracle inequalities" in that setting, for the specific loss function used in the SVM algorithm? We show that the…
