Learning curves for Soft Margin Classifiers
Sebastian Risau-Gusman, Mirta B. Gordon

TL;DR
This paper analytically studies the learning curves of Soft Margin Classifiers (SMCs) on both realizable and unrealizable tasks, showing how their generalization errors decay with training set size and how tuning the hyperparameter that trades off error against regularization yields the best performance.
Contribution
It derives analytical expressions for SMC learning curves using Statistical Mechanics, highlighting the impact of geometrical properties and hyperparameter tuning on generalization performance.
Findings
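Fine tuning the hyperparameter that trades off the error and regularization terms of the cost function yields optimal generalization curves.
With that tuning, SMCs outperform hard margin SVMs learning the same rule, even when the task is realizable, and come very close to the Bayesian classifier.
Generalization errors decay towards their asymptotic values with different laws, depending on general geometrical characteristics of the rule to be learned.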
Abstract
Typical learning curves for Soft Margin Classifiers (SMCs) learning both realizable and unrealizable tasks are determined using the tools of Statistical Mechanics. We derive the analytical behaviour of the learning curves in the regimes of small and large training sets. The generalization errors present different decay laws towards the asymptotic values as a function of the training set size, depending on general geometrical characteristics of the rule to be learned. Optimal generalization curves are deduced through a fine tuning of the hyperparameter controlling the trade-off between the error and the regularization terms in the cost function. Even if the task is realizable, the optimal performance of the SMC is better than that of a hard margin Support Vector Machine (SVM) learning the same rule, and is very close to that of the Bayesian classifier.
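To make the trade-off mentioned in the abstract concrete, here is a minimal sketch of a soft margin cost function. The paper's exact cost function is not given in this summary, so the code assumes the textbook hinge-loss form, 0.5*||w||^2 + C * sum_i max(0, 1 - y_i w·x_i). The names `soft_margin_cost`, `w`, `X`, `y` and `C` are illustrative, not taken from the paper.

```python
def soft_margin_cost(w, X, y, C):
    """Textbook soft margin cost (an assumption, not the paper's exact form).

    w: weight vector (list of floats)
    X: list of input vectors
    y: list of labels, each +1 or -1
    C: hyperparameter weighting the error term against the regularizer
    """
    # Regularization term: half the squared norm of the weights.
    regularization = 0.5 * sum(wi * wi for wi in w)

    # Error term: hinge loss, nonzero only for examples with margin below 1.
    error = 0.0
    for xi, yi in zip(X, y):
        margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
        error += max(0.0, 1.0 - margin)

    return regularization + C * error


# Toy data: three one-dimensional examples embedded in two dimensions.
w = [1.0, 0.0]
X = [[2.0, 0.0], [0.5, 0.0], [-1.0, 0.0]]
y = [1, 1, -1]

for C in (0.0, 1.0, 2.0):
    print(C, soft_margin_cost(w, X, y, C))
```

In this form, a larger C penalizes margin violations more heavily, so the classifier approaches a hard margin SVM as C grows. A smaller C tolerates violations in exchange for a smaller weight norm.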
Taxonomy
Topics: Neural Networks and Applications · Fault Detection and Control Systems · Control Systems and Identification
