Regularization and Optimal Multiclass Learning
Julian Asilis, Siddartha Devic, Shaddin Dughmi, Vatsal Sharan, Shang-Hua Teng

TL;DR
This paper characterizes the role of regularization in multiclass learning where ERM fails, introducing optimal algorithms that incorporate local regularization and unsupervised learning, and establishing new combinatorial complexity measures.
Contribution
The paper introduces a framework for optimal multiclass learning built on one-inclusion graphs, with learners that use local regularization and an unsupervised learning stage. It also defines the Hall complexity, a combinatorial measure that characterizes error rates.
Findings
Optimal learners can be obtained using local regularization and unsupervised learning.
Hall complexity characterizes transductive error rates exactly.
OIGs are extended to agnostic learning, with a characterization of the resulting error rates.
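For readers unfamiliar with one-inclusion graphs, the sketch below builds one for a small finite set of label patterns. It follows the standard textbook construction (vertices are the distinct labelings of n points, and a hyperedge groups labelings that agree everywhere except at one point) and is not taken from the paper. The function name one_inclusion_graph, the toy patterns, and the choice to keep only hyperedges with at least two vertices are assumptions made for illustration.

```python
from collections import defaultdict


def one_inclusion_graph(behaviors):
    """Build a one-inclusion (hyper)graph from label patterns.

    behaviors: iterable of equal-length tuples, each the labeling that some
    hypothesis assigns to the same n unlabeled points (hypothetical input).
    Returns (vertices, edges), where edges maps (i, rest) to the list of
    vertices that differ only in coordinate i and share `rest` elsewhere.
    """
    vertices = sorted(set(tuple(b) for b in behaviors))
    groups = defaultdict(list)
    for v in vertices:
        for i in range(len(v)):
            # Key: the direction i plus the labels on all other coordinates.
            groups[(i, v[:i] + v[i + 1:])].append(v)
    # Assumption: drop singleton groups, keeping hyperedges of size >= 2.
    edges = {key: vs for key, vs in groups.items() if len(vs) >= 2}
    return vertices, edges


# Toy class on two points with three possible labels (illustrative only).
patterns = [(0, 0), (0, 1), (1, 1), (2, 1)]
vertices, edges = one_inclusion_graph(patterns)
for (direction, rest), members in sorted(edges.items()):
    print(direction, rest, members)
```

With more than two labels a hyperedge can contain more than two vertices, as the edge in direction 0 does here. This is why the multiclass setting works with hypergraphs rather than ordinary graphs.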
Abstract
The quintessential learning algorithm of empirical risk minimization (ERM) is known to fail in various settings for which uniform convergence does not characterize learning. It is therefore unsurprising that the practice of machine learning is rife with considerably richer algorithmic techniques for successfully controlling model capacity. Nevertheless, no such technique or principle has broken away from the pack to characterize optimal learning in these more general settings. The purpose of this work is to characterize the role of regularization in perhaps the simplest setting for which ERM fails: multiclass learning with arbitrary label sets. Using one-inclusion graphs (OIGs), we exhibit optimal learning algorithms that dovetail with tried-and-true algorithmic principles: Occam's Razor as embodied by structural risk minimization (SRM), the principle of maximum entropy, and Bayesian…
Taxonomy
Topics: Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification
Methods: fail
