Sample Complexity of Agnostic Multiclass Classification: Natarajan Dimension Strikes Back
Alon Cohen, Liad Erez, Steve Hanneke, Tomer Koren, Yishay Mansour, Shay Moran, Qian Zhang

TL;DR
This paper shows that the sample complexity of agnostic multiclass classification depends on two dimensions, the DS dimension and the Natarajan dimension. It proves nearly tight bounds and highlights a fundamental difference from binary classification, where a single dimension suffices.
Contribution
The paper proves a new sample complexity bound for agnostic multiclass PAC learning that involves both the DS and Natarajan dimensions, and shows that both parameters are necessary.
Findings
Sample complexity bounds involve DS and Natarajan dimensions.
The bounds are nearly tight up to logarithmic factors.
Multiclass learning inherently involves two structural parameters.
Abstract
The fundamental theorem of statistical learning states that binary PAC learning is governed by a single parameter -- the Vapnik-Chervonenkis (VC) dimension -- which determines both learnability and sample complexity. Extending this to multiclass classification has long been challenging, since Natarajan's work in the late 80s proposing the Natarajan dimension (Nat) as a natural analogue of VC. Daniely and Shalev-Shwartz (2014) introduced the DS dimension, later shown by Brukhim et al. (2022) to characterize multiclass learnability. Brukhim et al. also showed that Nat and DS can diverge arbitrarily, suggesting that multiclass learning is governed by DS rather than Nat. We show that agnostic multiclass PAC sample complexity is in fact governed by two distinct dimensions. Specifically, we prove nearly tight agnostic sample complexity bounds that, up to log factors, take the form…
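For readers unfamiliar with the Natarajan dimension mentioned in the abstract, the following is a minimal brute-force sketch of its standard definition for a finite hypothesis class over a finite domain. It is not taken from the paper: the function names `is_natarajan_shattered` and `natarajan_dimension` and the encoding of hypotheses as tuples of labels are assumptions made for illustration. A set S is Natarajan-shattered if there are two labelings f0 and f1 that disagree on every point of S such that, for every subset T of S, some hypothesis agrees with f0 on T and with f1 on the rest of S.

```python
from itertools import combinations, product


def is_natarajan_shattered(hypotheses, points):
    """Check whether `points` (a tuple of domain indices) is Natarajan-shattered.

    `hypotheses` is an iterable of tuples; hypotheses[h][i] is the label that
    hypothesis h assigns to domain point i (an illustrative encoding).
    """
    # Restrict every hypothesis to the candidate points.
    patterns = {tuple(h[i] for i in points) for h in hypotheses}
    # The witnesses f0 and f1 must themselves be realized patterns
    # (take T = S and T = empty set), so it suffices to search over them.
    for f0, f1 in product(patterns, repeat=2):
        # Witnesses must disagree on every point.
        if any(a == b for a, b in zip(f0, f1)):
            continue
        # Every mix of f0 and f1 must be realized by some hypothesis.
        if all(
            tuple(f0[j] if pick else f1[j] for j, pick in enumerate(mask)) in patterns
            for mask in product((True, False), repeat=len(points))
        ):
            return True
    return False


def natarajan_dimension(hypotheses, domain_size):
    """Return the size of the largest Natarajan-shattered subset of the domain."""
    hypotheses = list(hypotheses)
    dim = 0
    for k in range(1, domain_size + 1):
        # Shattering is monotone under taking subsets, so stop at the first failure.
        if any(
            is_natarajan_shattered(hypotheses, pts)
            for pts in combinations(range(domain_size), k)
        ):
            dim = k
        else:
            break
    return dim


# Three constant functions over a 3-point domain with labels {0, 1, 2}.
constants = [(0, 0, 0), (1, 1, 1), (2, 2, 2)]
# All functions from a 2-point domain to labels {0, 1, 2}.
all_functions = list(product(range(3), repeat=2))
```

With two labels this check reduces to ordinary VC shattering. The search is exponential in the size of the candidate set, so it is only practical for toy classes such as `constants` and `all_functions` above.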
Taxonomy
Topics: Machine Learning and Algorithms · Stochastic Gradient Optimization Techniques · Imbalanced Data Classification Techniques
