Statistical Query Hardness of Multiclass Linear Classification with Random Classification Noise
Ilias Diakonikolas, Mingchen Ma, Lisheng Ren, Christos Tzamos

TL;DR
This paper studies the computational complexity of multiclass linear classification with random classification noise. It proves super-polynomial statistical query (SQ) lower bounds for three or more labels, which is evidence that the problem is inherently hard in this setting.
Contribution
The paper establishes the first super-polynomial statistical query lower bounds for multiclass linear classification with noise when there are three or more labels, highlighting a fundamental complexity barrier.
Findings
Super-polynomial SQ lower bounds for three-label classification with noise.
Hardness results extend to larger label sets and smaller separation.
Achieving the optimal error is hard for SQ algorithms in the noisy multiclass setting.
Abstract
We study the task of Multiclass Linear Classification (MLC) in the distribution-free PAC model with Random Classification Noise (RCN). Specifically, the learner is given a set of labeled examples $(x, y)$, where $x$ is drawn from an unknown distribution on $\mathbb{R}^d$ and the labels are generated by a multiclass linear classifier corrupted with RCN. That is, the label $y$ is flipped from $i$ to $j$ with probability $H_{ij}$ according to a known noise matrix $H$ with non-negative separation $\sigma := \min_{i \neq j} (H_{ii} - H_{ij})$. The goal is to compute a hypothesis with small 0-1 error. For the special case of two labels, prior work has given polynomial-time algorithms achieving the optimal error. Surprisingly, little is known about the complexity of this task even for three labels. As our main contribution, we show that the complexity of MLC with RCN becomes drastically different in the…
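The data model described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the names `H`, `W`, `predict`, `separation`, `sample`, and `zero_one_error`, the example noise matrix, the Gaussian choice for the distribution of x (the model itself is distribution-free), and the reading of the separation as the minimum over i ≠ j of H[i][i] - H[i][j].

```python
import random

def predict(W, x):
    """Multiclass linear classifier: return the index i maximizing <W[i], x>."""
    scores = [sum(wi * xi for wi, xi in zip(w, x)) for w in W]
    return max(range(len(W)), key=lambda i: scores[i])

def separation(H):
    """Assumed definition: min over i != j of H[i][i] - H[i][j]."""
    k = len(H)
    return min(H[i][i] - H[i][j] for i in range(k) for j in range(k) if j != i)

def noisy_label(clean, H, rng):
    """Random classification noise: clean label i becomes j with probability H[i][j]."""
    return rng.choices(range(len(H)), weights=H[clean], k=1)[0]

def sample(W, H, n, rng):
    """Draw n labeled examples (x, y); x is Gaussian here purely for illustration."""
    d = len(W[0])
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        data.append((x, noisy_label(predict(W, x), H, rng)))
    return data

def zero_one_error(W_hyp, data):
    """Fraction of examples whose observed (noisy) label differs from the prediction."""
    return sum(predict(W_hyp, x) != y for x, y in data) / len(data)

# Hypothetical 3-label instance: each row of H is a distribution over observed labels.
H = [[0.6, 0.2, 0.2],
     [0.1, 0.7, 0.2],
     [0.3, 0.2, 0.5]]
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]  # one weight vector per label

rng = random.Random(0)
data = sample(W, H, 1000, rng)
sigma = separation(H)
err = zero_one_error(W, data)
```

In this toy instance the separation is positive, so the clean label is always the single most likely observed label. Even the ground-truth classifier `W` has nonzero 0-1 error on the noisy sample, because the error is measured against the corrupted labels.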
Taxonomy
Topics: Face and Expression Recognition · Imbalanced Data Classification Techniques · Advanced Statistical Methods and Models
Methods: Sparse Evolutionary Training
