Forster Decomposition and Learning Halfspaces with Noise
Ilias Diakonikolas, Daniel M. Kane, Christos Tzamos

TL;DR
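This paper shows that any distribution can be efficiently decomposed into a disjoint mixture of a few distributions, each of which admits an efficiently computable Forster transform. Its main application is the first polynomial-time algorithm for distribution-independent PAC learning of halfspaces with Massart noise whose sample complexity is strongly polynomial, i.e., independent of the bit complexity of the examples.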
Contribution
It provides a decomposition of arbitrary distributions into components that admit Forster transforms, and applies it to obtain a polynomial-time PAC learning algorithm for halfspaces under Massart noise with strongly polynomial sample complexity.
Findings
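Any distribution can be efficiently decomposed into a disjoint mixture of few distributions, each admitting an efficiently computable Forster transform.
First polynomial-time, distribution-independent PAC learning algorithm for halfspaces with Massart noise whose sample complexity is strongly polynomial, i.e., independent of the bit complexity of the examples.
Previous algorithms for this problem incurred sample complexity scaling polynomially with the bit complexity, even though this dependence is not information-theoretically necessary.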
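For readers unfamiliar with the noise model named above, here is a minimal sketch of how Massart-noisy labels for a halfspace can be simulated. It assumes the standard definition (each label is flipped independently with a point-dependent probability eta(x) that never exceeds a bound eta_max < 1/2). The function name massart_labels and its parameters are hypothetical and chosen for illustration; nothing here is taken from the paper.

```python
import numpy as np

def massart_labels(X, w, eta_fn, eta_max, rng):
    """Label points by the halfspace sign(<w, x>), then flip each label
    independently with probability eta(x) <= eta_max < 1/2.

    eta_fn is a hypothetical, caller-supplied function returning one flip
    probability per row of X; values are clipped into [0, eta_max].
    """
    if not 0.0 <= eta_max < 0.5:
        raise ValueError("Massart noise requires 0 <= eta_max < 1/2")
    clean = np.where(X @ w >= 0, 1, -1)              # noiseless halfspace labels
    eta = np.clip(eta_fn(X), 0.0, eta_max)           # enforce the Massart bound
    flips = rng.random(X.shape[0]) < eta             # independent flips per point
    return np.where(flips, -clean, clean)
```

The flip probability may vary arbitrarily from point to point, which is what separates this model from uniform random classification noise.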
Abstract
A Forster transform is an operation that turns a distribution into one with good anti-concentration properties. While a Forster transform does not always exist, we show that any distribution can be efficiently decomposed as a disjoint mixture of few distributions for which a Forster transform exists and can be computed efficiently. As the main application of this result, we obtain the first polynomial-time algorithm for distribution-independent PAC learning of halfspaces in the Massart noise model with strongly polynomial sample complexity, i.e., independent of the bit complexity of the examples. Previous algorithms for this learning problem incurred sample complexity scaling polynomially with the bit complexity, even though such a dependence is not information-theoretically necessary.
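To make the notion of a Forster transform concrete, the sketch below assumes the usual definition: an invertible matrix A such that, after mapping each point x to Ax/||Ax||, the second-moment matrix equals I/d. The code uses a simple fixed-point (repeated whitening) heuristic on a finite sample. This is not the algorithm from the paper, and it can fail when no Forster transform exists, for example when too many points lie in a low-dimensional subspace. The names approx_forster_transform and normalized_second_moment are hypothetical.

```python
import numpy as np

def approx_forster_transform(X, iters=500, tol=1e-9):
    """Heuristic search for A with mean_i (A x_i)(A x_i)^T / ||A x_i||^2 ~= I/d.

    X is an (n, d) array of nonzero points. Assumes n > d and points in
    general position; no convergence guarantee is claimed otherwise.
    """
    n, d = X.shape
    A = np.eye(d)
    for _ in range(iters):
        Y = X @ A.T
        Y /= np.linalg.norm(Y, axis=1, keepdims=True)   # project onto unit sphere
        M = d * (Y.T @ Y) / n                           # equals I when isotropic
        if np.linalg.norm(M - np.eye(d)) < tol:
            break
        # Whiten the normalized points: replace A by M^{-1/2} A.
        evals, evecs = np.linalg.eigh(M)
        A = (evecs * evals ** -0.5) @ evecs.T @ A
    return A

def normalized_second_moment(X, A):
    """Second-moment matrix of the points A x / ||A x||."""
    Y = X @ A.T
    Y /= np.linalg.norm(Y, axis=1, keepdims=True)
    return Y.T @ Y / X.shape[0]
```

Calling normalized_second_moment(X, approx_forster_transform(X)) on badly scaled data should return a matrix close to I/d. In that position no direction carries much more than a 1/d share of the mass, which is the anti-concentration property the abstract refers to.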
Taxonomy
Topics: Machine Learning and Algorithms · Face and Expression Recognition · Metaheuristic Optimization Algorithms Research
