Learning Noisy Halfspaces with a Margin: Massart is No Harder than Random

Gautam Chandrasekaran; Vasilis Kontonis; Konstantinos Stavropoulos; Kevin Tian

arXiv:2501.09851·cs.LG·January 20, 2025

TL;DR

This paper introduces the Perspectron, a simple proper algorithm for PAC learning $\gamma$-margin halfspaces with Massart noise using $\tilde{O}((\epsilon\gamma)^{-2})$ samples, and extends the result to generalized linear models with a known link function.

Contribution

The paper presents a simple proper learning algorithm with improved sample complexity for Massart noise, extending results to generalized linear models under the same noise conditions.

Findings

01

Perspectron achieves $\tilde{O}((\epsilon\gamma)^{-2})$ sample complexity.

02

The method learns $\gamma$-margin halfspaces with Massart noise to classification error at most $\eta + \epsilon$, where $\eta$ is the Massart noise rate.

03

Results extend to generalized linear models with a known link function under Massart noise, with sample complexity similar to the halfspace case.

Abstract

We study the problem of PAC learning $\gamma$-margin halfspaces with Massart noise. We propose a simple proper learning algorithm, the Perspectron, that has sample complexity $\tilde{O}((\epsilon\gamma)^{-2})$ and achieves classification error at most $\eta + \epsilon$, where $\eta$ is the Massart noise rate. Prior works [DGT19,CKMY20] came with worse sample complexity guarantees (in both $\epsilon$ and $\gamma$) or could only handle random classification noise [DDK+23,KIT+23], a much milder noise assumption. We also show that our results extend to the more challenging setting of learning generalized linear models with a known link function under Massart noise, achieving a similar sample complexity to the halfspace case. This significantly improves upon the prior state-of-the-art in this setting due to [CKMY20], who introduced this model.
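To make the problem setting concrete, the following is a minimal sketch of a data generator for $\gamma$-margin halfspaces with Massart noise. It is not the Perspectron, which this page does not describe. It relies on the standard definition of Massart noise: each label is flipped independently with a point-dependent probability $\eta(x) \le \eta < 1/2$, and random classification noise is the special case where $\eta(x)$ is constant. The function name `sample_massart_halfspace`, the target `w_star`, and the particular choice of $\eta(x)$ are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def sample_massart_halfspace(n, d, gamma, eta, seed=0):
    """Draw n unit-norm points with margin >= gamma and Massart-noisy labels."""
    assert d >= 2 and 0 < gamma < 1 and 0 <= eta < 0.5
    rng = np.random.default_rng(seed)
    w_star = np.zeros(d)
    w_star[0] = 1.0  # hypothetical unit-norm target halfspace

    # Rejection sampling: keep unit vectors whose margin |<w_star, x>| >= gamma.
    points = []
    while len(points) < n:
        x = rng.normal(size=d)
        x /= np.linalg.norm(x)
        if abs(x @ w_star) >= gamma:
            points.append(x)
    X = np.array(points)

    clean = np.sign(X @ w_star)  # noiseless labels in {-1, +1}

    # Illustrative point-dependent flip rate eta(x) = eta * |x_2| <= eta.
    # Under Massart noise any eta(x) <= eta is allowed; this one is an assumption.
    flip_prob = eta * np.abs(X[:, 1])
    flips = rng.uniform(size=n) < flip_prob
    y = np.where(flips, -clean, clean)
    return X, y, clean, w_star

X, y, clean, w_star = sample_massart_halfspace(n=2000, d=5, gamma=0.1, eta=0.2)
# Fraction of flipped labels, which is also the error of w_star on the noisy sample.
print(np.mean(y != clean))
```

Because the flip rate varies with $x$, the error of the target halfspace on noisy data can fall anywhere up to $\eta$, which is why the guarantee in the abstract is stated relative to the bound $\eta$ as error at most $\eta + \epsilon$.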

Videos

Learning Noisy Halfspaces with a Margin: Massart is No Harder than Random · SlidesLive

Taxonomy

Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Text and Document Classification Technologies