Self-Training of Halfspaces with Generalization Guarantees under Massart Mislabeling Noise Model

Lies Hadjadj; Massih-Reza Amini; Sana Louhichi; Alexis Deschamps

arXiv:2111.14427 · cs.LG · February 16, 2022 · 1 citation

TL;DR

This paper presents a semi-supervised self-training algorithm for halfspaces with theoretical generalization guarantees under the Massart noise model, demonstrating improved efficiency over existing methods.

Contribution

It introduces a novel self-training algorithm with exploration and pruning phases, providing theoretical bounds and performance guarantees under label noise.

Findings

01. Bounded misclassification error for the sequence of classifiers

02. Guarantee that performance does not degrade relative to training on the initial labeled data alone

03. Empirical results show superior efficiency over state-of-the-art methods

Abstract

We investigate the generalization properties of a self-training algorithm with halfspaces. The approach learns a list of halfspaces iteratively from labeled and unlabeled training data, in which each iteration consists of two steps: exploration and pruning. In the exploration phase, the halfspace is found sequentially by maximizing the unsigned-margin among unlabeled examples and then assigning pseudo-labels to those that have a distance higher than the current threshold. The pseudo-labeled examples are then added to the training set, and a new classifier is learned. This process is repeated until no more unlabeled examples remain for pseudo-labeling. In the pruning phase, pseudo-labeled samples that have a distance to the last halfspace greater than the associated unsigned-margin are then discarded. We prove that the misclassification error of the resulting sequence of classifiers is…
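The exploration and pruning steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: `fit_halfspace` is a stand-in learner (difference of class means) rather than the unsigned-margin maximization the paper uses, `threshold` is a fixed hypothetical parameter rather than the paper's evolving threshold, and `prune` is a literal reading of the pruning rule quoted above. All function and parameter names are hypothetical.

```python
import math

def signed_distance(w, x):
    """Signed distance <w, x> of x to a unit-norm halfspace through the origin."""
    return sum(wi * xi for wi, xi in zip(w, x))

def fit_halfspace(X, y):
    """Stand-in learner (assumption, not the paper's): unit vector pointing from
    the negative-class mean to the positive-class mean. Needs both classes."""
    dim = len(X[0])
    pos = [x for x, label in zip(X, y) if label > 0]
    neg = [x for x, label in zip(X, y) if label <= 0]
    w = [sum(p[j] for p in pos) / len(pos) - sum(n[j] for n in neg) / len(neg)
         for j in range(dim)]
    norm = math.sqrt(sum(c * c for c in w))
    return [c / norm for c in w]

def self_train(X_lab, y_lab, X_unl, threshold):
    """Exploration: pseudo-label unlabeled points whose unsigned margin exceeds
    `threshold`, add them to the training set, retrain, and repeat."""
    X_train, y_train = list(X_lab), list(y_lab)
    remaining = list(X_unl)
    pseudo_labeled, halfspaces = [], []
    while True:
        w = fit_halfspace(X_train, y_train)
        halfspaces.append(w)
        confident = [x for x in remaining if abs(signed_distance(w, x)) > threshold]
        if not confident:
            break  # nothing left that can be pseudo-labeled
        for x in confident:
            label = 1 if signed_distance(w, x) > 0 else -1
            X_train.append(x)
            y_train.append(label)
            pseudo_labeled.append((x, label))
        remaining = [x for x in remaining if abs(signed_distance(w, x)) <= threshold]
    return halfspaces, pseudo_labeled, remaining

def prune(w, pseudo_labeled, margin):
    """Pruning (literal reading): discard pseudo-labeled samples whose distance
    to the last halfspace `w` is greater than `margin`."""
    return [(x, label) for x, label in pseudo_labeled
            if abs(signed_distance(w, x)) <= margin]

# Toy usage with made-up data.
X_lab, y_lab = [[1.0, 0.0], [-1.0, 0.0]], [1, -1]
X_unl = [[2.0, 1.0], [-3.0, 0.5], [0.1, 0.0]]
halfspaces, pseudo, remaining = self_train(X_lab, y_lab, X_unl, threshold=0.5)
kept = prune(halfspaces[-1], pseudo, margin=2.5)
```

On this toy data, the point close to the decision boundary is never pseudo-labeled, so the loop stops after the second halfspace is learned. A fixed threshold keeps the sketch short; the abstract refers to a "current threshold", which suggests it changes across iterations.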

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Face and Expression Recognition

Methods: Pruning