Corruption Robust Active Learning
Yifang Chen, Simon S. Du, Kevin Jamieson

TL;DR
This paper studies streaming active learning for binary classification under label corruptions, proposing algorithms that are robust to adversarial corruptions and nearly optimal in label complexity.
Contribution
It introduces a new algorithm that is provably correct without any assumptions on the corruptions and achieves near-minimax label complexity while requesting only a small number of additional labels.
Findings
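With a slightly enlarged hypothesis elimination threshold, classical RobustCAL achieves nearly the same label complexity under benign corruptions as in the non-corrupted setting.
RobustCAL can fail under general corruptions, whereas the new algorithm is provably correct without any assumptions on the corruptions.
The new algorithm achieves near-optimal label complexity in both the corrupted and non-corrupted settings.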
Abstract
We conduct theoretical studies on streaming-based active learning for binary classification under unknown adversarial label corruptions. In this setting, every time before the learner observes a sample, the adversary decides whether to corrupt the label or not. First, we show that, in a benign corruption setting (which includes the misspecification setting as a special case), with a slight enlargement on the hypothesis elimination threshold, the classical RobustCAL framework can (surprisingly) achieve nearly the same label complexity guarantee as in the non-corrupted setting. However, this algorithm can fail in the general corruption setting. To resolve this drawback, we propose a new algorithm which is provably correct without any assumptions on the presence of corruptions. Furthermore, this algorithm enjoys the minimax label complexity in the non-corrupted setting (which is achieved…
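To make the idea of an elimination threshold concrete, here is a minimal sketch of a disagreement-based (CAL-style) streaming learner. It is not the paper's algorithm: the 1-D threshold hypothesis class, the Hoeffding-style width, the `slack` parameter, and the helper `make_stream` (whose toy adversary flips the first few requested labels) are all assumptions chosen for illustration. The learner requests a label only when the surviving hypotheses disagree, and it drops a hypothesis once its error count exceeds the best survivor's by more than the threshold; a larger `slack` makes elimination more conservative.

```python
import math
import random


def cal_threshold_learner(stream, thresholds, slack=0.0, delta=0.05):
    """Toy disagreement-based active learner over 1-D threshold classifiers.

    stream     : iterable of (x, oracle) pairs; oracle() returns a 0/1 label
    thresholds : candidate values t, each predicting 1 if x >= t else 0
    slack      : hypothetical extra tolerance added to the elimination threshold
    Returns (surviving thresholds, number of labels requested).
    """
    active = list(thresholds)
    errors = {t: 0 for t in active}  # mistakes on queried points only
    queried = 0
    for x, oracle in stream:
        preds = {t: int(x >= t) for t in active}
        if len(set(preds.values())) == 1:
            continue  # all survivors agree on x, so no label is requested
        y = oracle()
        queried += 1
        for t in active:
            errors[t] += int(preds[t] != y)
        best = min(errors[t] for t in active)
        # Illustrative threshold: Hoeffding-style term plus a linear slack term.
        width = math.sqrt(queried * math.log(2 * len(thresholds) / delta))
        width += slack * queried
        active = [t for t in active if errors[t] - best <= width]
    return active, queried


def make_stream(n, true_threshold, flip_first=0, seed=0):
    """Hypothetical data source: uniform x in [0, 1); the first `flip_first`
    requested labels are flipped to mimic a simple adversary."""
    rng = random.Random(seed)
    state = {"asked": 0}

    def oracle_for(x):
        def oracle():
            state["asked"] += 1
            y = int(x >= true_threshold)
            return 1 - y if state["asked"] <= flip_first else y
        return oracle

    return [(x, oracle_for(x)) for x in (rng.random() for _ in range(n))]


candidates = [i / 20 for i in range(21)]
survivors, labels_used = cal_threshold_learner(
    make_stream(500, 0.5, flip_first=5), candidates, slack=0.0
)
print(len(survivors), labels_used)
```

In this toy setup the true threshold survives a handful of flipped labels because the width already exceeds the number of flips; an adversary with a larger budget could push it out, which is the kind of failure the abstract attributes to RobustCAL under general corruptions.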
Taxonomy
Topics: Machine Learning and Algorithms · Imbalanced Data Classification Techniques · Computability, Logic, AI Algorithms
