Corruption Robust Active Learning

Yifang Chen; Simon S. Du; Kevin Jamieson

arXiv:2106.11220·cs.LG·June 22, 2021

TL;DR

This paper studies streaming active learning for binary classification under label corruptions, proposing algorithms that are robust to adversarial corruptions and nearly optimal in label complexity.

Contribution

It introduces a new algorithm that is provably correct under any corruption setting and achieves near-minimax label complexity with minimal additional labels.

Findings

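01. With a slightly enlarged hypothesis elimination threshold, classical RobustCAL achieves nearly the same label complexity under benign corruptions as in the non-corrupted setting.

02. The new algorithm is provably correct without any assumptions on the presence of corruptions.

03. The new algorithm achieves near-optimal label complexity in both corrupted and non-corrupted settings.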

Abstract

We conduct theoretical studies on streaming-based active learning for binary classification under unknown adversarial label corruptions. In this setting, every time before the learner observes a sample, the adversary decides whether to corrupt the label or not. First, we show that, in a benign corruption setting (which includes the misspecification setting as a special case), with a slight enlargement on the hypothesis elimination threshold, the classical RobustCAL framework can (surprisingly) achieve nearly the same label complexity guarantee as in the non-corrupted setting. However, this algorithm can fail in the general corruption setting. To resolve this drawback, we propose a new algorithm which is provably correct without any assumptions on the presence of corruptions. Furthermore, this algorithm enjoys the minimax label complexity in the non-corrupted setting (which is achieved…
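The elimination-based procedure the abstract refers to can be illustrated with a minimal sketch of disagreement-based streaming active learning. This is not the paper's RobustCAL variant or its new algorithm. The 1-D threshold hypothesis class, the slack formula, the `extra_slack` parameter (standing in for the "slight enlargement" of the elimination threshold), and the random label flips (standing in for an adversary) are all assumptions made for illustration.

```python
import math
import random


def make_thresholds(k):
    """Hypothetical finite class: h_j(x) = 1 if x >= j/k else 0, for j = 0..k."""
    return [j / k for j in range(k + 1)]


def predict(threshold, x):
    return 1 if x >= threshold else 0


def make_stream(n, true_threshold, flip_prob=0.0, seed=0):
    """Build n (x, oracle) pairs; oracle() returns the (possibly flipped) label.

    Random flips are only a stand-in for corruption; the paper's adversary
    is not modelled here.
    """
    rng = random.Random(seed)
    stream = []
    for _ in range(n):
        x = rng.random()
        y = predict(true_threshold, x)
        if rng.random() < flip_prob:
            y = 1 - y
        # Bind y as a default argument so each oracle keeps its own label.
        stream.append((x, lambda y=y: y))
    return stream


def streaming_cal(stream, thresholds, extra_slack=0.0, delta=0.05):
    """Simplified elimination loop: query only where active hypotheses disagree.

    The slack term is an assumed confidence width, not the paper's threshold.
    """
    active = list(thresholds)
    errors = {h: 0 for h in active}
    queries = 0
    for t, (x, oracle) in enumerate(stream, start=1):
        if len({predict(h, x) for h in active}) == 1:
            continue  # all surviving hypotheses agree, so no label is requested
        y = oracle()
        queries += 1
        for h in active:
            if predict(h, x) != y:
                errors[h] += 1
        best = min(errors[h] for h in active)
        slack = math.sqrt(queries * math.log(len(thresholds) * t / delta))
        slack += extra_slack  # enlarging this makes elimination more cautious
        active = [h for h in active if errors[h] - best <= slack]
    return active, queries


if __name__ == "__main__":
    hypotheses = make_thresholds(20)
    clean = make_stream(2000, true_threshold=0.5)
    noisy = make_stream(2000, true_threshold=0.5, flip_prob=0.05, seed=1)
    for name, stream in [("clean", clean), ("noisy", noisy)]:
        active, queries = streaming_cal(stream, hypotheses, extra_slack=10.0)
        print(name, "queries:", queries, "surviving thresholds:", active)
```

In this toy setup, a larger `extra_slack` keeps hypotheses alive longer, which costs extra label queries but makes it harder for a few flipped labels to eliminate the best hypothesis. That is the trade-off the abstract alludes to, shown here only qualitatively.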


Videos

Corruption Robust Active Learning · SlidesLive

Taxonomy

Topics: Machine Learning and Algorithms · Imbalanced Data Classification Techniques · Computability, Logic, AI Algorithms