Robust Online Classification: From Estimation to Denoising

Changlong Wu; Ananth Grama; Wojciech Szpankowski

arXiv:2309.01698 · cs.LG · September 27, 2024 · 1 citation

TL;DR

This paper develops a theoretical framework for online classification with noisy labels. It gives minimax risk bounds, tight up to a logarithmic factor of the hypothesis class size, for a setting with adversarially generated features and stochastic label noise.

Contribution

It introduces a new characterization of minimax risk in noisy online classification using the Hellinger gap and develops a reduction to hypothesis comparison with a novel testing method.

Findings

01. Minimax risk is tightly characterized by the Hellinger gap of the noisy label distributions.

02. The approach is robust to adversarial features and stochastic label noise.

03. The paper provides the first comprehensive guarantees for noisy online classification.

Abstract

We study online classification of features into labels with general hypothesis classes. In our setting, true labels are determined by some function within the hypothesis class but are corrupted by unknown stochastic noise, and the features are generated adversarially. Predictions are made using observed noisy labels and noiseless features, while the performance is measured via minimax risk when comparing against true labels. The noise mechanism is modeled via a general noise kernel that specifies, for any individual data point, a set of distributions from which the actual noisy label distribution is chosen. We show that minimax risk is tightly characterized (up to a logarithmic factor of the hypothesis class size) by the Hellinger gap of the noisy label distributions induced by the kernel, independent of other properties such as the means and variances of the noise. Our main technique…
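To make the role of the Hellinger distance concrete, here is a minimal sketch that computes the squared Hellinger distance between the noisy label distributions induced by two different true labels. The convention H^2(p, q) = sum_i (sqrt(p_i) - sqrt(q_i))^2, the helper names `squared_hellinger` and `flip_noise`, and the symmetric label-flipping noise are all assumptions chosen for illustration. The excerpt above does not define the Hellinger gap itself, nor how it is taken over the sets of distributions that the kernel specifies, so this is not the paper's definition.

```python
import math


def squared_hellinger(p, q):
    """Squared Hellinger distance between two discrete distributions.

    Assumed convention: H^2(p, q) = sum_i (sqrt(p_i) - sqrt(q_i))^2,
    which lies in [0, 2]. Some texts include an extra factor of 1/2.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same support size")
    return sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))


def flip_noise(true_label, eta):
    """Hypothetical noise model: a binary label is flipped with probability eta.

    Returns the noisy label distribution as [P(noisy = 0), P(noisy = 1)].
    """
    dist = [eta, eta]
    dist[true_label] = 1.0 - eta
    return dist


# Compare the noisy label distributions under true label 0 and true label 1.
for eta in (0.0, 0.25, 0.5):
    d = squared_hellinger(flip_noise(0, eta), flip_noise(1, eta))
    print(f"eta={eta}: squared Hellinger distance = {d:.4f}")
```

Under this toy model the distance shrinks from 2 (no noise) to 0 (flip rate 0.5), where the two true labels yield the same noisy label distribution and so cannot be told apart from noisy observations.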

Taxonomy

Topics: Data Stream Mining Techniques · Advanced Bandit Algorithms Research · Misinformation and Its Impacts