Learning Multiclass Classifier Under Noisy Bandit Feedback

Mudit Agarwal; Naresh Manwani

arXiv:2006.03545·cs.LG·June 8, 2021

TL;DR

This paper introduces a new method for multiclass classification with noisy bandit feedback, using unbiased estimators and noise rate estimation to improve learning under corruption.

Contribution

It presents a novel unbiased estimator-based approach and an efficient noise rate estimation method for multiclass bandit learning with noisy feedback.

Findings

01 Mistake bound of O(√T) in the high noise case

02 Mistake bound of O(T^{2/3}) in the worst case

03 Effective performance demonstrated on several benchmark datasets

Abstract

This paper addresses the problem of multiclass classification with corrupted or noisy bandit feedback. In this setting, the learner may not receive the true feedback; instead, it receives feedback that has been flipped with some non-zero probability. We propose a novel approach to deal with noisy bandit feedback based on the unbiased estimator technique. We further offer a method that can efficiently estimate the noise rates, thus providing an end-to-end framework. The proposed algorithm enjoys a mistake bound of the order of $O(\sqrt{T})$ in the high noise case and of the order of $O(T^{2/3})$ in the worst case. We show our approach's effectiveness using extensive experiments on several benchmark datasets.
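
The idea behind an unbiased estimator for flipped feedback can be illustrated with a minimal sketch. It is not the paper's estimator: it assumes a single binary feedback bit, known flip rates, and the standard correction (observed - rho0) / (1 - rho0 - rho1). The names `debias_feedback`, `flip`, `mean_debiased`, `rho0`, and `rho1` are hypothetical and chosen only for this example.

```python
import random


def debias_feedback(observed, rho0, rho1):
    """Unbiased estimate of a true binary feedback bit from its noisy copy.

    Assumed noise model (an illustration, not taken from the paper):
      rho0 = P(observed = 1 | true = 0)
      rho1 = P(observed = 0 | true = 1)
    """
    denom = 1.0 - rho0 - rho1
    if abs(denom) < 1e-12:
        raise ValueError("rho0 + rho1 must differ from 1")
    return (observed - rho0) / denom


def flip(true_bit, rho0, rho1, rng):
    """Corrupt a true bit according to the assumed flip rates."""
    p_flip = rho1 if true_bit == 1 else rho0
    return 1 - true_bit if rng.random() < p_flip else true_bit


def mean_debiased(true_bit, rho0, rho1, n, seed=0):
    """Average the debiased estimate over n independent noisy observations."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += debias_feedback(flip(true_bit, rho0, rho1, rng), rho0, rho1)
    return total / n
```

Under this assumed model, the corrected value averages to the true bit: a true bit of 1 is observed as 1 with probability 1 - rho1, and the weighted average of the two corrected values is exactly 1. The price is higher variance as rho0 + rho1 approaches 1.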
