Online Multiclass Boosting with Bandit Feedback

Daniel T. Zhang; Young Hun Jung; Ambuj Tewari

arXiv:1810.05290·stat.ML·February 26, 2019·1 cites

TL;DR

This paper introduces online multiclass boosting algorithms that operate under bandit feedback. It builds an unbiased loss estimate from randomized predictions and uses that estimate to extend full-information boosting algorithms to the limited-feedback setting.

Contribution

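It proposes an unbiased loss estimate based on randomized prediction and uses it to extend two full-information boosting algorithms (Jung et al., 2017) to the bandit setting, where the asymptotic error bounds match those of the full-information counterparts.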

Findings

01. Error bounds match full information algorithms

02. Sample complexity increases with limited feedback

03. Performance is comparable to existing bandit boosting methods

Abstract

We present online boosting algorithms for multiclass classification with bandit feedback, where the learner only receives feedback about the correctness of its prediction. We propose an unbiased estimate of the loss using a randomized prediction, allowing the model to update its weak learners with limited information. Using the unbiased estimate, we extend two full-information boosting algorithms (Jung et al., 2017) to the bandit setting. We prove that the asymptotic error bounds of the bandit algorithms exactly match those of their full-information counterparts. The cost of restricted feedback is reflected in the larger sample complexity. Experimental results support our theoretical findings, and the performance of the proposed models is comparable to that of an existing bandit boosting algorithm, which is limited to binary weak learners.
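The idea of an unbiased loss estimate built from a randomized prediction can be illustrated with a generic importance-weighting sketch. This is not taken from the paper: the uniform-mixing exploration scheme, the 0-1 loss vector, the parameter `rho`, and all function names below are assumptions chosen for illustration, and the paper's own estimator may differ in form.

```python
import numpy as np

def exploration_distribution(y_hat, k, rho):
    """Mix a point mass on y_hat with the uniform distribution over k labels.

    `rho` is a hypothetical exploration rate in (0, 1].
    """
    p = np.full(k, rho / k)
    p[y_hat] += 1.0 - rho
    return p

def estimate_loss(y_tilde, correct, p):
    """Importance-weighted estimate of the 0-1 loss vector 1 - e_y.

    Only the bandit bit `correct` (was y_tilde the true label?) is used.
    """
    k = len(p)
    if not correct:
        # Nothing is learned about the true label, so the estimate is zero.
        return np.zeros(k)
    # The true label is revealed to be y_tilde, so the loss vector is known;
    # dividing by p[y_tilde] compensates for how rarely this event occurs.
    return (1.0 - np.eye(k)[y_tilde]) / p[y_tilde]

def expected_estimate(y_true, p):
    """Exact expectation of the estimate over the random draw of y_tilde."""
    k = len(p)
    return sum(p[j] * estimate_loss(j, j == y_true, p) for j in range(k))
```

In a simulated round one would draw the randomized prediction with `np.random.default_rng().choice(k, p=p)`, observe only whether it was correct, and pass the result to `estimate_loss`. Averaged over the draw, the estimate equals the true loss vector, which is what lets a full-information update rule be reused; the price is higher variance as `rho` shrinks, in line with the larger sample complexity the abstract mentions.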

Code & Models

Repositories

pi224/banditboosting (Official)

Taxonomy

Topics: Data Stream Mining Techniques · Advanced Bandit Algorithms Research · Machine Learning and Algorithms