Multiclass Online Learnability under Bandit Feedback

Ananth Raman; Vinod Raman; Unique Subedi; Idan Mehalel; Ambuj Tewari

arXiv:2308.04620·cs.LG·January 23, 2024

TL;DR

This paper characterizes when online multiclass classification under bandit feedback is learnable: finiteness of the Bandit Littlestone dimension is necessary and sufficient, even when the label space is unbounded. It also shows that, unlike in the full-information setting, sequential uniform convergence is necessary but not sufficient.

Contribution

It extends the characterization of bandit online learnability by Daniely and Helbertal [2013] to unbounded label spaces via the Bandit Littlestone dimension, and contrasts the bandit setting with the full-information setting.

Findings

01

Finiteness of the Bandit Littlestone dimension is necessary and sufficient for bandit online learnability.

02

Sequential uniform convergence is necessary but not sufficient for bandit learnability.

03

Results apply even when the label space is unbounded.

Abstract

We study online multiclass classification under bandit feedback. We extend the results of Daniely and Helbertal [2013] by showing that the finiteness of the Bandit Littlestone dimension is necessary and sufficient for bandit online learnability even when the label space is unbounded. Moreover, we show that, unlike the full-information setting, sequential uniform convergence is necessary but not sufficient for bandit online learnability. Our result complements the recent work by Hanneke, Moran, Raman, Subedi, and Tewari [2023] who show that the Littlestone dimension characterizes online multiclass learnability in the full-information setting even when the label space is unbounded.
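To make the feedback model concrete, the following is a minimal sketch of the online protocol under bandit feedback, assuming a finite hypothesis class and a realizable sequence (some hypothesis in the class labels every instance correctly). The function name `run_bandit_protocol`, the elimination learner, and the toy class of constant functions are illustrative assumptions and are not taken from the paper. The point it shows is only that the learner observes whether its prediction was correct, never the true label.

```python
import random


def run_bandit_protocol(hypotheses, target, instances, seed=0):
    """Simulate realizable online multiclass classification with bandit feedback.

    hypotheses: list of functions x -> label (a finite, hypothetical class)
    target:     a member of `hypotheses` that generates the true labels
    instances:  iterable of inputs revealed one per round
    Returns the number of mistakes made by a simple elimination learner.
    """
    rng = random.Random(seed)
    version_space = list(hypotheses)  # hypotheses consistent with all feedback so far
    mistakes = 0
    for x in instances:
        # The learner commits to a label proposed by a surviving hypothesis.
        prediction = rng.choice(version_space)(x)
        # Bandit feedback: a single bit saying whether the prediction was right.
        correct = prediction == target(x)
        if correct:
            # Keep only hypotheses that agree with the confirmed label.
            version_space = [h for h in version_space if h(x) == prediction]
        else:
            # The true label stays hidden; only the wrong guess is ruled out.
            mistakes += 1
            version_space = [h for h in version_space if h(x) != prediction]
    return mistakes


# Toy usage (assumed example): five constant classifiers over labels 0..4.
hyps = [lambda x, k=k: k for k in range(5)]
print(run_bandit_protocol(hyps, hyps[3], range(20)))
```

In this sketch every mistake removes at least the hypothesis that made the wrong guess, and the target is never removed, so the learner makes at most `len(hypotheses) - 1` mistakes. With an infinite class or an unbounded label space this counting argument no longer applies, which is the regime the paper addresses.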

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning