Taking a hint: How to leverage loss predictors in contextual bandits?

Chen-Yu Wei; Haipeng Luo; Alekh Agarwal

arXiv:2003.01922·cs.LG·October 16, 2020·5 cites

Open Access

TL;DR

This paper studies how loss predictors can improve regret bounds in contextual bandits. It gives upper and lower bounds, with matching algorithms, across adversarial and stochastic environments, known and unknown total predictor error, and single and multiple predictors.

Contribution

It initiates the study of loss predictors in contextual bandits, establishing upper and lower regret bounds and new algorithms for each setting considered.

Findings

01

When the total predictor error E is known, the optimal regret is O(min{√T, √E T^{1/4}}).

02

When E is unknown, the known-E bound cannot be matched, but regret O(√E T^{1/3}) is achievable.

03

With multiple predictors, a linear dependence on the number of predictors is necessary.
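To make the first two findings concrete, the following is a minimal sketch that evaluates the two bound expressions for a few values of E. It drops all constants, logarithmic factors, and any dependence on the number of actions or policies, so the numbers only indicate scaling. The function names `known_error_bound` and `unknown_error_bound` are hypothetical, chosen for illustration.

```python
import math


def known_error_bound(T: int, E: float) -> float:
    """min{sqrt(T), sqrt(E) * T^(1/4)}, with constants and log factors dropped."""
    return min(math.sqrt(T), math.sqrt(E) * T ** 0.25)


def unknown_error_bound(T: int, E: float) -> float:
    """sqrt(E) * T^(1/3), with constants and log factors dropped."""
    return math.sqrt(E) * T ** (1 / 3)


if __name__ == "__main__":
    T = 10_000
    print(f"T = {T}, sqrt(T) = {math.sqrt(T):.1f}")
    for E in (1, 10, 100, 1_000, 10_000):
        known = known_error_bound(T, E)
        unknown = unknown_error_bound(T, E)
        print(f"E = {E:>6}: known-E bound = {known:8.2f}, unknown-E bound = {unknown:8.2f}")
```

In the known-E expression, the predictor term √E T^{1/4} is the smaller of the two exactly when E < √T, so under this bound a predictor only improves on the √T rate when its total error is below √T.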

Abstract

We initiate the study of learning in contextual bandits with the help of loss predictors. The main question we address is whether one can improve over the minimax regret $O(\sqrt{T})$ for learning over $T$ rounds, when the total error of the predictor $E \leq T$ is relatively small. We provide a complete answer to this question, including upper and lower bounds for various settings: adversarial versus stochastic environments, known versus unknown $E$, and single versus multiple predictors. We show several surprising results, such as 1) the optimal regret is $O(\min\{\sqrt{T}, \sqrt{E}\, T^{\frac{1}{4}}\})$ when $E$ is known, a sharp contrast to the standard and better bound $O(\sqrt{E})$ for non-contextual problems (such as multi-armed bandits); 2) the same bound cannot be achieved if $E$ is unknown,…

Taxonomy

Topics: Advanced Bandit Algorithms Research · Consumer Market Behavior and Pricing · Auction Theory and Applications