Tight Bounds for Bandit Combinatorial Optimization

Alon Cohen; Tamir Hazan; Tomer Koren

arXiv:1702.07539 · cs.LG · February 27, 2017 · 2 cites

Open Access

TL;DR

This paper establishes tight bounds on the regret in bandit combinatorial optimization, showing that it grows as $\widetilde{\Theta}(k^{3/2}\sqrt{dT})$. The result refutes a previously conjectured rate and applies to key problems such as the bandit shortest path.

Contribution

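The paper proves that the optimal regret in bandit combinatorial optimization grows as $\widetilde{\Theta}(k^{3/2}\sqrt{dT})$, tight up to logarithmic factors. This disproves the rate conjectured by Audibert, Bubeck, and Lugosi (2013) and resolves an open problem posed by Cesa-Bianchi and Lugosi (2012).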

Findings

01. Regret grows as $\widetilde{\Theta}(k^{3/2}\sqrt{dT})$.

02. Disproves the conjecture that the rate is $\widetilde{\Theta}(k\sqrt{dT})$.

03. Provides a tight bound for the bandit shortest path problem.

Abstract

We revisit the study of optimal regret rates in bandit combinatorial optimization, a fundamental framework for sequential decision making under uncertainty that abstracts numerous combinatorial prediction problems. We prove that the attainable regret in this setting grows as $\widetilde{\Theta}(k^{3/2}\sqrt{dT})$, where $d$ is the dimension of the problem and $k$ is a bound over the maximal instantaneous loss, disproving a conjecture of Audibert, Bubeck, and Lugosi (2013), who argued that the optimal rate should be of the form $\widetilde{\Theta}(k\sqrt{dT})$. Our bounds apply to several important instances of the framework and, in particular, imply a tight bound for the well-studied bandit shortest path problem. By that, we also resolve an open problem posed by Cesa-Bianchi and Lugosi (2012).
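To make the gap between the two rates in the abstract concrete, the following is a minimal sketch that evaluates both expressions numerically. It assumes all constants and logarithmic factors hidden by the $\widetilde{\Theta}$ notation are dropped, so the values show only how the rates scale and are not actual regret guarantees. The function names `proved_rate`, `conjectured_rate`, and `rate_gap` are hypothetical and do not come from the paper.

```python
import math


def proved_rate(k: float, d: float, T: float) -> float:
    """k^(3/2) * sqrt(d*T): the rate proved in the paper.

    Assumption: constants and log factors are dropped.
    """
    return k ** 1.5 * math.sqrt(d * T)


def conjectured_rate(k: float, d: float, T: float) -> float:
    """k * sqrt(d*T): the previously conjectured rate.

    Assumption: constants and log factors are dropped.
    """
    return k * math.sqrt(d * T)


def rate_gap(k: float, d: float, T: float) -> float:
    """Ratio of the proved rate to the conjectured rate; simplifies to sqrt(k)."""
    return proved_rate(k, d, T) / conjectured_rate(k, d, T)


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    k, d, T = 16, 100, 10_000
    print(f"proved rate      ~ {proved_rate(k, d, T):.0f}")
    print(f"conjectured rate ~ {conjectured_rate(k, d, T):.0f}")
    print(f"gap (sqrt(k))    = {rate_gap(k, d, T):.2f}")
```

Under these assumptions the ratio depends only on $k$: the proved rate exceeds the conjectured one by a factor of $\sqrt{k}$, regardless of $d$ and $T$.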

Taxonomy

Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Decision-Making and Behavioral Economics