Regret in Online Combinatorial Optimization

Jean-Yves Audibert; Sébastien Bubeck; Gábor Lugosi

arXiv:1204.4710 · cs.LG · April 2, 2013 · 5 cites

TL;DR

This paper investigates the best possible (minimax) regret in online combinatorial optimization under full information, semi-bandit, and bandit feedback. It gives optimal bounds for the semi-bandit and full information cases and shows that the standard exponentially weighted forecaster is suboptimal.

Contribution

It introduces a combined Mirror Descent and INF strategy to achieve optimal regret bounds and establishes new lower bounds and conjectures for the bandit setting.

Findings

01 Optimal regret bounds for semi-bandit feedback.

02 Recovery of known bounds for full information case.

03 Standard exponentially weighted forecaster is suboptimal.

Abstract

We address online linear optimization problems when the possible actions of the decision maker are represented by binary vectors. The regret of the decision maker is the difference between her realized loss and the best loss she would have achieved by picking, in hindsight, the best possible action. Our goal is to understand the magnitude of the best possible (minimax) regret. We study the problem under three different assumptions for the feedback the decision maker receives: full information, and the partial information models of the so-called "semi-bandit" and "bandit" problems. Combining the Mirror Descent algorithm and the INF (Implicitly Normalized Forecaster) strategy, we are able to prove optimal bounds for the semi-bandit case. We also recover the optimal bounds for the full information setting. In the bandit case we discuss existing results in light of a new lower bound, and…
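The regret definition in the abstract can be made concrete with a small sketch. This is an illustration only, not the paper's algorithm: the action set (binary vectors with exactly m ones), the loss vectors, the played sequence, and the helper names `m_sets`, `linear_loss`, and `regret` are all assumptions chosen for the example. The best action in hindsight is found by brute force, which is only feasible for tiny action sets.

```python
import itertools


def m_sets(d, m):
    """All binary vectors of length d with exactly m ones (hypothetical action set)."""
    actions = []
    for ones in itertools.combinations(range(d), m):
        actions.append(tuple(1 if i in ones else 0 for i in range(d)))
    return actions


def linear_loss(action, loss_vector):
    """Linear loss: inner product of a binary action and a loss vector."""
    return sum(a * l for a, l in zip(action, loss_vector))


def regret(played, loss_vectors, actions):
    """Realized cumulative loss minus the loss of the best fixed action in hindsight."""
    realized = sum(linear_loss(a, l) for a, l in zip(played, loss_vectors))
    best_fixed = min(
        sum(linear_loss(a, l) for l in loss_vectors) for a in actions
    )
    return realized - best_fixed


# Toy instance (assumed values): d = 3 components, actions pick exactly 2 of them.
actions = m_sets(3, 2)
loss_vectors = [(1.0, 0.0, 0.0), (1.0, 0.0, 1.0), (0.0, 0.0, 1.0)]
played = [(1, 1, 0), (1, 0, 1), (0, 1, 1)]

example_regret = regret(played, loss_vectors, actions)
print(example_regret)  # realized loss 4.0, best fixed action loss 2.0 -> 2.0
```

In this toy run the player incurs a total loss of 4.0, while either (1, 1, 0) or (0, 1, 1) played in every round would have incurred 2.0, so the regret is 2.0. The minimax regret studied in the paper is the best guarantee on this quantity against worst-case loss sequences.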

Code & Models

Repositories

gitting-guud/GML_Project

Taxonomy

Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Auction Theory and Applications