More Adaptive Algorithms for Adversarial Bandits

Chen-Yu Wei; Haipeng Luo

arXiv:1801.03265 · cs.LG · June 8, 2018 · 40 cites

TL;DR

This paper introduces a versatile, parameter-free algorithm for adversarial multi-armed bandits that, under different instantiations, achieves several data-dependent regret bounds improving on previous work.

Contribution

The paper presents a new adaptive algorithm based on Online Mirror Descent with a log-barrier regularizer and uses it to derive several new data-dependent regret bounds.
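To make the phrase "Online Mirror Descent with a log-barrier regularizer" concrete, here is a minimal Python sketch of the basic update on the probability simplex. It is not the paper's algorithm: it shows only a plain log-barrier OMD step with the standard importance-weighted loss estimator, and the fixed learning rate `eta`, the bisection solver, and the function names `log_barrier_omd_step` and `run_bandit` are all assumptions made for illustration.

```python
import numpy as np

def log_barrier_omd_step(w, loss_est, eta, iters=100):
    """One OMD step on the simplex with regularizer (1/eta) * sum_i ln(1/w_i).

    The minimizer satisfies 1/w_new_i = 1/w_i + eta * (loss_est_i - lam),
    where the scalar lam is chosen so that w_new sums to 1.
    """
    inv_w = 1.0 / w
    lo = loss_est.min()                   # at lam = lo the candidate sums to at most 1
    hi = (loss_est + inv_w / eta).min()   # at lam = hi some denominator reaches 0
    for _ in range(iters):                # bisection on the normalizer lam
        mid = 0.5 * (lo + hi)
        total = np.sum(1.0 / (inv_w + eta * (loss_est - mid)))
        if total > 1.0:
            hi = mid
        else:
            lo = mid
    new_w = 1.0 / (inv_w + eta * (loss_est - lo))
    return new_w / new_w.sum()            # remove residual numerical error

def run_bandit(losses, eta, seed=0):
    """Play a T x K loss matrix with bandit feedback (illustrative loop only)."""
    rng = np.random.default_rng(seed)
    T, K = losses.shape
    w = np.full(K, 1.0 / K)
    total_loss = 0.0
    for t in range(T):
        arm = rng.choice(K, p=w)                    # sample an arm from w
        total_loss += losses[t, arm]                # only this loss is observed
        loss_est = np.zeros(K)
        loss_est[arm] = losses[t, arm] / w[arm]     # importance-weighted estimate
        w = log_barrier_omd_step(w, loss_est, eta)
    return w, total_loss
```

Calling `run_bandit(losses, eta=0.1)` on a `T x K` array of losses in [0, 1] returns the final sampling distribution and the learner's cumulative loss. The value 0.1 is arbitrary; how the learning rate should be set or adapted is exactly the kind of detail this sketch leaves out.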

Findings

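01  Achieves a regret bound that depends on the variance of only the best arm

02  Provides regret bounds based on first-order path-lengths, either of only the best arm or summed over all arms

03  Ensures small regret when the best arm has small loss, and logarithmic regret when one arm's expected loss is always smaller than the others' by a fixed gap (e.g. the classic i.i.d. setting)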

Abstract

We develop a novel and generic algorithm for the adversarial multi-armed bandit problem (or more generally the combinatorial semi-bandit problem). When instantiated differently, our algorithm achieves various new data-dependent regret bounds improving previous work. Examples include: 1) a regret bound depending on the variance of only the best arm; 2) a regret bound depending on the first-order path-length of only the best arm; 3) a regret bound depending on the sum of first-order path-lengths of all arms as well as an important negative term, which together lead to faster convergence rates for some normal form games with partial feedback; 4) a regret bound that simultaneously implies small regret when the best arm has small loss and logarithmic regret when there exists an arm whose expected loss is always smaller than those of others by a fixed gap (e.g. the classic i.i.d. setting). In…
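The data-dependent quantities named in the abstract can be illustrated on a small loss matrix. The Python sketch below assumes one natural reading of the terms: the variance of an arm is the sum of squared deviations of its losses from their mean, and its first-order path-length is the sum of absolute differences between consecutive losses. These definitions and the helper names are assumptions for illustration and may differ in detail from the paper's.

```python
import numpy as np

def best_arm(losses):
    """Index of the arm with the smallest cumulative loss."""
    return int(np.argmin(losses.sum(axis=0)))

def variance_of_arm(losses, i):
    """Sum over rounds of (loss - mean loss)^2 for arm i (assumed definition)."""
    col = losses[:, i]
    return float(np.sum((col - col.mean()) ** 2))

def path_length_of_arm(losses, i):
    """Sum over rounds of |loss_t - loss_{t-1}| for arm i (assumed definition)."""
    return float(np.sum(np.abs(np.diff(losses[:, i]))))

# Arm 0 is constant; arm 1 oscillates between 0.9 and 0.1.
losses = np.array([[0.2, 0.9],
                   [0.2, 0.1],
                   [0.2, 0.9],
                   [0.2, 0.1]])
star = best_arm(losses)
```

In this example the best arm is arm 0, whose variance and path-length are both zero, while arm 1 has large values of both. A bound that depends on the best arm only is therefore small here, whereas a bound summing over all arms would also pay for the oscillation of arm 1.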

Taxonomy

Topics: Advanced Bandit Algorithms Research · Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications