Simultaneously Learning Stochastic and Adversarial Bandits with General Graph Feedback

Fang Kong; Yichi Zhou; Shuai Li

arXiv:2206.07908 · cs.LG · August 23, 2022 · 1 citation

Open Access

TL;DR

This paper introduces an algorithm for online learning with general graph feedback that achieves near-optimal regret in both stochastic and adversarial settings, without needing to know in advance which setting it faces.

Contribution

It presents the first best-of-both-worlds algorithm for general feedback graphs, handling both stochastic and adversarial environments without prior knowledge of the feedback mechanism.

Findings

01. Achieves polylogarithmic regret in the stochastic setting

02. Attains minimax-optimal regret in the adversarial setting

03. Works with general, directed feedback graphs

Abstract

The problem of online learning with graph feedback has been extensively studied in the literature due to its generality and potential to model various learning tasks. Existing works mainly study the adversarial and stochastic feedback separately. If prior knowledge of the feedback mechanism is unavailable or wrong, such specially designed algorithms could suffer great loss. To avoid this problem, Erez and Koren (2021) try to optimize for both environments. However, they assume the feedback graphs are undirected and each vertex has a self-loop, which compromises the generality of the framework and may not be satisfied in applications. With a general feedback graph, the observation of an arm may not be available when this arm is pulled, which makes exploration more expensive and makes it harder for algorithms to perform optimally in both environments. In this work, we overcome…
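The difference between the restricted graphs (undirected, with a self-loop at every vertex) and a general feedback graph can be illustrated with a minimal sketch. This is not the paper's algorithm; the dictionary representation, the helper names, and the toy graphs below are all assumptions made purely for illustration.

```python
from typing import Dict, Set

# Hypothetical representation: graph[i] is the set of arms whose losses
# are revealed when arm i is pulled (the out-neighbours of i).
FeedbackGraph = Dict[int, Set[int]]


def observed_arms(graph: FeedbackGraph, pulled: int) -> Set[int]:
    """Return the arms whose losses are observed after pulling `pulled`."""
    return set(graph.get(pulled, set()))


def has_all_self_loops(graph: FeedbackGraph) -> bool:
    """True if pulling any arm always reveals that arm's own loss."""
    return all(arm in out for arm, out in graph.items())


def is_undirected(graph: FeedbackGraph) -> bool:
    """True if every edge i -> j is matched by an edge j -> i."""
    return all(
        i in graph.get(j, set())
        for i, out in graph.items()
        for j in out
    )


# Toy graph satisfying the restrictive assumptions:
# undirected, and every vertex has a self-loop.
restricted_graph: FeedbackGraph = {0: {0, 1}, 1: {0, 1}}

# Toy general graph: directed, and arm 0 has no self-loop, so pulling
# arm 0 reveals the losses of arms 1 and 2 but not its own loss.
general_graph: FeedbackGraph = {0: {1, 2}, 1: {1}, 2: {0}}

if __name__ == "__main__":
    print(observed_arms(general_graph, 0))
    print(has_all_self_loops(general_graph), is_undirected(general_graph))
```

In the toy general graph, the only way to learn the loss of arm 0 is to pull arm 2, which is the kind of indirect exploration the abstract describes as more expensive.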

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems