Optimal Strategies for Graph-Structured Bandits
Hassan Saber (SEQUEL), Pierre Ménard (SEQUEL), Odalric-Ambrym Maillard (SEQUEL)

TL;DR
This paper studies a structured multi-armed bandit problem in which a given weight matrix over users bounds how much the arm means of any two users can differ. It derives lower bounds and proposes an efficient, asymptotically optimal algorithm.
Contribution
It introduces the IMED-GS* algorithm tailored for graph-structured bandits, which is computationally efficient and does not rely on forced exploration, improving upon existing methods.
Findings
IMED-GS* is asymptotically optimal.
The algorithm needs to solve the linear program only about log(T) times over T steps.
Numerical results confirm the algorithm's strong performance.
Abstract
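We study a structured variant of the multi-armed bandit problem specified by a set of Bernoulli distributions $\nu = (\nu_{a,b})_{a \in \mathcal{A}, b \in \mathcal{B}}$ with means $(\mu_{a,b})_{a \in \mathcal{A}, b \in \mathcal{B}} \in [0,1]^{\mathcal{A} \times \mathcal{B}}$ and by a given weight matrix $\omega = (\omega_{b,b'})_{b,b' \in \mathcal{B}}$, where $\mathcal{A}$ is a finite set of arms and $\mathcal{B}$ is a finite set of users. The weight matrix $\omega$ is such that for any two users $b, b' \in \mathcal{B}$, $\max_{a \in \mathcal{A}} |\mu_{a,b} - \mu_{a,b'}| \leq \omega_{b,b'}$. This formulation is flexible enough to capture various situations, from highly-structured scenarios ($\omega \in \{0,1\}^{\mathcal{B} \times \mathcal{B}}$) to fully unstructured setups ($\omega \equiv 1$). We consider two scenarios depending on whether the learner…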
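To make the role of the weight matrix concrete, here is a minimal Python sketch. It assumes the structural constraint reads max over arms a of |mu[a, b] - mu[a, b']| <= omega[b, b'] for every pair of users b, b'. The function name `satisfies_graph_structure` and the example values are hypothetical, chosen only for illustration; they are not taken from the paper.

```python
import numpy as np

def satisfies_graph_structure(mu, omega, tol=1e-12):
    """Check the assumed constraint max_a |mu[a,b] - mu[a,b']| <= omega[b,b'].

    mu    : array of shape (n_arms, n_users), Bernoulli means in [0, 1]
    omega : array of shape (n_users, n_users), the weight matrix
    tol   : small slack for floating-point rounding (illustrative choice)
    """
    mu = np.asarray(mu, dtype=float)
    omega = np.asarray(omega, dtype=float)
    # gaps[b, b'] = largest difference in mean between users b and b' over all arms
    gaps = np.abs(mu[:, :, None] - mu[:, None, :]).max(axis=0)
    return bool(np.all(gaps <= omega + tol))

# Hypothetical instance: 2 arms, 3 users; users 0 and 1 are similar, user 2 is not.
mu = np.array([[0.2, 0.3, 0.9],
               [0.5, 0.6, 0.1]])

omega_unstructured = np.ones((3, 3))        # omega = 1 everywhere: no information
omega_informative = np.array([[0.0, 0.1, 1.0],
                              [0.1, 0.0, 1.0],
                              [1.0, 1.0, 0.0]])
omega_identical = np.zeros((3, 3))          # would force all users to share means

print(satisfies_graph_structure(mu, omega_unstructured))
print(satisfies_graph_structure(mu, omega_informative))
print(satisfies_graph_structure(mu, omega_identical))
```

With all weights equal to 1 the constraint holds for any means in [0, 1], which matches the fully unstructured case. Smaller off-diagonal weights tie users together, so observations from one user carry information about another.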
Taxonomy
Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Reinforcement Learning in Robotics
