Indexed Minimum Empirical Divergence for Unimodal Bandits
Hassan Saber (CRIStAL, Scool), Pierre Ménard (OVGU), Odalric-Ambrym Maillard (Scool)

TL;DR
This paper introduces IMED-UB, an algorithm tailored to unimodal multi-armed bandit problems with one-dimensional exponential family distributions. It optimally exploits the unimodal structure and comes with a concise finite-time analysis.
Contribution
The paper presents IMED-UB, a novel algorithm that adapts the IMED approach to unimodal bandits, with a new proof technique and finite-time performance guarantees.
Findings
IMED-UB performs competitively with state-of-the-art algorithms.
Finite-time analysis of IMED-UB is provided.
Numerical experiments validate the effectiveness of IMED-UB.
Abstract
We consider a multi-armed bandit problem specified by a set of one-dimensional exponential family distributions endowed with a unimodal structure. We introduce IMED-UB, an algorithm that optimally exploits the unimodal structure, by adapting to this setting the Indexed Minimum Empirical Divergence (IMED) algorithm introduced by Honda and Takemura [2015]. Owing to our proof technique, we are able to provide a concise finite-time analysis of the IMED-UB algorithm. Numerical experiments show that IMED-UB competes with state-of-the-art algorithms.
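To make the idea in the abstract concrete, here is a minimal sketch of an IMED-style arm selection restricted to the neighbourhood of the empirically best arm. It is an illustration under stated assumptions, not the authors' implementation: rewards are assumed Bernoulli, arms are assumed to lie on a line (so the neighbours of arm a are a-1 and a+1), every arm is assumed to be pulled once at the start, and the function names are hypothetical. The index used is the IMED index of Honda and Takemura [2015], N_a * KL(mean_a, best_mean) + log N_a.

```python
import math


def kl_bernoulli(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def imed_index(mean, count, best_mean):
    """IMED index: count * KL(mean, best_mean) + log(count)."""
    return count * kl_bernoulli(mean, best_mean) + math.log(count)


def select_arm_unimodal(means, counts):
    """Choose the next arm from empirical means and pull counts.

    Assumptions (not taken from the paper): arms sit on a line graph,
    and any arm that has never been pulled is pulled first.
    """
    # Initialisation (assumed): pull every arm once.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm

    # Empirically best arm (first one in case of ties).
    best = max(range(len(means)), key=lambda a: means[a])

    # Only the best arm and its immediate neighbours are candidates.
    candidates = [a for a in (best - 1, best, best + 1) if 0 <= a < len(means)]

    # Pull the candidate with the smallest IMED index.
    return min(candidates, key=lambda a: imed_index(means[a], counts[a], means[best]))
```

For example, `select_arm_unimodal([0.9, 0.8, 0.1, 0.1, 0.1], [50, 50, 1, 1, 1])` only compares arms 0 and 1, even though the rarely pulled arms further away would have a smaller index if all arms were compared. Restricting the comparison in this way is how the sketch uses the unimodal structure.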
Taxonomy
Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms
