Indexed Minimum Empirical Divergence for Unimodal Bandits

Hassan Saber (CRIStAL, Scool), Pierre Ménard (OVGU), Odalric-Ambrym Maillard (Scool)

arXiv:2112.01452 · cs.AI · December 3, 2021

TL;DR

This paper introduces IMED-UB, an algorithm for unimodal multi-armed bandit problems whose reward distributions belong to a one-dimensional exponential family. It optimally exploits the unimodal structure and comes with a concise finite-time analysis.

Contribution

The paper presents IMED-UB, a novel algorithm that adapts the IMED approach to unimodal bandits, with a new proof technique and finite-time performance guarantees.

Findings

01 IMED-UB performs competitively with state-of-the-art algorithms.

02 A concise finite-time analysis of IMED-UB is provided.

03 Numerical experiments support the effectiveness of IMED-UB.

Abstract

We consider a multi-armed bandit problem specified by a set of distributions from a one-dimensional exponential family, endowed with a unimodal structure. We introduce IMED-UB, an algorithm that optimally exploits the unimodal structure, by adapting to this setting the Indexed Minimum Empirical Divergence (IMED) algorithm introduced by Honda and Takemura [2015]. Owing to our proof technique, we are able to provide a concise finite-time analysis of the IMED-UB algorithm. Numerical experiments show that IMED-UB competes with state-of-the-art algorithms.
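To make the idea in the abstract concrete, here is a minimal sketch of an IMED-style arm selection restricted by a unimodal structure. It rests on several assumptions that are not stated in the abstract: rewards are Bernoulli, the arms sit on a line so that the neighbours of arm a are a-1 and a+1, the candidate set is the empirical leader plus its neighbours, and play starts from arm 0. The IMED index used is N_a · KL(mean_a, best_mean) + log N_a, as in Honda and Takemura [2015]. All function names are hypothetical, and this should be read as an illustration rather than the authors' implementation.

```python
import math
import random


def kl_bernoulli(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clamped away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def imed_index(mean, count, best_mean):
    """IMED index N_a * KL(mean_a, best_mean) + log N_a; unpulled arms get -inf."""
    if count == 0:
        return -math.inf
    return count * kl_bernoulli(mean, best_mean) + math.log(count)


def choose_arm(means, counts):
    """Pick the arm with the smallest index among the empirical leader and its neighbours.

    Assumption: arms lie on a line graph, so neighbours of a are a - 1 and a + 1.
    """
    pulled = [a for a in range(len(means)) if counts[a] > 0]
    if not pulled:
        return 0  # assumption: arbitrary starting arm
    # Empirical leader; ties are broken in favour of the most-pulled arm.
    leader = max(pulled, key=lambda a: (means[a], counts[a]))
    candidates = [a for a in (leader - 1, leader, leader + 1) if 0 <= a < len(means)]
    return min(candidates, key=lambda a: imed_index(means[a], counts[a], means[leader]))


def run(true_means, horizon, seed=0):
    """Simulate the sketch on Bernoulli arms and return the pull counts per arm."""
    rng = random.Random(seed)
    counts = [0] * len(true_means)
    sums = [0.0] * len(true_means)
    for _ in range(horizon):
        # Unpulled arms get a placeholder mean of 0.0; their index is -inf anyway.
        means = [s / n if n > 0 else 0.0 for s, n in zip(sums, counts)]
        arm = choose_arm(means, counts)
        sums[arm] += 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
    return counts


if __name__ == "__main__":
    # Unimodal means along the line: increasing up to arm 2, then decreasing.
    print(run([0.1, 0.3, 0.8, 0.4, 0.2], horizon=2000))
```

Restricting the candidates to the leader's neighbourhood is what lets a structured strategy avoid exploring arms far from the current best guess. Whether this matches the exact candidate set and tie-breaking rules of IMED-UB should be checked against the paper.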

Videos

Indexed Minimum Empirical Divergence for Unimodal Bandits · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms