Syndicated Bandits: A Framework for Auto Tuning Hyper-parameters in Contextual Bandit Algorithms

Qin Ding; Yue Kang; Yi-Wei Liu; Thomas C.M. Lee; Cho-Jui Hsieh; James Sharpnack

arXiv:2106.02979·stat.ML·June 14, 2022·1 cites

TL;DR

This paper introduces Syndicated Bandits, a framework for tuning multiple hyper-parameters of contextual bandit algorithms online, with regret that does not grow exponentially in the number of hyper-parameters being tuned.

Contribution

It proposes a general Syndicated Bandits framework for dynamic hyper-parameter tuning in contextual bandits, with proven regret bounds and broad applicability.

Findings

01. Regret bounds are derived for the Syndicated Bandits framework.

02. The regret does not depend exponentially on the number of hyper-parameters being tuned.

03. Experiments on synthetic and real data validate the framework's effectiveness.
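
The second finding can be illustrated with simple counting. The sketch below assumes, for illustration only, that a naive baseline would run one bandit over the joint grid of all hyper-parameter values, while an alternative keeps one selector per hyper-parameter. This page does not spell out the construction, so the function names and the candidate counts are hypothetical.

```python
import math


def joint_grid_arms(candidate_counts):
    """Arms needed if every combination of values is treated as one arm."""
    return math.prod(candidate_counts)


def per_parameter_arms(candidate_counts):
    """Total arms if each hyper-parameter gets its own selector."""
    return sum(candidate_counts)


if __name__ == "__main__":
    # Hypothetical setting: 4 hyper-parameters with 5 candidate values each.
    counts = [5, 5, 5, 5]
    print(joint_grid_arms(counts))     # 625
    print(per_parameter_arms(counts))  # 20
```

The joint grid grows as a product of the candidate counts, which is exponential in the number of hyper-parameters, whereas the per-parameter total grows only as a sum.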

Abstract

The stochastic contextual bandit problem, which models the trade-off between exploration and exploitation, has many real applications, including recommender systems, online advertising, and clinical trials. Like many other machine learning algorithms, contextual bandit algorithms often have one or more hyper-parameters. For example, in most optimal stochastic contextual bandit algorithms, there is an unknown exploration parameter that controls the trade-off between exploration and exploitation. A proper choice of hyper-parameters is essential for contextual bandit algorithms to perform well. However, offline tuning methods are infeasible in a contextual bandit environment because there is no pre-collected dataset and decisions have to be made in real time. To tackle this problem, we first propose a two-layer bandit structure for auto tuning the…
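
A minimal sketch of what a two-layer structure for tuning the exploration parameter could look like is given below. It is not the authors' implementation: the truncated abstract does not name the algorithms used in each layer, so the choice of an EXP3-style selector on top and a LinUCB-style learner underneath is an assumption, as are the names `exp3_probs`, `run_two_layer`, the candidate values, and the synthetic Bernoulli reward model.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def exp3_probs(weights, gamma):
    """EXP3 sampling distribution: normalized weights mixed with uniform."""
    total, k = sum(weights), len(weights)
    return [(1 - gamma) * w / total + gamma / k for w in weights]


def run_two_layer(candidates, theta_star, n_arms=5, horizon=500,
                  gamma=0.1, seed=0):
    """Top layer picks an exploration parameter; bottom layer picks an arm."""
    rng = random.Random(seed)
    d, k = len(theta_star), len(candidates)
    # Bottom-layer ridge regression state: inverse of (I + sum x x^T), and b.
    A_inv = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    b = [0.0] * d
    weights = [1.0] * k   # top-layer EXP3 weights, one per candidate
    pulls = [0] * k
    total_reward = 0.0
    for _ in range(horizon):
        # Top layer: sample which candidate exploration parameter to use.
        probs = exp3_probs(weights, gamma)
        j = rng.choices(range(k), weights=probs)[0]
        alpha = candidates[j]
        pulls[j] += 1
        # Bottom layer: LinUCB-style choice among freshly drawn contexts.
        theta_hat = [dot(row, b) for row in A_inv]
        arms = [[rng.random() for _ in range(d)] for _ in range(n_arms)]

        def ucb(x):
            width = math.sqrt(max(dot(x, [dot(row, x) for row in A_inv]), 0.0))
            return dot(theta_hat, x) + alpha * width

        x = max(arms, key=ucb)
        # Synthetic Bernoulli reward with mean theta_star . x (assumed model).
        mean = min(max(dot(theta_star, x), 0.0), 1.0)
        reward = 1.0 if rng.random() < mean else 0.0
        total_reward += reward
        # Bottom-layer update via the Sherman-Morrison rank-one formula.
        Ax = [dot(row, x) for row in A_inv]
        denom = 1.0 + dot(x, Ax)
        for r in range(d):
            for c in range(d):
                A_inv[r][c] -= Ax[r] * Ax[c] / denom
            b[r] += reward * x[r]
        # Top-layer update: importance-weighted reward for the chosen candidate.
        weights[j] *= math.exp(gamma * (reward / probs[j]) / k)
        top = max(weights)
        weights = [w / top for w in weights]  # rescale to avoid overflow
    return total_reward, pulls


if __name__ == "__main__":
    reward, pulls = run_two_layer([0.1, 1.0, 5.0], [0.5, 0.3, 0.2])
    print("total reward:", reward)
    print("pulls per candidate:", pulls)
```

The point of the design is that no offline dataset is needed: the top layer treats each candidate exploration value as an arm and learns from the same online rewards that the bottom layer uses to update its estimate.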

Videos

Syndicated Bandits: A Framework for Auto Tuning Hyper-parameters in Contextual Bandit Algorithms · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Reinforcement Learning in Robotics