An Asymptotically Optimal Contextual Bandit Algorithm Using Hierarchical Structures

Mohammadreza Mohaghegh Neyshabouri, Kaan Gokcesu, Huseyin Ozkan, and Suleyman S. Kozat

arXiv:1612.01367 · cs.LG · December 11, 2017

Open Access

TL;DR

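This paper introduces online algorithms for the contextual multi-armed bandit problem that partition the context space hierarchically and adaptively combine all mappings from partition regions to arms. The algorithms asymptotically match the performance of the best mapping, and hence of the best arm selection policy, even in adversarial settings, and they admit efficient implementations.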

Contribution

The authors develop a hierarchical partitioning approach that adaptively combines all mappings between partition regions and bandit arms to approximate the best arm selection policy, with theoretical guarantees and efficient implementations.

Findings

01. Achieves asymptotic optimality in adversarial environments.

02. Provides faster convergence rates than existing methods.

03. Demonstrates superior empirical performance on real and synthetic data.

Abstract

We propose online algorithms for sequential learning in the contextual multi-armed bandit setting. Our approach is to partition the context space and then optimally combine all of the possible mappings between the partition regions and the set of bandit arms in a data-driven manner. We show that in our approach, the best mapping is able to approximate the best arm selection policy to any desired degree under mild Lipschitz conditions. Therefore, we design our algorithms based on the optimal adaptive combination and asymptotically achieve the performance of the best mapping as well as the best arm selection policy. This optimality is also guaranteed to hold even in adversarial environments since we do not rely on any statistical assumptions regarding the contexts or the loss of the bandit arms. Moreover, we design efficient implementations for our algorithms in various hierarchical…
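
To make the idea of "combining all of the possible mappings between the partition regions and the set of bandit arms" concrete, here is a minimal sketch in Python. It is not the authors' algorithm: it is a generic EXP4-style exponentially weighted mixture that enumerates every mapping by brute force, which is exponential in the number of regions and is not one of the efficient hierarchical implementations the abstract refers to. The function names (`region_of`, `run_mapping_mixture`), the equal-width partition of a scalar context in [0, 1), the learning rate `eta`, and the toy loss function are all assumptions made for illustration.

```python
import itertools
import math
import random


def region_of(context, num_regions):
    """Map a scalar context in [0, 1) to the index of its equal-width region."""
    return min(int(context * num_regions), num_regions - 1)


def run_mapping_mixture(contexts, loss_fn, num_regions, num_arms, eta, rng):
    """Exponentially weighted mixture over all region-to-arm mappings (EXP4-style)."""
    # mappings[i][r] is the arm that mapping i plays whenever the context is in region r.
    mappings = list(itertools.product(range(num_arms), repeat=num_regions))
    log_weights = [0.0] * len(mappings)
    total_loss = 0.0
    for context in contexts:
        r = region_of(context, num_regions)
        # Shift by the maximum before exponentiating for numerical stability.
        top = max(log_weights)
        weights = [math.exp(lw - top) for lw in log_weights]
        norm = sum(weights)
        # An arm's probability is the total weight of the mappings that pick it in region r.
        arm_probs = [0.0] * num_arms
        for mapping, w in zip(mappings, weights):
            arm_probs[mapping[r]] += w / norm
        arm = rng.choices(range(num_arms), weights=arm_probs)[0]
        loss = loss_fn(context, arm)  # bandit feedback: only the played arm's loss is seen
        total_loss += loss
        # Importance-weighted loss estimate; it is zero for every arm that was not played.
        estimate = loss / arm_probs[arm]
        for i, mapping in enumerate(mappings):
            if mapping[r] == arm:
                log_weights[i] -= eta * estimate
    return total_loss, mappings, log_weights


# Toy problem (hypothetical): arm 0 is lossless for contexts below 0.5, arm 1 otherwise.
def toy_loss(context, arm):
    best_arm = 0 if context < 0.5 else 1
    return 0.0 if arm == best_arm else 1.0


rng = random.Random(0)
contexts = [rng.random() for _ in range(2000)]
total_loss, mappings, log_weights = run_mapping_mixture(
    contexts, toy_loss, num_regions=2, num_arms=2, eta=0.05, rng=rng
)
best_index = max(range(len(mappings)), key=lambda i: log_weights[i])
print("cumulative loss:", total_loss)
print("highest-weight mapping:", mappings[best_index])
```

With two regions and two arms there are only four mappings, so enumeration is cheap here; the point of the sketch is only to show how a mixture over mappings turns into a distribution over arms for the current region.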

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Machine Learning and ELM