Complexity Analysis of a Countable-armed Bandit Problem

Anand Kalvit; Assaf Zeevi

arXiv:2301.07243·cs.LG·January 19, 2023

TL;DR

This paper analyzes a stochastic multi-armed bandit problem with a large action space, in which arms come from a population of $K$ arm-types whose distribution is unknown and whose types cannot be observed. It proposes algorithms with rate-optimal instance-dependent regret and highlights key differences from the classical MAB problem.

Contribution

It introduces algorithms for a countable-armed bandit problem with unobservable arm-types and an unknown type distribution, achieving rate-optimal $O(\log n)$ instance-dependent regret and revealing complexity aspects that differ from classical models.

Findings

01

Instance-dependent regret is $O(\log n)$

02

Instance-independent (minimax) regret is $\tilde{O}(\sqrt{n})$ for $K = 2$

03

Performance bounds and algorithm design differ from classical MAB problems

Abstract

We consider a stochastic multi-armed bandit (MAB) problem motivated by "large" action spaces, and endowed with a population of arms containing exactly $K$ arm-types, each characterized by a distinct mean reward. The decision maker is oblivious to the statistical properties of reward distributions as well as the population-level distribution of different arm-types, and is precluded also from observing the type of an arm after play. We study the classical problem of minimizing the expected cumulative regret over a horizon of play $n$, and propose algorithms that achieve a rate-optimal finite-time instance-dependent regret of $O(\log n)$. We also show that the instance-independent (minimax) regret is $\tilde{O}(\sqrt{n})$ when $K = 2$. While the order of regret and complexity of the problem suggests a great degree of similarity to the classical…
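The setup described in the abstract can be sketched as a small simulation. This is only an illustrative environment, not the authors' algorithm: the class name `CountableArmedBandit`, the Bernoulli reward model, and the naive "always pull a fresh arm" baseline are all assumptions introduced here. The baseline never reuses an arm, so its pseudo-regret grows linearly in $n$, which is the behaviour a good policy has to avoid.

```python
import random


class CountableArmedBandit:
    """Hypothetical environment: an unlimited reservoir of arms, each with a hidden type.

    Assumptions (not from the paper): rewards are Bernoulli, and a fresh arm's
    type is drawn i.i.d. from `type_probs`. The decision maker sees only rewards.
    """

    def __init__(self, type_means, type_probs, seed=0):
        assert len(type_means) == len(type_probs)
        self.type_means = list(type_means)   # one distinct mean reward per arm-type
        self.type_probs = list(type_probs)   # population-level type distribution
        self.rng = random.Random(seed)
        self.arm_types = []                  # hidden type of every arm drawn so far

    def new_arm(self):
        """Draw a fresh arm from the population and return its index."""
        k = self.rng.choices(range(len(self.type_means)), weights=self.type_probs)[0]
        self.arm_types.append(k)
        return len(self.arm_types) - 1

    def pull(self, arm):
        """Return a Bernoulli reward; the arm's type is never revealed."""
        mean = self.type_means[self.arm_types[arm]]
        return 1.0 if self.rng.random() < mean else 0.0

    def gap(self, arm):
        """Sub-optimality gap of an arm (used for evaluation only)."""
        return max(self.type_means) - self.type_means[self.arm_types[arm]]


def pseudo_regret_fresh_arm_policy(env, n):
    """Naive baseline: pull a brand-new arm every round and sum the gaps."""
    total = 0.0
    for _ in range(n):
        arm = env.new_arm()
        env.pull(arm)          # reward is observed but ignored by this baseline
        total += env.gap(arm)
    return total


if __name__ == "__main__":
    env = CountableArmedBandit(type_means=[0.9, 0.5], type_probs=[0.3, 0.7], seed=1)
    print(pseudo_regret_fresh_arm_policy(env, 2000))
```

With $K = 2$ types, means 0.9 and 0.5, and a 0.7 probability of the inferior type, this baseline incurs an expected regret of about $0.7 \times 0.4 = 0.28$ per round.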

Taxonomy

Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Optimization and Search Problems