Nearly Tight Bounds for Exploration in Streaming Multi-armed Bandits with Known Optimality Gap

Nikolai Karpov; Chen Wang

arXiv:2502.01067 · cs.LG · February 4, 2025

TL;DR

This paper establishes nearly tight bounds on the number of passes needed for exploration in streaming multi-armed bandits with a known optimality gap, showing that $\Theta(\log n)$ passes are necessary and sufficient up to a $\log \log n$ factor.

Contribution

It provides the first nearly tight bounds on pass complexity for streaming multi-armed bandits with a known optimality gap: a lower bound and a nearly matching algorithm.

Findings

01

Any algorithm with sublinear memory requires at least $\frac{\log n}{\log \log n}$ passes.

02

A nearly matching algorithm achieves the optimal sample complexity with single-arm memory.

03

The results clarify the trade-offs between passes, memory, and sample complexity in streaming bandit exploration.
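
To give a feel for how close the two pass bounds are, here is a minimal Python sketch that evaluates the lower-bound expression from finding 01 and the $\log n$ pass count quoted in the TL;DR. The function names are hypothetical, the natural logarithm is an assumption, and all hidden constants are dropped, so the numbers only indicate growth rates, not actual pass counts.

```python
import math


def pass_lower_bound(n: int) -> float:
    """log n / log log n, with constants dropped (natural log is an assumption)."""
    if n < 3:
        raise ValueError("need n >= 3 so that log log n is positive")
    return math.log(n) / math.log(math.log(n))


def pass_upper_bound(n: int) -> float:
    """log n, the pass count quoted in the TL;DR, with constants dropped."""
    if n < 3:
        raise ValueError("need n >= 3 for consistency with the lower bound")
    return math.log(n)


if __name__ == "__main__":
    for n in (10**3, 10**6, 10**9):
        lo, hi = pass_lower_bound(n), pass_upper_bound(n)
        # The ratio hi / lo equals log log n, which grows very slowly.
        print(f"n={n:>10}  lower~{lo:6.2f}  upper~{hi:6.2f}  ratio~{hi / lo:4.2f}")
```

The ratio between the two expressions is exactly $\log \log n$, which is why the bounds are described as nearly tight rather than tight.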

Abstract

We investigate the sample-memory-pass trade-offs for pure exploration in multi-pass streaming multi-armed bandits (MABs) with *a priori* knowledge of the optimality gap $\Delta_{[2]}$. Here, and throughout, the optimality gap $\Delta_{[i]}$ is defined as the mean reward gap between the best and the $i$-th best arms. A recent line of results by Jin, Huang, Tang, and Xiao [ICML'21] and Assadi and Wang [COLT'24] has shown that if $\Delta_{[2]}$ is not known, a pass complexity of $\Theta(\log(1/\Delta_{[2]}))$ (up to $\log \log(1/\Delta_{[2]})$ terms) is necessary and sufficient to obtain the *worst-case optimal* sample complexity of $O(n/\Delta_{[2]}^{2})$ with a single-arm memory. However, our understanding of multi-pass algorithms with known $\Delta_{[2]}$ is still limited. Here, the key open problem is how many passes are required to achieve the complexity, i.e., $O(…
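
The gap definition above can be made concrete with a small Python sketch. It computes $\Delta_{[i]}$ from a list of arm means and evaluates the worst-case expression $n/\Delta_{[2]}^{2}$ with the constant dropped. The function names and the example means are hypothetical, and in a real bandit problem the means are unknown to the algorithm; this only illustrates the notation.

```python
from typing import List


def optimality_gaps(means: List[float]) -> List[float]:
    """Return [Delta_[2], ..., Delta_[n]]: best mean minus the i-th best mean."""
    if len(means) < 2:
        raise ValueError("need at least two arms")
    ordered = sorted(means, reverse=True)
    best = ordered[0]
    return [best - m for m in ordered[1:]]


def worst_case_sample_expression(means: List[float]) -> float:
    """Evaluate n / Delta_[2]^2 (the O(.) constant is dropped)."""
    delta_2 = optimality_gaps(means)[0]
    if delta_2 <= 0:
        raise ValueError("the best arm must be unique (Delta_[2] > 0)")
    return len(means) / delta_2 ** 2


if __name__ == "__main__":
    # Hypothetical arm means, chosen as exact binary fractions.
    example_means = [0.75, 0.5, 0.25, 0.625]
    print(optimality_gaps(example_means))
    print(worst_case_sample_expression(example_means))
```

A smaller $\Delta_{[2]}$ makes the best arm harder to distinguish from the runner-up, so the expression grows quadratically in $1/\Delta_{[2]}$.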

Taxonomy

Topics: Advanced Bandit Algorithms Research · Distributed Sensor Networks and Detection Algorithms · Smart Grid Energy Management