Optimal Streaming Algorithms for Multi-Armed Bandits
Tianyuan Jin, Keke Huang, Jing Tang, Xiaokui Xiao

TL;DR
This paper develops streaming algorithms for top-k and best-arm identification in multi-armed bandits that achieve optimal sample complexity with minimal memory and few passes over the stream.
Contribution
The paper introduces streaming algorithms that attain optimal sample complexity with minimal memory and passes, extending previous work, which covered only the best-arm case, to general k.
Findings
Achieves optimal sample complexity for top-k identification with single-arm memory.
Achieves near-optimal, instance-dependent pass complexity for best-arm identification.
Extends results to general k with efficient streaming algorithms.
Abstract
This paper studies two variants of the best arm identification (BAI) problem under the streaming model, where we have a stream of $n$ arms with reward distributions supported on $[0,1]$ with unknown means. The arms in the stream arrive one by one, and the algorithm cannot access an arm unless it is stored in a limited-size memory. We first study the streaming $\epsilon$-top-$k$ arms identification problem, which asks for $k$ arms whose reward means are lower than that of the $k$-th best arm by at most $\epsilon$ with probability at least $1-\delta$. For general $\epsilon \in (0,1)$, the existing solution for this problem assumes $k = 1$ and achieves the optimal sample complexity $O(\frac{n}{\epsilon^2} \log \frac{1}{\delta})$ using $O(\log^*(n))$ memory (where $\log^*(n)$ is the number of times the logarithm function must be applied to $n$ before the result is no more than 1) and a single…
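The abstract defines the iterated logarithm only in words, so a small sketch may help make the memory bound concrete. The function name `log_star` and the choice of base 2 are assumptions made here for illustration; the abstract does not state a base, and a different base changes the value by at most a small constant.

```python
import math


def log_star(n: float) -> int:
    """Count how many times log2 must be applied to n before the result is <= 1.

    Base 2 is an assumption for this sketch; the abstract does not fix a base.
    """
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count


if __name__ == "__main__":
    # Print the iterated logarithm for a few stream sizes.
    for n in (1, 2, 16, 65536, 10**9):
        print(n, log_star(n))
```

The function grows extremely slowly: under the base-2 assumption, a stream of a billion arms gives a value of 5, so a memory budget that scales with it amounts to only a handful of stored arms.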
Taxonomy
Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Smart Grid Energy Management
