Tight Memory-Regret Lower Bounds for Streaming Bandits

Shaoang Li; Lan Zhang; Junhao Wang; Xiang-Yang Li

arXiv:2306.07903·cs.LG·June 14, 2023·1 cites

TL;DR

This paper establishes tight worst-case regret lower bounds for streaming bandits with sublinear arm memory, shows a separation from the classical bandit setting, and proposes an algorithm that nearly matches these bounds.

Contribution

It introduces the first tight regret lower bounds for streaming bandits with sublinear memory and provides an algorithm that nearly matches these bounds.

Findings

01

Worst-case lower bound of Ω((TB)^{α} K^{1-α}), with α = 2^{B}/(2^{B+1}-1), for streaming bandits.

02

An additional double logarithmic factor over the classical lower bound is unavoidable for any streaming algorithm with sublinear memory.

03

The first instance-dependent lower bound for streaming bandits, Ω(T^{1/(B+1)} Σ_{Δ_x > 0} μ^{*}/Δ_x).

Abstract

In this paper, we investigate the streaming bandits problem, wherein the learner aims to minimize regret by dealing with online arriving arms and sublinear arm memory. We establish the tight worst-case regret lower bound of $\Omega((TB)^{\alpha} K^{1-\alpha})$, where $\alpha = 2^{B}/(2^{B+1}-1)$, for any algorithm with a time horizon $T$, number of arms $K$, and number of passes $B$. The result reveals a separation between the stochastic bandits problem in the classical centralized setting and the streaming setting with bounded arm memory. Notably, in comparison to the well-known $\Omega(\sqrt{KT})$ lower bound, an additional double logarithmic factor is unavoidable for any streaming bandits algorithm with sublinear memory permitted. Furthermore, we establish the first instance-dependent lower bound of $\Omega(T^{1/(B+1)} \sum_{\Delta_{x} > 0} \frac{\mu^{*}}{\Delta_{x}})$ for…
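
To make the pass-dependent exponent in the worst-case bound concrete, the following minimal sketch evaluates $\alpha = 2^{B}/(2^{B+1}-1)$ exactly for small $B$ and shows how far it sits above $1/2$. The function names `alpha` and `worst_case_rate` are hypothetical helpers introduced here for illustration, not taken from the paper. Hidden constants are ignored, and $B \ge 1$ is assumed.

```python
from fractions import Fraction


def alpha(num_passes: int) -> Fraction:
    """Exponent alpha = 2^B / (2^(B+1) - 1) from the abstract (assumes B >= 1)."""
    if num_passes < 1:
        raise ValueError("num_passes must be at least 1")
    return Fraction(2 ** num_passes, 2 ** (num_passes + 1) - 1)


def worst_case_rate(horizon: float, num_arms: float, num_passes: int) -> float:
    """Evaluate (T * B)^alpha * K^(1 - alpha), ignoring the constant hidden in Omega."""
    a = float(alpha(num_passes))
    return (horizon * num_passes) ** a * num_arms ** (1.0 - a)


if __name__ == "__main__":
    half = Fraction(1, 2)
    for b in range(1, 6):
        a = alpha(b)
        # The gap alpha - 1/2 equals 1 / (2 * (2^(B+1) - 1)), so it roughly halves per pass.
        print(f"B={b}: alpha={a}, alpha - 1/2 = {a - half}")
    # Illustrative values only; T and K here are arbitrary choices.
    print(worst_case_rate(horizon=1e6, num_arms=100, num_passes=2))
```

For $B = 1, 2, 3$ the exponent is $2/3$, $4/7$, and $8/15$. The gap above $1/2$ is $1/(2(2^{B+1}-1))$, so the exponent on $T$ approaches the classical $1/2$ only as the number of passes grows.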

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Stochastic Gradient Optimization Techniques