A Sharp Memory-Regret Trade-Off for Multi-Pass Streaming Bandits

Arpit Agarwal; Sanjeev Khanna; Prathamesh Patil

arXiv:2205.00984·cs.LG·May 3, 2022

TL;DR

This paper investigates the trade-off between memory and regret in multi-pass streaming bandit algorithms, revealing a sharp transition: O(1) memory already achieves the best regret attainable with any o(K) memory, and additional memory yields almost no gain until it reaches Ω(K).

Contribution

It establishes tight upper and lower bounds on regret for any number of passes, uncovering a surprising phase transition in memory efficiency for streaming bandits.

Findings

01

O(1) memory achieves near-optimal regret with multiple passes

02

Increasing memory to any amount that is o(K) offers negligible regret reduction

03

Lower bounds are proved using information-theoretic and round elimination techniques
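
To make the constant-memory, multi-pass setting in the findings above concrete, here is a minimal Python sketch. It is not the authors' algorithm: the champion-versus-challenger rule, the doubling sample schedule, and the names `champion_stream`, `arm_means`, and `base_pulls` are all assumptions chosen for illustration. The point is only that the learner stores a single arm index between arrivals while making several passes over the stream.

```python
import random


def champion_stream(arm_means, passes, base_pulls, seed=0):
    """Toy multi-pass streaming loop that remembers a single arm index.

    arm_means  -- hypothetical Bernoulli means, visited in stream order
                  (this list is the simulated environment, not learner memory)
    passes     -- number of passes over the stream
    base_pulls -- pulls per arm per duel in the first pass (doubles each pass)
    Returns (champion_index, total_pulls, pseudo_regret).
    """
    rng = random.Random(seed)
    best_mean = max(arm_means)
    total_pulls = 0
    pseudo_regret = 0.0

    def sample(i, n):
        """Pull arm i n times and return the number of successes."""
        nonlocal total_pulls, pseudo_regret
        total_pulls += n
        pseudo_regret += n * (best_mean - arm_means[i])
        return sum(rng.random() < arm_means[i] for _ in range(n))

    champion = None  # the only arm kept in memory between arrivals
    n = base_pulls
    for _ in range(passes):
        for i in range(len(arm_means)):  # arms arrive one at a time
            if champion is None:
                champion = i  # adopt the first arm without pulling it
            elif i != champion:
                # Duel on fresh samples; the challenger must win strictly.
                if sample(i, n) > sample(champion, n):
                    champion = i
        n *= 2  # assumed schedule: sample more in later passes
    return champion, total_pulls, pseudo_regret


champion, pulls, regret = champion_stream([0.1, 0.9, 0.2], passes=2, base_pulls=200)
print(champion, pulls, round(regret, 1))
```

Because the stored arm is re-sampled in every duel, the sketch keeps no per-arm statistics at all. The paper's bounds concern how well any such limited-memory scheme can do, not this particular rule.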

Abstract

The stochastic $K$-armed bandit problem has been studied extensively due to its applications in various domains ranging from online advertising to clinical trials. In practice, however, the number of arms can be very large, resulting in large memory requirements for simultaneously processing them. In this paper we consider a streaming setting where the arms are presented in a stream and the algorithm uses limited memory to process these arms. Here, the goal is not only to minimize regret, but also to do so in minimal memory. Previous algorithms for this problem operate in one of two settings: they either use $\Omega(\log \log T)$ passes over the stream (Rathod, 2021; Chaudhuri and Kalyanakrishnan, 2020; Liau et al., 2018), or just a single pass (Maiti et al., 2021). In this paper we study the trade-off between memory and regret when $B$ passes over the stream are allowed, for any $B…

Taxonomy

Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Stochastic Gradient Optimization Techniques