Nearly Optimal Algorithms for Piecewise-Stationary Cascading Bandits

Lingda Wang; Huozhi Zhou; Bingcong Li; Lav R. Varshney; Zhizhen Zhao

arXiv:1909.05886·cs.LG·February 18, 2020·6 cites

TL;DR

This paper introduces nearly optimal algorithms for piecewise-stationary cascading bandits, effectively detecting change points and adapting to evolving user preferences with improved regret bounds and minimal tuning.

Contribution

The paper proposes two new algorithms using a generalized likelihood ratio test for change detection in non-stationary cascading bandits, achieving near-optimal regret bounds and fewer tuning parameters.

Findings

01

Regret upper bounds of order $\mathcal{O}(\sqrt{NLT \log T})$ for both proposed algorithms.

02

Algorithms are nearly optimal, matching the lower bound up to a logarithmic factor.

03

Numerical experiments confirm the effectiveness of the algorithms on real and synthetic data.

Abstract

The cascading bandit (CB) is a popular model for web search and online advertising, in which an agent aims to learn the $K$ most attractive items out of a ground set of size $L$ while interacting with a user. However, the stationary CB model may be too simple for real-world problems, where user preferences can change over time. For piecewise-stationary environments, two efficient algorithms, \texttt{GLRT-CascadeUCB} and \texttt{GLRT-CascadeKL-UCB}, are developed and shown to ensure regret upper bounds on the order of $\mathcal{O}(\sqrt{NLT \log T})$, where $N$ is the number of piecewise-stationary segments and $T$ is the number of time slots. At the crux of the proposed algorithms is an almost parameter-free change-point detector, the generalized likelihood ratio test (GLRT). Compared with existing work, the GLRT-based algorithms: i) are free of change-point-dependent…
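To make the change-point detection idea concrete, the following is a minimal sketch of a generalized likelihood ratio statistic for a stream of Bernoulli (click / no-click) observations. It is an illustration under stated assumptions, not the paper's implementation: the function names `bernoulli_kl`, `glr_statistic`, and `change_detected` are hypothetical, and `threshold` is a user-supplied placeholder rather than the threshold function used by the authors.

```python
import math


def bernoulli_kl(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def glr_statistic(xs):
    """Largest GLR statistic over all split points of a 0/1 sequence.

    For each split s, compare "one mean for all n samples" against
    "one mean before s and another mean after s".
    """
    n = len(xs)
    if n < 2:
        return 0.0
    total = sum(xs)
    mean_all = total / n
    best = 0.0
    prefix = 0.0
    for s in range(1, n):
        prefix += xs[s - 1]
        mean_left = prefix / s
        mean_right = (total - prefix) / (n - s)
        stat = (s * bernoulli_kl(mean_left, mean_all)
                + (n - s) * bernoulli_kl(mean_right, mean_all))
        best = max(best, stat)
    return best


def change_detected(xs, threshold):
    """Flag a change when the statistic exceeds a (hypothetical) fixed threshold."""
    return glr_statistic(xs) > threshold
```

In a bandit loop, a detector of this kind would typically be run on each item's observation history, and a detection would trigger a restart of the learning statistics. How the threshold is set and when the test is run are design choices this sketch leaves open.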

Taxonomy

Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Machine Learning and Algorithms