Safety Aware Changepoint Detection for Piecewise i.i.d. Bandits

Subhojyoti Mukherjee

arXiv:2205.13689·cs.LG·May 30, 2022

TL;DR

This paper introduces safety-aware algorithms for piecewise i.i.d. bandits that detect changepoints, satisfy safety constraints, and have regret bounds comparable to existing methods, with proven lower bounds and empirical validation.

Contribution

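The paper develops two actively adaptive, safety-aware algorithms for piecewise i.i.d. bandits that detect changepoints and restart without knowing the number of changepoints or their locations. It provides regret bounds for both algorithms and the first matching lower bounds for this setting.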

Findings

01

Algorithms satisfy safety constraints while detecting changepoints.

02

Regret bounds are comparable to those of existing algorithms from the safe bandit and piecewise i.i.d. bandit literature.

03

Empirical results show competitive performance with state-of-the-art methods.

Abstract

In this paper, we consider the setting of piecewise i.i.d. bandits under a safety constraint. In this piecewise i.i.d. setting, there exists a finite number of changepoints where the means of some or all arms change simultaneously. We introduce the safety constraint studied in Wu et al. (2016) to this setting such that at any round the cumulative reward is above a constant factor of the default action reward. We propose two actively adaptive algorithms for this setting that satisfy the safety constraint, detect changepoints, and restart without the knowledge of the number of changepoints or their locations. We provide regret bounds for our algorithms and show that the bounds are comparable to their counterparts from the safe bandit and piecewise i.i.d. bandit literature. We also provide the first matching lower bounds for this setting. Empirically, we show that our safety-aware…
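
The two ingredients named in the abstract, a piecewise i.i.d. environment and a per-round safety constraint, can be illustrated with a minimal sketch. This is not the paper's algorithm. The helper names `piecewise_means` and `satisfies_safety` are hypothetical, the "constant factor" is assumed to take the form (1 - alpha) as is common in the conservative bandit literature, and the check is applied to realized rewards for simplicity.

```python
def piecewise_means(t, changepoints, segment_means):
    """Return the arm means in force at round t (0-indexed).

    changepoints: sorted rounds at which the means switch (assumed input).
    segment_means: one list of arm means per segment, so it must have
                   len(changepoints) + 1 entries.
    """
    # Count how many changepoints have already occurred by round t.
    segment = sum(1 for c in changepoints if t >= c)
    return segment_means[segment]


def satisfies_safety(rewards, default_mean, alpha):
    """Check an assumed form of the safety constraint at every round.

    After t rounds the cumulative reward must be at least
    (1 - alpha) * t * default_mean, where default_mean is the mean
    reward of the default action and alpha is in (0, 1).
    """
    total = 0.0
    for t, r in enumerate(rewards, start=1):
        total += r
        if total < (1.0 - alpha) * t * default_mean:
            return False  # constraint violated at round t
    return True


# Two arms, one changepoint at round 5 where the arm means swap order.
means = [[0.1, 0.9], [0.8, 0.2]]
before = piecewise_means(0, [5], means)
after = piecewise_means(5, [5], means)

# Matching the default mean at every round keeps the learner safe;
# an early zero reward breaks the constraint at round 1.
safe = satisfies_safety([0.5, 0.5, 0.5], default_mean=0.5, alpha=0.1)
unsafe = satisfies_safety([0.0, 1.0], default_mean=0.5, alpha=0.1)
```

The second case shows why the constraint must hold at every round rather than only at the horizon: the sequence `[0.0, 1.0]` recovers its cumulative reward by round 2 but has already violated the bound at round 1.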

Taxonomy

Topics: Advanced Bandit Algorithms Research · Healthcare Operations and Scheduling Optimization · Auction Theory and Applications