Safe Linear Leveling Bandits

Ilker Demirel; Mehmet Ufuk Ozdemir; Cem Tekin

arXiv:2112.06728·cs.LG·December 14, 2021

TL;DR

This paper introduces SALE-LTS, a new safe linear bandit algorithm that maintains outcomes near a target level under safety constraints, with proven sublinear regret and promising empirical results.

Contribution

The paper proposes SALE-LTS, an algorithm built around a new acquisition strategy for leveling in safe linear bandits: instead of maximizing reward, it keeps the actions' outcomes close to a target level while respecting a two-sided safety constraint.

Findings

01. Achieves sublinear regret comparable to classical bandit algorithms.

02. Demonstrates effective empirical performance in safety-constrained scenarios.

03. Introduces a new approach for safety in linear stochastic bandits.

Abstract

Multi-armed bandits (MAB) are extensively studied in various settings where the objective is to maximize the actions' outcomes (i.e., rewards) over time. Since safety is crucial in many real-world problems, safe versions of MAB algorithms have also garnered considerable interest. In this work, we tackle a different critical task through the lens of linear stochastic bandits, where the aim is to keep the actions' outcomes close to a target level while respecting a two-sided safety constraint, which we call leveling. Such a task is prevalent in numerous domains. Many healthcare problems, for instance, require keeping a physiological variable in a range and preferably close to a target level. The radical change in our objective necessitates a new acquisition strategy, which is at the heart of a MAB algorithm. We propose SALE-LTS: Safe Leveling via Linear…
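To make the leveling objective concrete, here is a minimal Python sketch of how it differs from reward maximization. It is an illustration only and not the paper's SALE-LTS algorithm: the function names, the interval form of the two-sided safety constraint, and the numbers are all assumptions, and the parameter vector `theta` is treated as known, whereas a bandit learner would have to estimate it from noisy observations.

```python
# Hypothetical illustration of a leveling objective; NOT the paper's SALE-LTS algorithm.
# All names and values below are assumptions made for this sketch.

def outcome(theta, action):
    """Expected outcome in a linear bandit: inner product of parameter and action."""
    return sum(t * a for t, a in zip(theta, action))

def leveling_loss(theta, action, target):
    """Distance of the expected outcome from the target level (lower is better)."""
    return abs(outcome(theta, action) - target)

def is_safe(theta, action, low, high):
    """Assumed two-sided safety check: expected outcome must lie in [low, high]."""
    return low <= outcome(theta, action) <= high

def best_safe_action(theta, actions, target, low, high):
    """Among safe actions, return the one whose expected outcome is closest to target."""
    safe = [a for a in actions if is_safe(theta, a, low, high)]
    if not safe:
        return None  # no action satisfies the safety constraint
    return min(safe, key=lambda a: leveling_loss(theta, a, target))

theta = [1.0, 2.0]  # assumed known here; unknown to a real learner
actions = [[0.0, 1.0], [1.0, 1.0], [2.0, 2.0], [3.0, 0.5]]
target, low, high = 4.0, 2.5, 4.5

chosen = best_safe_action(theta, actions, target, low, high)
print(chosen, outcome(theta, chosen))
```

In this toy setup a reward maximizer would pick `[2.0, 2.0]`, which has the largest expected outcome but violates the upper safety bound, while the leveling criterion picks the safe action whose outcome sits on the target.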

Taxonomy

Topics: Advanced Bandit Algorithms Research · Decision-Making and Behavioral Economics · Smart Grid Energy Management