Safe Linear Leveling Bandits
Ilker Demirel, Mehmet Ufuk Ozdemir, Cem Tekin

TL;DR
This paper introduces SALE-LTS, a new safe linear bandit algorithm that maintains outcomes near a target level under safety constraints, with proven sublinear regret and promising empirical results.
Contribution
The paper proposes SALE-LTS, a novel acquisition strategy for safe linear bandits focused on leveling, extending traditional reward maximization to safety-critical target level maintenance.
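To make the difference between leveling and reward maximization concrete, here is a minimal sketch of the objective only, not of SALE-LTS itself. It assumes a known parameter vector, whereas a real learner would have to estimate it. The names `expected_outcome`, `leveling_gap`, and `best_safe_action`, and all numeric values, are hypothetical and chosen for illustration.

```python
def expected_outcome(action, theta):
    """Expected outcome of an action under a linear model: <action, theta>."""
    return sum(a * t for a, t in zip(action, theta))


def leveling_gap(action, theta, target):
    """Absolute distance between the expected outcome and the target level."""
    return abs(expected_outcome(action, theta) - target)


def best_safe_action(actions, theta, target, low, high):
    """Among actions whose expected outcome lies in [low, high],
    return the one whose expected outcome is closest to the target."""
    safe = [a for a in actions if low <= expected_outcome(a, theta) <= high]
    if not safe:
        raise ValueError("no action satisfies the two-sided safety constraint")
    return min(safe, key=lambda a: leveling_gap(a, theta, target))


# Hypothetical problem instance (assumed values, not from the paper).
theta = [1.0, 2.0]                      # true parameter, unknown in practice
actions = [(0.0, 1.0), (1.0, 1.0), (2.0, 2.0), (1.0, 0.5)]
target, low, high = 2.8, 1.0, 4.0       # target level and safe outcome range

leveling_choice = best_safe_action(actions, theta, target, low, high)
maximizing_choice = max(actions, key=lambda a: expected_outcome(a, theta))

print("leveling picks:", leveling_choice)
print("reward maximization picks:", maximizing_choice)
```

In this toy instance the reward-maximizing action has an expected outcome outside the safe range, while the leveling choice stays inside it and sits closest to the target.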
Findings
Achieves sublinear regret comparable to classical bandit algorithms.
Demonstrates effective empirical performance in safety-constrained scenarios.
Introduces a new approach for safety in linear stochastic bandits.
Abstract
Multi-armed bandits (MAB) are extensively studied in various settings where the objective is to maximize the actions' outcomes (i.e., rewards) over time. Since safety is crucial in many real-world problems, safe versions of MAB algorithms have also garnered considerable interest. In this work, we tackle a different critical task through the lens of linear stochastic bandits, where the aim is to keep the actions' outcomes close to a target level while respecting a two-sided safety constraint, which we call leveling. Such a task is prevalent in numerous domains. Many healthcare problems, for instance, require keeping a physiological variable in a range and preferably close to a target level. The radical change in our objective necessitates a new acquisition strategy, which is at the heart of a MAB algorithm. We propose SALE-LTS: Safe Leveling via Linear…
Taxonomy
Topics: Advanced Bandit Algorithms Research · Decision-Making and Behavioral Economics · Smart Grid Energy Management
