Directional Optimism for Safe Linear Bandits

Spencer Hutchinson; Berkay Turan; Mahnoosh Alizadeh

arXiv:2308.15006 · cs.LG · March 13, 2024

TL;DR

This paper introduces a new directional optimism approach for safe linear bandits, leading to improved regret guarantees and empirical performance, and extends the setting to convex constraints with a novel analysis method.

Contribution

The paper proposes a directional optimism technique and an algorithm with better empirical performance than existing ones, and it extends the framework to convex constraints using convex analysis.

Findings

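01 Improved regret guarantees for well-separated problem instances and for action sets that are finite star convex sets.

02 A new algorithm that improves on existing algorithms empirically while matching their regret guarantees.

03 Extension to convex constraints, analyzed with a new convex-analysis based approach.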

Abstract

The safe linear bandit problem is a version of the classical stochastic linear bandit problem where the learner's actions must satisfy an uncertain constraint at all rounds. Due to its applicability to many real-world settings, this problem has received considerable attention in recent years. By leveraging a novel approach that we call directional optimism, we find that it is possible to achieve improved regret guarantees for both well-separated problem instances and action sets that are finite star convex sets. Furthermore, we propose a novel algorithm for this setting that improves on existing algorithms in terms of empirical performance, while enjoying matching regret guarantees. Lastly, we introduce a generalization of the safe linear bandit setting where the constraints are convex and adapt our algorithms and analyses to this setting by leveraging a novel convex-analysis based…
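To make the setting in the abstract concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm. It assumes a single linear constraint of the form a·x ≤ b with b ≥ 0, Gaussian noise, and noisy feedback on both the reward and the constraint value; none of these details are stated in the abstract. The names `SafeLinearBanditEnv`, `largest_safe_scaling`, `theta`, `a`, and `b` are hypothetical. The helper uses the true constraint vector purely for illustration, whereas a real learner would only have an estimate of it. It shows why action sets that are star convex about a safe point (here the origin) are convenient: any action can be scaled toward that point until it is safe.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


class SafeLinearBanditEnv:
    """Toy safe linear bandit (illustrative assumptions, not the paper's exact model).

    The expected reward is theta . x, and an action x is safe when a . x <= b.
    The learner is assumed not to know theta or a.
    """

    def __init__(self, theta, a, b, noise_std=0.1, seed=0):
        self.theta = theta          # unknown reward vector
        self.a = a                  # unknown constraint vector
        self.b = b                  # constraint limit (assumed known)
        self.noise_std = noise_std
        self.rng = random.Random(seed)

    def is_safe(self, x):
        """Ground-truth safety check, for evaluation only."""
        return dot(self.a, x) <= self.b

    def step(self, x):
        """Play action x; return noisy reward and noisy constraint observation."""
        reward = dot(self.theta, x) + self.rng.gauss(0.0, self.noise_std)
        constraint_obs = dot(self.a, x) + self.rng.gauss(0.0, self.noise_std)
        return reward, constraint_obs


def largest_safe_scaling(a, b, x):
    """Largest gamma in [0, 1] such that a . (gamma * x) <= b, assuming b >= 0.

    If the action set is star convex about the origin, gamma * x is still a
    valid action, so scaling never leaves the set.
    """
    if b < 0:
        raise ValueError("this sketch assumes the origin is safe, i.e. b >= 0")
    value = dot(a, x)
    if value <= b:
        return 1.0        # x is already safe, no scaling needed
    return b / value      # here value > b >= 0, so the ratio lies in [0, 1)
```

For example, with `a = [1.0, 1.0]` and `b = 1.0`, the action `[2.0, 2.0]` violates the constraint, and scaling it by the returned factor moves it onto the constraint boundary. In practice the true `a` would be replaced by a pessimistic estimate built from the noisy constraint observations.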

Code & Models

Repositories

shutch1/directional-optimism-for-safe-linear-bandits (Official)

Taxonomy

Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms