Risk-aware linear bandits with convex loss
Patrick Saux (Inria Scool, CRIStAL, Univ. Lille), Odalric-Ambrym Maillard (Inria Scool, CRIStAL, Univ. Lille)

TL;DR
This paper introduces a new framework for risk-aware contextual bandits using convex loss functions, deriving confidence sequences and proposing algorithms with regret guarantees for critical applications.
Contribution
It extends risk-aware bandit algorithms to contextual settings with convex loss functions, providing confidence sequences and practical algorithms with theoretical guarantees.
Findings
Proposed a convex loss-based risk measure estimation method.
Developed an optimistic UCB algorithm with regret guarantees.
Validated algorithms through numerical experiments.
Abstract
In decision-making problems such as the multi-armed bandit, an agent learns sequentially by optimizing a certain feedback. While the mean reward criterion has been extensively studied, other measures that reflect an aversion to adverse outcomes, such as mean-variance or conditional value-at-risk (CVaR), can be of interest for critical applications (healthcare, agriculture). Algorithms have been proposed for such risk-aware measures under bandit feedback without contextual information. In this work, we study contextual bandits where such risk measures can be elicited as linear functions of the contexts through the minimization of a convex loss. A typical example that fits within this framework is the expectile measure, which is obtained as the solution of an asymmetric least-square problem. Using the method of mixtures for supermartingales, we derive confidence sequences for the…
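To make the expectile example in the abstract concrete, here is a minimal sketch of the asymmetric least-squares problem in its simplest, context-free form: the tau-expectile of a sample is the value e that minimizes the sum of |tau - 1{y_i <= e}| (y_i - e)^2. The function names, the fixed-point solver, and the tolerance are illustrative assumptions, not the paper's estimator, which handles linear functions of the contexts.

```python
def asymmetric_loss(residual, tau):
    """Asymmetric squared loss: weight tau on positive residuals, 1 - tau otherwise."""
    weight = tau if residual > 0 else 1.0 - tau
    return weight * residual ** 2


def sample_expectile(values, tau, tol=1e-12, max_iter=1000):
    """Minimize sum_i asymmetric_loss(y_i - e, tau) over e.

    Hypothetical helper: iterates the first-order condition, which says the
    minimizer is a weighted mean with weights tau above e and 1 - tau below.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    # The plain mean is the tau = 0.5 solution and serves as a starting point.
    estimate = sum(values) / len(values)
    for _ in range(max_iter):
        weights = [tau if y > estimate else 1.0 - tau for y in values]
        updated = sum(w * y for w, y in zip(weights, values)) / sum(weights)
        if abs(updated - estimate) <= tol:
            return updated
        estimate = updated
    return estimate


if __name__ == "__main__":
    rewards = [1.0, 2.0, 3.0, 4.0]
    for level in (0.1, 0.5, 0.9):
        print(level, sample_expectile(rewards, level))
```

With tau = 0.5 the loss is symmetric and the expectile reduces to the mean; values of tau below 0.5 penalize overestimation more heavily, which pulls the estimate toward the lower outcomes and is what makes the measure useful for risk aversion.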
Taxonomy
Topics: Advanced Bandit Algorithms Research · Sparse and Compressive Sensing Techniques · Smart Grid Energy Management
