A Survey of Risk-Aware Multi-Armed Bandits
Vincent Y. F. Tan, Prashanth L.A., Krishna Jagannathan

TL;DR
This survey reviews risk-aware multi-armed bandit algorithms, discussing risk measures, concentration inequalities, and problem settings, highlighting challenges and future directions in risk-sensitive decision-making.
Contribution
It consolidates existing research on risk measures in multi-armed bandits, analyzing algorithms for regret minimization and best-arm identification under risk considerations.
Findings
Overview of various risk measures and their properties
Analysis of concentration inequalities for risk measures
Discussion of challenges and future research directions
Abstract
In several applications, such as clinical trials and financial portfolio optimization, the expected value (or the average reward) does not satisfactorily capture the merits of a drug or a portfolio. In such applications, risk plays a crucial role, and a risk-aware performance measure is preferable, so as to capture losses in the case of adverse events. This survey aims to consolidate and summarise the existing research on risk measures, specifically in the context of multi-armed bandits. We review various risk measures of interest and comment on their properties. Next, we review existing concentration inequalities for various risk measures. Then, we proceed to define risk-aware bandit problems. We consider algorithms for the regret minimization setting, where the exploration-exploitation trade-off manifests, as well as the best-arm identification setting, which is a pure exploration…
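To make the abstract's point about expected value concrete, here is a minimal sketch that compares two arms by their sample mean and by an empirical Conditional Value-at-Risk (CVaR). The choice of CVaR is an assumption for illustration, since the abstract does not name specific risk measures. The function names, the reward samples, and the level `alpha = 0.2` are likewise hypothetical and not taken from the survey.

```python
import math


def empirical_mean(rewards):
    """Sample mean of the observed rewards."""
    return sum(rewards) / len(rewards)


def empirical_cvar(rewards, alpha):
    """Average of the worst ceil(alpha * n) observed rewards.

    This is one simple lower-tail CVaR estimate for rewards
    (higher is better); other estimators exist.
    """
    if not rewards:
        raise ValueError("rewards must be non-empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    ordered = sorted(rewards)  # worst outcomes first
    k = max(1, math.ceil(alpha * len(ordered)))
    return sum(ordered[:k]) / k


# Hypothetical reward samples for two arms (illustration only).
arm_a = [1.0] * 10           # steady, modest reward
arm_b = [2.0] * 9 + [-5.0]   # higher average, rare large loss

alpha = 0.2  # assumed risk level
best_by_mean = max(("A", arm_a), ("B", arm_b), key=lambda p: empirical_mean(p[1]))[0]
best_by_cvar = max(("A", arm_a), ("B", arm_b), key=lambda p: empirical_cvar(p[1], alpha))[0]
print("best arm by mean:", best_by_mean)
print("best arm by CVaR:", best_by_cvar)
```

On these samples the two criteria disagree: arm B has the higher sample mean, while arm A has the higher lower-tail average, which is the kind of situation where a risk-aware criterion changes which arm counts as best.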
Taxonomy
Topics: Advanced Bandit Algorithms Research · Risk and Portfolio Optimization · Reservoir Engineering and Simulation Methods
