Risk-Aware Decision Making in Restless Bandits: Theory and Algorithms for Planning and Learning
Nima Akbarzadeh, Yossiri Adulyasak, Erick Delage

TL;DR
This paper extends restless bandits to risk-aware objectives. It establishes indexability conditions, gives a Whittle index solution for planning, and proposes a Thompson sampling approach for learning, with numerical experiments showing risk reduction in applications.
Contribution
It introduces risk-aware objectives into restless bandits, establishes indexability conditions, and develops planning and learning algorithms with theoretical guarantees.
Findings
Proposes a risk-aware Whittle index for restless bandits.
Develops a Thompson sampling algorithm with sublinear regret.
Shows effective risk mitigation in numerical experiments on applications.
Abstract
In restless bandits, a central agent is tasked with optimally distributing limited resources across several bandits (arms), with each arm being a Markov decision process. In this work, we generalize the traditional restless bandits problem with a risk-neutral objective by incorporating risk-awareness, which is particularly important in various real-world applications, especially when the decision maker seeks to mitigate downside risks. We establish indexability conditions for the case of a risk-aware objective and, for the first time, provide a solution based on the Whittle index for the planning problem with finite-horizon non-stationary and infinite-horizon stationary Markov decision processes. In addition, we address the learning problem when the true transition probabilities are unknown by proposing a Thompson sampling approach and show that it achieves bounded regret that scales…
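The two ingredients named in the abstract, an index policy for allocating a limited budget and Thompson sampling over unknown transition probabilities, can be illustrated with a minimal sketch. It rests on assumptions that do not come from the paper: a symmetric Dirichlet prior over each row of an arm's transition matrix, index values that are already computed, and hypothetical function names (sample_transition_row, sample_transition_matrix, select_arms). The risk-aware index computation itself is not shown.

```python
import random


def sample_transition_row(counts, rng, prior=1.0):
    """Draw one transition-probability row from a Dirichlet posterior.

    counts[j] is the number of observed transitions into state j.
    The symmetric prior is an assumption made for this sketch.
    """
    # A Dirichlet sample is a vector of independent Gamma draws, normalized.
    draws = [rng.gammavariate(c + prior, 1.0) for c in counts]
    total = sum(draws)
    return [d / total for d in draws]


def sample_transition_matrix(count_matrix, rng, prior=1.0):
    """Sample every row of one arm's transition matrix independently."""
    return [sample_transition_row(row, rng, prior) for row in count_matrix]


def select_arms(indices, budget):
    """Activate the `budget` arms with the largest index values.

    Ties are broken in favor of the lower arm id.
    """
    ranked = sorted(range(len(indices)), key=lambda i: (-indices[i], i))
    return sorted(ranked[:budget])


rng = random.Random(0)

# Hypothetical transition counts for one two-state arm under one action.
counts = [[8, 2], [1, 9]]
sampled_P = sample_transition_matrix(counts, rng)

# Hypothetical index values for four arms in their current states.
active = select_arms([0.3, 1.2, -0.1, 0.7], budget=2)
print(sampled_P)
print(active)
```

In a Thompson sampling loop of this kind, a model would be resampled at the start of each episode, indices recomputed under the sampled model, and the counts updated from the transitions observed while following the resulting index policy.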
Taxonomy
Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Reinforcement Learning in Robotics
Methods: Sparse Evolutionary Training
