Statistically Robust, Risk-Averse Best Arm Identification in Multi-Armed Bandits

Anmol Kagrecha, Jayakrishnan Nair, and Krishna Jagannathan

arXiv:2008.13629·cs.LG·March 29, 2022

Open Access

TL;DR

This paper introduces statistically robust, risk-aware algorithms for best arm identification in multi-armed bandits that perform well under mild assumptions and are resistant to distributional misspecification.

Contribution

It establishes fundamental performance limits and proposes near-optimal algorithms for robust, risk-aware best arm identification under mild distributional assumptions.

Findings

01. Established fundamental performance limits for statistically robust algorithms in the fixed-budget pure exploration setting.

02. Proposed two classes of asymptotically near-optimal algorithms.

03. Unified framework for light-tailed and heavy-tailed distributions.

Abstract

Traditional multi-armed bandit (MAB) formulations usually make certain assumptions about the underlying arms' distributions, such as bounds on the support or their tail behaviour. Moreover, such parametric information is usually 'baked' into the algorithms. In this paper, we show that specialized algorithms that exploit such parametric information are prone to inconsistent learning performance when the parameter is misspecified. Our key contributions are twofold: (i) We establish fundamental performance limits of statistically robust MAB algorithms under the fixed-budget pure exploration setting, and (ii) We propose two classes of algorithms that are asymptotically near-optimal. Additionally, we consider a risk-aware criterion for best arm identification, where the objective associated with each arm is a linear combination of the mean and the conditional value at risk (CVaR).…
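
To make the risk-aware objective in the abstract concrete, the sketch below scores an arm by a weighted sum of its empirical mean and empirical CVaR, then picks an arm under a fixed sampling budget. Everything named here is an assumption for illustration: the weights `w_mean` and `w_cvar`, the confidence level `alpha`, the convention that samples are losses (so CVaR averages the largest values and a lower score is better), and the simple order-statistic CVaR estimator. The uniform-allocation routine is a plain baseline, not one of the paper's two classes of robust algorithms, which this summary does not describe.

```python
import math
import random


def empirical_cvar(samples, alpha):
    """Mean of the worst ceil((1 - alpha) * n) samples, treating samples as losses."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    ordered = sorted(samples, reverse=True)  # largest losses first
    k = max(1, math.ceil((1.0 - alpha) * len(ordered)))
    return sum(ordered[:k]) / k


def mean_cvar_objective(samples, alpha, w_mean, w_cvar):
    """Linear combination of the empirical mean and the empirical CVaR."""
    mean = sum(samples) / len(samples)
    return w_mean * mean + w_cvar * empirical_cvar(samples, alpha)


def uniform_exploration(arms, budget, alpha, w_mean, w_cvar):
    """Baseline (assumed, not from the paper): split the budget evenly
    across arms and return the index of the arm with the lowest score."""
    pulls_per_arm = budget // len(arms)
    if pulls_per_arm < 1:
        raise ValueError("budget must allow at least one pull per arm")
    scores = []
    for pull in arms:
        samples = [pull() for _ in range(pulls_per_arm)]
        scores.append(mean_cvar_objective(samples, alpha, w_mean, w_cvar))
    return min(range(len(arms)), key=scores.__getitem__)


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical arms: one light-tailed (Gaussian), one heavy-tailed (Pareto).
    arms = [
        lambda: random.gauss(1.0, 1.0),
        lambda: random.paretovariate(2.5),
    ]
    best = uniform_exploration(arms, budget=2000, alpha=0.95, w_mean=1.0, w_cvar=1.0)
    print("selected arm index:", best)
```

Setting `w_cvar` to zero reduces the score to the ordinary mean, while setting `w_mean` to zero gives a pure CVaR criterion. Note that this baseline uses no knowledge of support bounds or tail parameters, which is the kind of parametric information the abstract says specialized algorithms rely on.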

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Reinforcement Learning in Robotics