Extreme Bandits using Robust Statistics
Sujay Bhatt, Ping Li, Gennady Samorodnitsky

TL;DR
This paper introduces distribution-free algorithms for extreme bandit problems using robust statistics. The algorithms achieve vanishing extremal regret under weaker conditions than existing algorithms and show superior finite-sample performance in numerical experiments.
Contribution
It presents novel distribution-free algorithms for extreme bandits based on robust statistics, with theoretical guarantees and improved empirical results.
Findings
The proposed algorithms achieve vanishing extremal regret under weaker conditions than existing algorithms.
Numerical experiments show superior performance over existing algorithms.
The methods are applicable in finite-sample settings.
Abstract
We consider a multi-armed bandit problem motivated by situations where only the extreme values, as opposed to expected values in the classical bandit setting, are of interest. We propose distribution-free algorithms using robust statistics and characterize their statistical properties. We show that the provided algorithms achieve vanishing extremal regret under weaker conditions than existing algorithms. Performance of the algorithms is demonstrated for the finite-sample setting using numerical experiments. The results show superior performance of the proposed algorithms compared to well-known algorithms.
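To make the notion of extremal regret concrete: in the extreme-bandits literature it is commonly defined as the gap between the expected maximum reward that the best single arm would yield over n rounds and the expected maximum reward the policy actually collects. The abstract does not spell out this definition or the proposed algorithms, so the following is only a minimal sketch under stated assumptions. The Pareto arms, the epsilon-greedy rule on per-arm maxima, and all function names are hypothetical illustrations, not the authors' robust-statistics method.

```python
import random


def pareto_sample(rng, alpha):
    """Draw from a Pareto distribution on [1, inf) with tail index alpha
    (inverse-transform sampling). Assumed reward model, not from the paper."""
    return (1.0 - rng.random()) ** (-1.0 / alpha)


def run_policy(alphas, horizon, rng, explore=0.1):
    """Hypothetical baseline: epsilon-greedy on the largest reward seen per arm.
    Returns the maximum reward collected over the horizon."""
    best_seen = [0.0] * len(alphas)
    overall_max = 0.0
    for t in range(horizon):
        if t < len(alphas):
            arm = t  # pull every arm once first
        elif rng.random() < explore:
            arm = rng.randrange(len(alphas))  # explore uniformly
        else:
            arm = max(range(len(alphas)), key=lambda k: best_seen[k])
        x = pareto_sample(rng, alphas[arm])
        best_seen[arm] = max(best_seen[arm], x)
        overall_max = max(overall_max, x)
    return overall_max


def single_arm_max(alpha, horizon, rng):
    """Maximum reward from pulling one arm for the whole horizon."""
    return max(pareto_sample(rng, alpha) for _ in range(horizon))


def estimate_extremal_regret(alphas, horizon, runs, seed=0):
    """Monte Carlo estimate of: (best single-arm expected maximum)
    minus (expected maximum collected by the policy)."""
    rng = random.Random(seed)
    oracle = max(
        sum(single_arm_max(a, horizon, rng) for _ in range(runs)) / runs
        for a in alphas
    )
    policy = sum(run_policy(alphas, horizon, rng) for _ in range(runs)) / runs
    return oracle - policy
```

Calling `estimate_extremal_regret([1.5, 3.0], horizon=200, runs=500)` gives a noisy estimate for two arms, where the smaller tail index means the heavier tail. Because maxima of heavy-tailed rewards have high variance, individual estimates can fluctuate widely and may even be negative for small `runs`.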
Taxonomy
Topics: Advanced Bandit Algorithms Research · Risk and Portfolio Optimization · Machine Learning and Algorithms
