Extreme bandits

Alexandra Carpentier; Michal Valko

arXiv:2604.24545·stat.ML·April 28, 2026

TL;DR

This paper introduces the ExtremeHunter algorithm for sequentially allocating limited resources across sources in order to detect extreme values. The objective is to minimize extreme regret rather than the usual regret with respect to the maximum mean reward.

Contribution

The work presents a novel algorithm for extreme value detection in resource-limited settings and analyzes its theoretical properties and empirical performance.

Findings

01. ExtremeHunter effectively detects the sources with the heaviest tails.

02. The algorithm outperforms baseline methods in synthetic experiments.

03. Real-world tests confirm its practical utility.

Abstract

In many areas of medicine, security, and life sciences, we want to allocate limited resources to different sources in order to detect extreme values. In this paper, we study an efficient way to allocate these resources sequentially under limited feedback. While sequential design of experiments is well studied in bandit theory, the most commonly optimized property is the regret with respect to the maximum mean reward. However, in other problems such as network intrusion detection, we are interested in detecting the most extreme value output by the sources. Therefore, in our work we study extreme regret which measures the efficiency of an algorithm compared to the oracle policy selecting the source with the heaviest tail. We propose the ExtremeHunter algorithm, provide its analysis, and evaluate it empirically on synthetic and real-world experiments.
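To make the notion of extreme regret concrete, here is a minimal simulation sketch. It is not the ExtremeHunter algorithm. It assumes one natural reading of the abstract: extreme regret is the gap between the expected largest value collected by the oracle policy (which always samples the source with the heaviest tail) and the expected largest value collected by the policy under study. The Pareto sources, the tail indices in `alphas`, the budget `n`, the round-robin baseline, and the helper `expected_max` are all hypothetical choices made for illustration.

```python
import random


def expected_max(alphas, pulls, runs, rng):
    """Monte Carlo estimate of the expected largest sample
    when source k is sampled pulls[k] times."""
    total = 0.0
    for _run in range(runs):
        best = 0.0
        for alpha, count in zip(alphas, pulls):
            for _pull in range(count):
                # Pareto sample with scale 1; smaller alpha = heavier tail.
                best = max(best, rng.paretovariate(alpha))
        total += best
    return total / runs


rng = random.Random(0)            # fixed seed for reproducibility
alphas = [2.5, 5.0, 10.0]         # hypothetical tail indices, one per source
n = 300                           # hypothetical sampling budget
runs = 1000                       # Monte Carlo repetitions

# Oracle policy: spend the whole budget on the heaviest-tailed source.
heaviest = min(range(len(alphas)), key=lambda k: alphas[k])
oracle_pulls = [n if k == heaviest else 0 for k in range(len(alphas))]

# Baseline policy (assumption): split the budget evenly across sources.
uniform_pulls = [n // len(alphas)] * len(alphas)

oracle_value = expected_max(alphas, oracle_pulls, runs, rng)
uniform_value = expected_max(alphas, uniform_pulls, runs, rng)
extreme_regret = oracle_value - uniform_value

print(f"oracle expected max:  {oracle_value:.2f}")
print(f"uniform expected max: {uniform_value:.2f}")
print(f"estimated extreme regret of uniform allocation: {extreme_regret:.2f}")
```

Under these assumptions the even split wastes two thirds of the budget on lighter-tailed sources, so its expected maximum falls short of the oracle's. A learning algorithm would aim to shrink that gap by identifying the heaviest-tailed source from the samples it observes.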

Code & Models

Datasets

misovalko/my-research-papers
dataset · 103 dl