Data-Driven Threshold Machine: Scan Statistics, Change-Point Detection, and Extreme Bandits
Shuang Li, Yao Xie, and Le Song

TL;DR
The paper introduces the data-driven threshold machine (DTM), a distribution-free method that chooses a threshold so that the tail probability of the maximum of a possibly dependent, stationary sequence stays below a pre-specified level. It targets tasks such as scan statistics, change-point detection, and extreme bandits, with an emphasis on computational efficiency and robustness.
Contribution
It proposes a novel, distribution-free method for threshold selection that accounts for data dependence using extremal index estimation, requiring only one sample path.
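The summary does not say which extremal index estimator the paper uses. As an illustration only, the sketch below uses the standard runs estimator, a swapped-in textbook choice rather than the authors' method. The function name `runs_extremal_index` and its parameters `u` (exceedance level) and `run_length` (declustering gap) are hypothetical. The estimate is the number of exceedance clusters divided by the number of exceedances: values near 1 suggest nearly independent extremes, smaller values suggest clustering.

```python
import numpy as np

def runs_extremal_index(x, u, run_length):
    """Runs estimator of the extremal index from a single sample path.

    Assumption: this is a standard textbook estimator used for
    illustration, not necessarily the one used in the paper.
    """
    x = np.asarray(x, dtype=float)
    # Positions where the sequence exceeds the level u.
    exceed_idx = np.flatnonzero(x > u)
    if exceed_idx.size == 0:
        raise ValueError("no exceedances above u")
    # Two consecutive exceedances belong to different clusters when
    # their positions differ by more than run_length.
    gaps = np.diff(exceed_idx)
    n_clusters = 1 + int(np.sum(gaps > run_length))
    # Ratio of clusters to exceedances, always in (0, 1].
    return n_clusters / exceed_idx.size

# Exceedances at positions 1, 2, 6, 8, 13 form three clusters when run_length=2.
path = [0, 5, 5, 0, 0, 0, 5, 0, 5, 0, 0, 0, 0, 5]
theta_hat = runs_extremal_index(path, u=1.0, run_length=2)
```

Both `u` and `run_length` are tuning choices here, and the estimate can be sensitive to them.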
Findings
DTM accurately estimates thresholds in dependent data scenarios.
It outperforms Monte Carlo methods in efficiency and robustness.
Numerical experiments demonstrate strong performance across various settings.
Abstract
We present a novel distribution-free approach, the data-driven threshold machine (DTM), for a fundamental problem at the core of many learning tasks: choosing a threshold, for a given pre-specified level, that bounds the tail probability of the maximum of a (possibly dependent but stationary) random sequence. We do not assume a data distribution; instead, we rely on the asymptotic distribution of extremal values and reduce the problem to estimating the three parameters of the extreme value distribution and the extremal index. We take special care of data dependence by estimating the extremal index, since in many settings, such as scan statistics, change-point detection, and extreme bandits, dependence in the sequence of statistics can be significant. Key features of DTM also include robustness and computational efficiency, and it only requires one sample path to form a reliable…
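The reduction described in the abstract can be sketched as follows. This is a minimal illustration under the standard extreme value approximation, not the authors' implementation. It assumes the three generalized extreme value (GEV) parameters `mu` (location), `sigma` (scale), and `xi` (shape) have already been estimated and describe the maximum of the associated independent sequence, so that the maximum of the dependent sequence has approximate distribution function G(b)^theta, where `theta` is the extremal index. The function name `gev_max_threshold` and all parameter names are hypothetical. Given a level `alpha`, the threshold b solves G(b)^theta = 1 - alpha.

```python
import math

def gev_max_threshold(mu, sigma, xi, theta, alpha):
    """Threshold b with P(max > b) approximately equal to alpha.

    Assumption: (mu, sigma, xi) are GEV parameters for the maximum of the
    associated independent sequence, and theta in (0, 1] is the extremal
    index, so P(max <= b) is approximated by G(b) ** theta.
    """
    if not (sigma > 0 and 0 < theta <= 1 and 0 < alpha < 1):
        raise ValueError("need sigma > 0, 0 < theta <= 1, 0 < alpha < 1")
    # Solve G(b) = (1 - alpha) ** (1 / theta); y equals -log of that target.
    y = -math.log(1.0 - alpha) / theta
    if abs(xi) < 1e-12:
        # Gumbel limit of the GEV family (shape parameter zero).
        return mu - sigma * math.log(y)
    # General GEV quantile: invert exp(-(1 + xi * z) ** (-1 / xi)).
    return mu + sigma * (y ** (-xi) - 1.0) / xi

# Hypothetical parameter values, chosen only to exercise the function.
b_dependent = gev_max_threshold(mu=1.0, sigma=2.0, xi=0.2, theta=0.5, alpha=0.05)
b_independent = gev_max_threshold(mu=1.0, sigma=2.0, xi=0.2, theta=1.0, alpha=0.05)
```

Under this approximation a smaller extremal index yields a lower threshold for the same level, because clustered extremes make the maximum stochastically smaller than in the independent case.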
Taxonomy
Topics: Data Stream Mining Techniques · Advanced Bandit Algorithms Research · Advanced Statistical Process Monitoring
