Optimal sequential detection in multi-stream data
Hock Peng Chan

TL;DR
This paper develops optimal methods for online detection of distribution changes in a small fraction of multiple data streams, analyzing how detection delay scales with the number of streams.
Contribution
It extends the mixture likelihood ratio and sum-of-CUSUMs approaches with modifications that are optimal for detecting normal mean shifts, and characterizes how the detection delay depends on the fraction of affected streams.
Findings
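Detection delay falls into three domains, determined by the fraction of data streams undergoing a change as the number of detectors goes to infinity.
Optimal detection is achieved by summing detectability score transformations of either the partial scores or the CUSUM scores of the data streams.
For moderately large fractions, immediate detection is possible; for smaller fractions, the delay grows logarithmically with the number of detectors; for even smaller fractions, it follows classical formulas.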
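To make the idea of summing transformed CUSUM scores concrete, here is a minimal sketch of the plain sum-of-CUSUMs baseline for a normal mean shift from N(0, 1) to N(mu, 1). The function names `cusum_scores` and `first_alarm`, the assumed shift size `mu`, the `threshold`, and the `transform` hook are illustrative assumptions, not the paper's notation. The default transform is the identity; the paper's detectability score transformation is not reproduced here and would be supplied in its place.

```python
import numpy as np

def cusum_scores(x, mu=1.0):
    """Per-stream one-sided CUSUM scores for a shift N(0,1) -> N(mu,1).

    x has shape (T, N): T time steps, N data streams.
    Returns an array of shape (T, N) with the CUSUM score of each stream at each time.
    """
    x = np.asarray(x, dtype=float)
    llr = mu * x - 0.5 * mu**2           # log-likelihood ratio increment per observation
    w = np.zeros(x.shape[1])             # running CUSUM score per stream
    out = np.empty_like(x)
    for t in range(x.shape[0]):
        w = np.maximum(0.0, w + llr[t])  # CUSUM recursion, reset at zero
        out[t] = w
    return out

def first_alarm(x, threshold, mu=1.0, transform=lambda w: w):
    """Index of the first time the summed (transformed) scores reach the threshold.

    transform is a placeholder (identity by default, an assumption of this sketch).
    Returns None if the threshold is never reached.
    """
    stat = transform(cusum_scores(x, mu)).sum(axis=1)  # sum across streams
    hits = np.nonzero(stat >= threshold)[0]
    return int(hits[0]) if hits.size else None
```

With the identity transform every stream contributes to the sum, including unaffected ones; a transformation that downweights small scores is one way to target the sparse setting the findings describe.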
Abstract
Consider a large number of detectors, each generating a data stream. The task is to detect, online, distribution changes in a small fraction of the data streams. Previous approaches to this problem include the use of mixture likelihood ratios and the sum of CUSUMs. We provide here extensions and modifications of these approaches that are optimal in detecting normal mean shifts. We show how the (optimal) detection delay depends on the fraction of data streams undergoing distribution changes as the number of detectors goes to infinity. There are three detection domains. In the first domain, for moderately large fractions, immediate detection is possible. In the second domain, for smaller fractions, the detection delay grows logarithmically with the number of detectors, with an asymptotic constant extending those in sparse normal mixture detection. In the third domain for even smaller fractions,…
