Sequential Change-point Detection for High-dimensional and non-Euclidean Data
Lynna Chu, Hao Chen

TL;DR
This paper introduces a non-parametric, nearest neighbor-based method for online change-point detection in high-dimensional and non-Euclidean data, improving detection accuracy while controlling false alarms.
Contribution
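It proposes new test statistics, built on nearest neighbor information among observations, that apply to data of arbitrary dimension and to non-Euclidean data whenever a similarity measure can be defined, together with analytic approximations of their average run lengths for fast application.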
Findings
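New test statistics detect anomaly events more effectively than the existing test while keeping the false discovery rate controlled at a fixed level.
Analytic approximations of the average run lengths make the approach fast to apply to modern datasets.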
Method successfully applied to NYC taxi data for anomaly detection.
Abstract
In many applications, it is often of practical and scientific interest to detect anomaly events in a streaming sequence of high-dimensional or non-Euclidean observations. We study a non-parametric framework that utilizes nearest neighbor information among the observations to detect changes in an online setting. It can be applied to data in arbitrary dimension and non-Euclidean data as long as a similarity measure on the sample space can be defined. We consider new test statistics under this framework that can detect anomaly events more effectively than the existing test while keeping the false discovery rate controlled at a fixed level. Analytic formulas approximating the average run lengths of the new approaches are derived to make them fast applicable to modern datasets. Simulation studies are provided to support theoretical results. The proposed approach is illustrated with an…
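The nearest neighbor idea in the abstract can be illustrated with a small sketch. It is not the authors' test statistic or stopping rule: it scans a single fixed window of observations, builds a k-nearest-neighbor graph from a user-supplied dissimilarity, and looks for the split time at which unusually few edges join "before" and "after" observations. The function names (`knn_edges`, `cross_edge_count`, `scan_window`), the `margin` and `n_perm` parameters, and the permutation calibration are all assumptions made for illustration; the paper instead derives analytic formulas for the average run length.

```python
import math
import random


def knn_edges(points, k, dist):
    """Directed k-NN edges (i, j): j is one of the k nearest neighbors of i."""
    n = len(points)
    edges = set()
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: dist(points[i], points[j]))
        edges.update((i, j) for j in others[:k])
    return edges


def cross_edge_count(edges, t):
    """Number of edges joining an index < t to an index >= t."""
    return sum((i < t) != (j < t) for i, j in edges)


def scan_window(points, k, dist, margin=5, n_perm=200, seed=0):
    """Return (best_t, z): the split with the largest deficit of cross edges.

    z is (permutation mean - observed) / permutation sd, so a large positive z
    means far fewer before/after edges than random time ordering would give.
    Permutation calibration is an illustrative stand-in, not the paper's method.
    """
    rng = random.Random(seed)
    n = len(points)
    edges = knn_edges(points, k, dist)
    splits = range(margin, n - margin + 1)
    null = {t: [] for t in splits}
    for _ in range(n_perm):
        perm = list(range(n))
        rng.shuffle(perm)  # random relabeling of time indices
        shuffled = [(perm[i], perm[j]) for i, j in edges]
        for t in splits:
            null[t].append(cross_edge_count(shuffled, t))
    best_t, best_z = None, 0.0
    for t in splits:
        mean = sum(null[t]) / n_perm
        sd = math.sqrt(sum((x - mean) ** 2 for x in null[t]) / (n_perm - 1))
        z = (mean - cross_edge_count(edges, t)) / sd if sd > 0 else 0.0
        if z > best_z:
            best_t, best_z = t, z
    return best_t, best_z


# Toy data (hypothetical): 30 points around 0, then 30 points around 2, in 10 dimensions.
data_rng = random.Random(1)
before = [[data_rng.gauss(0.0, 1.0) for _ in range(10)] for _ in range(30)]
after = [[data_rng.gauss(2.0, 1.0) for _ in range(10)] for _ in range(30)]
best_t, z = scan_window(before + after, k=3, dist=math.dist)
print(best_t, round(z, 2))
```

Because the sketch only calls `dist(a, b)`, any dissimilarity on the sample space can replace `math.dist`, which mirrors the abstract's point that the framework needs only a similarity measure. A sequential version would rerun the scan as each new observation arrives and raise an alarm when the statistic crosses a threshold chosen for a target average run length.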
