Asymptotic distribution-free change-point detection for data with repeated observations
Hoseung Song, Hao Chen

TL;DR
This paper extends graph-based change-point detection to sequences with repeated observations, which are common in discrete data, by averaging over or taking the union of all optimal similarity graphs. The extension enables fast, accurate detection in high-dimensional and network data.
Contribution
It introduces an approach for handling repeated observations in graph-based change-point detection, with analytic formulas for controlling the type I error and an implementation in R.
Findings
Effective detection of change-points in network data.
Analytic formulas for controlling type I error.
Implementation available in R package gSeg.
Abstract
In the regime of change-point detection, a nonparametric framework based on scan statistics that use graphs representing similarities among observations is gaining attention due to its flexibility and good performance for high-dimensional and non-Euclidean data sequences, which are ubiquitous in this big data era. However, this graph-based framework encounters problems when there are repeated observations in the sequence, which often happens for discrete data, such as network data. In this work, we extend the graph-based framework to solve this problem by averaging or taking the union of all possible optimal graphs resulting from repeated observations. We consider both the single change-point alternative and the changed-interval alternative, and derive analytic formulas to control the type I error for the new methods, making them fast to apply to large datasets. The extended methods are…
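The union idea in the abstract can be illustrated with a minimal sketch. It is not the gSeg implementation, and every name in it (`union_graph`, `crossing_counts`, `expected_crossing_counts`, `labels`, `c0_edges`) is hypothetical. It assumes the similarity graph on the distinct values (here `c0_edges`) is already given, connects all observations that share a value or whose values are linked in that graph, and then counts, for each candidate time t, the edges joining the first t observations to the rest. The paper's actual scan statistics standardize such counts; only the raw count and its mean under random permutation of the sequence are shown here.

```python
from itertools import combinations


def union_graph(labels, c0_edges):
    """Edges (i, j), i < j, of an assumed union graph over observation indices.

    Two observations are joined if they share a value, or if their
    values are linked in the graph on distinct values (c0_edges).
    """
    linked = {frozenset(e) for e in c0_edges}
    edges = set()
    for i, j in combinations(range(len(labels)), 2):
        a, b = labels[i], labels[j]
        if a == b or frozenset((a, b)) in linked:
            edges.add((i, j))
    return edges


def crossing_counts(n, edges):
    """For t = 1..n-1, count edges joining the first t observations to the rest."""
    return [sum(1 for i, j in edges if i < t <= j) for t in range(1, n)]


def expected_crossing_counts(n, num_edges):
    """Mean of each count when the observation order is permuted at random."""
    return [num_edges * 2 * t * (n - t) / (n * (n - 1)) for t in range(1, n)]


# Toy sequence with repeated values; the value graph a - b - c is assumed.
labels = ["a", "a", "b", "c", "c"]
c0_edges = [("a", "b"), ("b", "c")]

edges = union_graph(labels, c0_edges)
observed = crossing_counts(len(labels), edges)
expected = expected_crossing_counts(len(labels), len(edges))
```

A count well below its permutation mean at some t means that observations before and after t are rarely similar to each other, which is the kind of evidence a graph-based scan statistic looks for.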
Taxonomy
Topics: Bioinformatics and Genomic Networks · Genetic Associations and Epidemiology · Statistical Methods and Inference
