Optimal and fast detection of spatial clusters with scan statistics
Guenther Walther

TL;DR
This paper presents a method for detecting multivariate spatial clusters with scan statistics that is both statistically optimal and computationally efficient on large datasets, under the assumption that the design distribution of the locations has weakly dependent marginals.
Contribution
It introduces a novel calibration approach for scan statistics based on grouping windows by size, achieving optimal inference and near-linear computational complexity.
Findings
Achieves statistical optimality for both small-scale and large-scale clusters.
Provides an efficient approximation method for scan windows.
Computational complexity is nearly linear in the number of locations.
Abstract
We consider the detection of multivariate spatial clusters in the Bernoulli model with locations, where the design distribution has weakly dependent marginals. The locations are scanned with a rectangular window with sides parallel to the axes and with varying sizes and aspect ratios. Multivariate scan statistics pose a statistical problem due to the multiple testing over many scan windows, as well as a computational problem because statistics have to be evaluated on many windows. This paper introduces methodology that leads to both statistically optimal inference and computationally efficient algorithms. The main difference to the traditional calibration of scan statistics is the concept of grouping scan windows according to their sizes, and then applying different critical values to different groups. It is shown that this calibration of the scan statistic results in optimal…
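To make the idea of grouping scan windows by size concrete, the following is a minimal Python sketch, not the paper's algorithm. It makes several simplifying assumptions that do not come from the source: the locations lie on a small regular grid, every axis-parallel rectangle is enumerated exhaustively (so the cost is far from the near-linear approximation the paper describes), windows are grouped dyadically by the number of locations they contain, and the group-specific critical values are estimated by label permutation with a Bonferroni split across groups. The function names `bernoulli_llr`, `scan_by_size_group`, and `permutation_critical_values` are hypothetical.

```python
import math
import random
from collections import defaultdict


def bernoulli_llr(k, m, K, N):
    """Log-likelihood ratio for k cases among m points inside a window,
    given K cases among N points overall (one-sided: elevated rate inside)."""
    if m == 0 or m == N:
        return 0.0
    p_in, p_out, p0 = k / m, (K - k) / (N - m), K / N
    if p_in <= p_out:
        return 0.0

    def xlogy(x, y):
        # Convention 0 * log(0) = 0; log is only evaluated when x != 0.
        return 0.0 if x == 0 else x * math.log(y)

    ll_alt = (xlogy(k, p_in) + xlogy(m - k, 1 - p_in)
              + xlogy(K - k, p_out) + xlogy(N - m - (K - k), 1 - p_out))
    ll_null = xlogy(K, p0) + xlogy(N - K, 1 - p0)
    return ll_alt - ll_null


def scan_by_size_group(labels):
    """Scan all axis-parallel rectangles of a 0/1 grid and return, per size
    group g = floor(log2(points in window)), the pair (max statistic, window)."""
    rows, cols = len(labels), len(labels[0])
    # 2D prefix sums give the case count of any rectangle in O(1).
    S = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            S[i + 1][j + 1] = labels[i][j] + S[i][j + 1] + S[i + 1][j] - S[i][j]
    N, K = rows * cols, S[rows][cols]
    best = {}
    for r0 in range(rows):
        for r1 in range(r0 + 1, rows + 1):
            for c0 in range(cols):
                for c1 in range(c0 + 1, cols + 1):
                    m = (r1 - r0) * (c1 - c0)
                    k = S[r1][c1] - S[r0][c1] - S[r1][c0] + S[r0][c0]
                    g = m.bit_length() - 1  # dyadic size group (assumption)
                    stat = bernoulli_llr(k, m, K, N)
                    if g not in best or stat > best[g][0]:
                        best[g] = (stat, (r0, r1, c0, c1))
    return best


def permutation_critical_values(labels, n_perm=50, alpha=0.05, seed=0):
    """Estimate one critical value per size group by shuffling the labels.
    The Bonferroni split of alpha across groups is an illustrative choice."""
    rng = random.Random(seed)
    rows, cols = len(labels), len(labels[0])
    flat = [v for row in labels for v in row]
    sims = defaultdict(list)
    for _ in range(n_perm):
        rng.shuffle(flat)
        perm = [flat[i * cols:(i + 1) * cols] for i in range(rows)]
        for g, (stat, _) in scan_by_size_group(perm).items():
            sims[g].append(stat)
    level = alpha / len(sims)
    crit = {}
    for g, vals in sims.items():
        vals.sort()
        idx = min(len(vals) - 1, math.ceil((1 - level) * len(vals)) - 1)
        crit[g] = vals[idx]
    return crit


# Usage: an 8 x 8 grid with a planted 3 x 3 cluster of cases.
labels = [[1 if 2 <= i < 5 and 2 <= j < 5 else 0 for j in range(8)]
          for i in range(8)]
best = scan_by_size_group(labels)
crit = permutation_critical_values(labels)
flagged = {g: win for g, (stat, win) in best.items() if stat >= crit[g]}
```

The point of the sketch is the comparison in the last line: each size group's maximum is judged against its own threshold rather than against a single global critical value, which is the structural difference from traditional calibration that the abstract describes. How the paper actually chooses the groups and critical values is not specified in the text above.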
