Fast Sketch-based Recovery of Correlation Outliers
Graham Cormode, Jacques Dark

TL;DR
This paper introduces a fast, sketch-based algorithm for identifying highly correlated pairs in large, streaming time-series data, significantly reducing computational complexity compared to traditional methods.
Contribution
The authors develop a novel sketching algorithm combined with coding theory techniques to efficiently detect correlation outliers in streaming data, outperforming existing LSH-based methods.
Findings
The algorithm accurately identifies pairs with large correlation.
Significant reduction in computational cost.
Proven correctness and improved performance over LSH methods.
Abstract
Many data sources can be interpreted as time-series, and a key problem is to identify which pairs out of a large collection of signals are highly correlated. We expect that there will be few, large, interesting correlations, while most signal pairs do not have any strong correlation. We abstract this as the problem of identifying the highly correlated pairs in a collection of n mostly pairwise uncorrelated random variables, where observations of the variables arrive as a stream. Dimensionality reduction can remove dependence on the number of observations, but further techniques are required to tame the quadratic (in n) cost of a search through all possible pairs. We develop a new algorithm for rapidly finding large correlations based on sketch techniques with an added twist: we quickly generate sketches of random combinations of signals, and use these in concert with ideas from…
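To make the dimensionality-reduction step concrete, here is a minimal Python sketch of a streaming random-sign projection. It is an illustration under stated assumptions, not the authors' algorithm: the names StreamSketch, top_pair, and demo are hypothetical, signals are assumed to be zero-mean, and the final search still scans all pairs. The paper's contribution (sketches of random combinations of signals plus coding-theory ideas to avoid that quadratic scan) is not reproduced here.

```python
import math
import random
from itertools import combinations


class StreamSketch:
    """Random-sign projection of n signals into k dimensions (hypothetical helper)."""

    def __init__(self, n, k, seed=0):
        self.n, self.k = n, k
        self.rng = random.Random(seed)
        self.rows = [[0.0] * k for _ in range(n)]  # one k-dim sketch per signal

    def update(self, observation):
        # One fresh +/-1 vector per time step, shared by all n signals,
        # so memory stays O(n * k) regardless of the stream length.
        signs = [1.0 if self.rng.random() < 0.5 else -1.0 for _ in range(self.k)]
        for row, x in zip(self.rows, observation):
            for j, s in enumerate(signs):
                row[j] += s * x

    def correlation(self, a, b):
        # Cosine similarity of two sketches; approximates the correlation
        # of the underlying signals under the zero-mean assumption.
        u, v = self.rows[a], self.rows[b]
        dot = sum(p * q for p, q in zip(u, v))
        norm = math.sqrt(sum(p * p for p in u) * sum(q * q for q in v))
        return dot / norm if norm else 0.0


def top_pair(sketch):
    # Naive quadratic scan over all pairs; this is the cost the paper aims to avoid.
    return max(combinations(range(sketch.n), 2),
               key=lambda p: abs(sketch.correlation(*p)))


def demo(n=6, steps=1000, k=128, seed=1):
    # Synthetic data (an assumption for illustration): independent Gaussian
    # signals with one planted correlated pair, (0, 1).
    rng = random.Random(seed)
    sketch = StreamSketch(n, k, seed=seed + 1)
    for _ in range(steps):
        obs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        obs[1] = obs[0] + 0.2 * rng.gauss(0.0, 1.0)
        sketch.update(obs)
    return top_pair(sketch), sketch.correlation(0, 1)
```

Calling demo() returns the pair with the largest estimated correlation together with the estimate for the planted pair. The sketch size k trades memory for accuracy: the estimation error shrinks roughly as 1/sqrt(k), independent of the number of observations.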
