TL;DR
This paper introduces a scalable greedy algorithm that splits a multivariate time series into segments, each modeled as independent samples from a Gaussian distribution. The method handles long, high-dimensional series efficiently.
Contribution
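The paper presents greedy Gaussian segmentation (GGS), a heuristic for Gaussian segmentation that runs in time linear in the series length. Its result is locally optimal, in the sense that moving any single breakpoint does not improve the objective, and it scales to vectors of dimension over 1000.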
Findings
GGS efficiently segments high-dimensional financial and text data.
Running time grows linearly with the length of the time series.
The returned segmentation is locally optimal: no change of any one breakpoint improves the objective.
Abstract
We consider the problem of breaking a multivariate (vector) time series into segments over which the data is well explained as independent samples from a Gaussian distribution. We formulate this as a covariance-regularized maximum likelihood problem, which can be reduced to a combinatorial optimization problem of searching over the possible breakpoints, or segment boundaries. This problem can be solved using dynamic programming, with complexity that grows with the square of the time series length. We propose a heuristic method that approximately solves the problem in linear time with respect to this length, and always yields a locally optimal choice, in the sense that no change of any one breakpoint improves the objective. Our method, which we call greedy Gaussian segmentation (GGS), easily scales to problems with vectors of dimension over 1000 and time series of arbitrary length. We…
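As a rough illustration of the formulation in the abstract, the sketch below scores a segment by a covariance-regularized Gaussian log-likelihood and adds breakpoints greedily. It is a minimal sketch under stated assumptions, not the authors' implementation. The function names (`segment_score`, `best_single_split`, `greedy_breakpoints`), the parameter `lam`, and the regularizer scaling are all hypothetical: a penalty of (lam/2) Tr(Sigma^-1) per segment is assumed, which gives the covariance estimate S + (lam/n) I, and the paper's exact scaling may differ. The sketch recomputes every covariance from scratch and omits any breakpoint-adjustment step, so it is neither linear-time nor guaranteed to be locally optimal.

```python
import numpy as np


def segment_score(x, lam):
    """Regularized Gaussian log-likelihood of one segment, constants dropped.

    Assumes a penalty of (lam / 2) * trace(inverse covariance), for which the
    maximizing covariance is S + (lam / n) * I, with S the empirical covariance.
    """
    n, m = x.shape
    centered = x - x.mean(axis=0)
    cov = centered.T @ centered / n + (lam / n) * np.eye(m)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * n * logdet


def best_single_split(x, lam):
    """Return (gain, t) for the best split of x into x[:t] and x[t:].

    t is None when no split increases the score.
    """
    n = x.shape[0]
    base = segment_score(x, lam)
    best_gain, best_t = 0.0, None
    for t in range(1, n):
        gain = segment_score(x[:t], lam) + segment_score(x[t:], lam) - base
        if gain > best_gain:
            best_gain, best_t = gain, t
    return best_gain, best_t


def greedy_breakpoints(x, lam, max_breaks):
    """Add up to max_breaks breakpoints, one at a time, each the best available."""
    bounds = [0, x.shape[0]]
    for _ in range(max_breaks):
        best_gain, best_t = 0.0, None
        for lo, hi in zip(bounds[:-1], bounds[1:]):
            gain, t = best_single_split(x[lo:hi], lam)
            if t is not None and gain > best_gain:
                best_gain, best_t = gain, lo + t
        if best_t is None:
            break  # no remaining split improves the objective
        bounds = sorted(bounds + [best_t])
    return bounds[1:-1]
```

On a series with one clear change in mean, `greedy_breakpoints(x, lam=1.0, max_breaks=1)` would be expected to return the index of that change. A larger `lam` shrinks each covariance estimate toward a multiple of the identity, which keeps the score finite for segments shorter than the vector dimension.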
