State-Space Dynamics Distance for Clustering Sequential Data
Darío García-García, Emilio Parrado-Hernández, Fernando Díaz-de-María

TL;DR
This paper introduces a similarity measure for clustering sequential data. It builds a single state-space shared by all sequences and compares the transition matrices each sequence induces in it, which reduces the overfitting and scalability problems of methods that train one model per sequence.
Contribution
It presents a novel state-space based distance measure that reduces overfitting and enhances scalability in sequence clustering tasks.
Findings
Effective on synthetic datasets
Shows advantages over existing methods on real-world data
Reduces overfitting and improves scalability
Abstract
This paper proposes a novel similarity measure for clustering sequential data. We first construct a common state-space by training a single probabilistic model on all the sequences, which yields a unified representation of the dataset. Distances are then computed from the transition matrices that each sequence induces in that state-space. This approach alleviates some of the usual overfitting and scalability issues of existing semi-parametric techniques, which rely on training a separate model for each sequence. Empirical studies on both synthetic and real-world datasets illustrate the advantages of the proposed similarity measure for clustering sequences.
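The two steps in the abstract (one shared state-space, then per-sequence transition matrices) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: the shared state-space is approximated by quantile bins over the pooled 1-D data rather than by a trained probabilistic model, transitions are counted from hard state assignments with additive smoothing, and the Frobenius norm stands in for whatever matrix distance the paper uses. All function names are hypothetical.

```python
import numpy as np

def shared_state_edges(sequences, n_states):
    """Bin edges computed once from ALL sequences pooled together.
    Assumption: quantile bins stand in for the single shared model."""
    pooled = np.concatenate(sequences)
    inner_quantiles = np.linspace(0.0, 1.0, n_states + 1)[1:-1]
    return np.quantile(pooled, inner_quantiles)

def transition_matrix(seq, edges, n_states, alpha=1.0):
    """Row-stochastic transition matrix induced by one sequence
    in the shared state-space (alpha is additive smoothing)."""
    states = np.digitize(seq, edges)  # integers in [0, n_states - 1]
    counts = np.full((n_states, n_states), alpha, dtype=float)
    # Accumulate one count per observed transition state[t] -> state[t+1].
    np.add.at(counts, (states[:-1], states[1:]), 1.0)
    return counts / counts.sum(axis=1, keepdims=True)

def pairwise_distances(sequences, n_states=3):
    """Symmetric distance matrix between sequences.
    Assumption: Frobenius norm between transition matrices."""
    edges = shared_state_edges(sequences, n_states)
    mats = [transition_matrix(s, edges, n_states) for s in sequences]
    n = len(mats)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = np.linalg.norm(mats[i] - mats[j])
    return dist

# Toy data: two slowly switching sequences and two fast-cycling ones.
slow_a = np.repeat(np.tile([0.0, 1.0, 2.0], 10), 10)
slow_b = np.repeat(np.tile([0.0, 1.0, 2.0], 6), 15)
fast_a = np.tile([0.0, 1.0, 2.0], 100)
fast_b = np.tile([0.0, 1.0, 2.0], 90)

D = pairwise_distances([slow_a, slow_b, fast_a, fast_b], n_states=3)
```

The resulting matrix `D` can be passed to any clustering method that accepts precomputed distances. Because the state-space is fitted once on the pooled data, adding a sequence only requires counting its transitions, not fitting another model.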
Taxonomy
Topics: Time Series Analysis and Forecasting · Gaussian Processes and Bayesian Inference · Anomaly Detection Techniques and Applications
