K-ARMA Models for Clustering Time Series Data
Derek O. Hoare, David S. Matteson, and Martin T. Wells

TL;DR
This paper introduces K-ARMA models for clustering time series data, providing a model-based extension of K-Means, with convergence proofs, robustness to outliers, and applications to real and simulated datasets.
Contribution
It develops a model-based clustering algorithm for ARMA and ARIMA models, proves convergence of the general algorithm, shows how a least-absolute-deviations criterion makes it robust to outliers, and demonstrates its effectiveness on real and simulated data.
Findings
Algorithm is robust to outliers
Effective in detecting distributional drift
Competitive with existing clustering methods
Abstract
We present an approach to clustering time series data using a model-based generalization of the K-Means algorithm which we call K-Models. We prove the convergence of this general algorithm and relate it to the hard-EM algorithm for mixture modeling. We first apply our method to an AR(p) clustering example and show how the clustering algorithm can be made robust to outliers using a least-absolute-deviations criterion. We then build up the clustering algorithm for ARMA(p, q) models and extend this to ARIMA(p, d, q) models. We develop a goodness-of-fit statistic for the models fitted to clusters based on the Ljung-Box statistic. We perform experiments with simulated data to show how the algorithm can be used for outlier detection and for detecting distributional drift, and we discuss the impact of the initialization method on empty clusters. We also perform experiments on real data which show…
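The K-Models idea described in the abstract can be illustrated with a minimal sketch for the AR(p) case. This is not the authors' implementation: it assumes pooled ordinary least squares for the model step, sum of squared one-step residuals for the assignment step, random initial labels, and reseeding of empty clusters from a random series. All function names (`lagged_design`, `fit_ar`, `ar_loss`, `k_ar`, `simulate_ar`) are hypothetical.

```python
import numpy as np

def lagged_design(x, p):
    """Return (X, y) for regressing x[t] on x[t-1], ..., x[t-p]."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    X = np.column_stack([x[p - j - 1 : n - j - 1] for j in range(p)])
    return X, x[p:]

def fit_ar(series_list, p):
    """Pooled least-squares AR(p) coefficients over all series in a cluster."""
    designs = [lagged_design(x, p) for x in series_list]
    X = np.vstack([d[0] for d in designs])
    y = np.concatenate([d[1] for d in designs])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def ar_loss(x, coef):
    """Sum of squared one-step residuals of series x under an AR model."""
    X, y = lagged_design(x, len(coef))
    return float(np.sum((y - X @ coef) ** 2))

def k_ar(series, k, p, n_iter=50, seed=0):
    """Alternate between fitting one AR(p) model per cluster and
    reassigning each series to the model that fits it best."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(k, size=len(series))  # assumed: random initial labels
    for _ in range(n_iter):
        coefs = []
        for c in range(k):
            members = [x for x, lab in zip(series, labels) if lab == c]
            if not members:
                # Assumed handling of an empty cluster: refit on a random series.
                members = [series[rng.integers(len(series))]]
            coefs.append(fit_ar(members, p))
        new_labels = np.array(
            [int(np.argmin([ar_loss(x, c) for c in coefs])) for x in series]
        )
        if np.array_equal(new_labels, labels):
            break  # assignments are stable, so the loss can no longer decrease
        labels = new_labels
    return labels, coefs

def simulate_ar(coef, n, rng, burn=100):
    """Simulate a Gaussian AR process (helper for trying the sketch)."""
    coef = np.asarray(coef, dtype=float)
    p = len(coef)
    x = np.zeros(n + burn + p)
    eps = rng.standard_normal(len(x))
    for t in range(p, len(x)):
        x[t] = coef @ x[t - p : t][::-1] + eps[t]
    return x[-n:]
```

Under these assumptions, swapping the squared residuals in `ar_loss` (and the fitting criterion in `fit_ar`) for absolute residuals would give a least-absolute-deviations variant in the spirit of the robust version the abstract mentions; that change is not shown here.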
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Time Series Analysis and Forecasting · Data Stream Mining Techniques
