Cats & Co: Categorical Time Series Coclustering
Dominique Gay, Romain Guigourès, Marc Boullé, Fabrice Clérot

TL;DR
This paper introduces a novel Bayesian data grid model for clustering categorical time series data, enabling effective exploration and interpretation of temporal event sequences through nonparametric joint distribution estimation.
Contribution
The paper presents a new 3D coclustering method based on data grid models for categorical time series, with a parameter-free Bayesian approach for optimal grid selection.
Findings
Efficient and effective clustering of categorical time series.
Discovery of meaningful temporal event patterns.
Versatile visualization and interpretation tools.
Abstract
We suggest a novel method of clustering and exploratory analysis of temporal event sequence data (also known as categorical time series) based on three-dimensional data grid models. A data set of temporal event sequences can be represented as a data set of three-dimensional points, where each point is defined by three variables: a sequence identifier, a time value, and an event value. Instantiating data grid models to the 3D points turns the problem into 3D coclustering. The sequences are partitioned into clusters, the time variable is discretized into intervals, and the events are partitioned into clusters. The cross-product of the univariate partitions forms a multivariate partition of the representation space, i.e., a grid of cells, and it also represents a nonparametric estimator of the joint distribution of the sequence, time, and event dimensions. Thus, the sequences are grouped…
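The 3D-point representation and the grid of cells described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: the toy data, the hand-picked partitions (`seq_cluster`, `time_bounds`, `event_cluster`), and the helper names are hypothetical. The sketch only counts points per cell for a fixed grid; it does not implement the paper's Bayesian selection of the optimal grid.

```python
from bisect import bisect_right
from collections import Counter

# Toy data set: each point is (sequence identifier, time value, event value).
# All values are hypothetical.
points = [
    ("s1", 0.5, "a"), ("s1", 2.0, "b"), ("s1", 7.5, "c"),
    ("s2", 1.0, "a"), ("s2", 6.0, "c"),
    ("s3", 3.0, "b"), ("s3", 8.0, "d"),
]

# Hand-picked univariate partitions (assumed, not learned):
seq_cluster = {"s1": 0, "s2": 0, "s3": 1}          # sequences -> clusters
time_bounds = [5.0]                                 # intervals: t < 5 and t >= 5
event_cluster = {"a": 0, "b": 0, "c": 1, "d": 1}    # events -> clusters


def cell_of(point):
    """Map a 3D point to its grid cell (sequence cluster, time interval, event cluster)."""
    seq_id, t, event = point
    return (seq_cluster[seq_id], bisect_right(time_bounds, t), event_cluster[event])


def grid_counts(pts):
    """Count how many points fall into each cell of the grid."""
    return Counter(cell_of(p) for p in pts)


def joint_estimate(pts):
    """Empirical per-cell frequencies: a simple piecewise-constant joint estimate."""
    counts = grid_counts(pts)
    n = len(pts)
    return {cell: c / n for cell, c in counts.items()}


counts = grid_counts(points)
estimate = joint_estimate(points)
```

With these partitions, sequences `s1` and `s2` share a cluster because their events fall into the same cells: early `a`/`b` events and late `c` events. Changing any of the three partitions changes the cell counts, which is why the choice of grid matters.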
Taxonomy
Topics: Time Series Analysis and Forecasting · Complex Systems and Time Series Analysis
