Detection and Evaluation of Clusters within Sequential Data
Alexander Van Werde, Albert Senen-Cerda, Gianluca Kosmella, Jaron Sanders

TL;DR
This paper evaluates clustering algorithms based on Block Markov Chains for real-world sequential data, demonstrating their practical utility and developing new evaluation tools for complex, sparse datasets.
Contribution
It provides a thorough empirical assessment of Block Markov Chain clustering algorithms and introduces new evaluation tools for real-world sequential data analysis.
Findings
Block Markov Chain clustering yields meaningful insights in diverse real-world data.
The developed evaluation tools effectively assess clustering quality and model fit.
Algorithms perform well even with sparse and complex data.
Abstract
Motivated by theoretical advancements in dimensionality reduction techniques, we use a recent model, called Block Markov Chains, to conduct a practical study of clustering in real-world sequential data. Clustering algorithms for Block Markov Chains possess theoretical optimality guarantees and can be deployed in sparse data regimes. Despite these favorable theoretical properties, a thorough evaluation of these algorithms in realistic settings has been lacking. We address this issue and investigate the suitability of these clustering algorithms in exploratory data analysis of real-world sequential data. In particular, our sequential data is derived from human DNA, written text, animal movement data, and financial markets. To evaluate the determined clusters and the associated Block Markov Chain model, we further develop a set of evaluation tools. These tools include…
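
To make the model concrete, the following is a minimal sketch of a Block Markov Chain and a simplified spectral clustering of its transition counts. It is an illustration under stated assumptions, not the authors' algorithm or evaluation tooling: the function names (simulate_bmc, count_matrix, spectral_clusters), the two-cluster transition matrix, the cluster sizes, and the trajectory length are all hypothetical choices, and the clustering step is a plain rank-K truncation followed by a basic k-means rather than the refined procedure studied in the paper. The sketch assumes that, in a Block Markov Chain, the next cluster depends only on the current cluster and the next state is drawn uniformly within that cluster.

```python
import numpy as np

def simulate_bmc(p, sizes, T, rng):
    """Sample a trajectory from a Block Markov Chain (illustrative).

    p     : K x K matrix of cluster-to-cluster transition probabilities
    sizes : number of states in each cluster
    """
    labels = np.repeat(np.arange(len(sizes)), sizes)  # planted cluster of each state
    members = [np.flatnonzero(labels == k) for k in range(len(sizes))]
    x = np.empty(T, dtype=int)
    x[0] = rng.integers(labels.size)
    for t in range(1, T):
        k_next = rng.choice(len(sizes), p=p[labels[x[t - 1]]])  # choose next cluster
        x[t] = rng.choice(members[k_next])  # uniform state inside that cluster
    return x, labels

def count_matrix(x, n):
    """N[i, j] = number of observed transitions from state i to state j."""
    N = np.zeros((n, n))
    np.add.at(N, (x[:-1], x[1:]), 1)
    return N

def spectral_clusters(N, K, iters=50):
    """Simplified spectral step: rank-K truncation, then basic k-means."""
    U, s, Vt = np.linalg.svd(N)
    N_hat = (U[:, :K] * s[:K]) @ Vt[:K]  # rank-K approximation of the counts
    feats = np.hstack([N_hat, N_hat.T])  # outgoing and incoming profiles per state
    # Deterministic farthest-point initialisation of the K centers.
    centers = [feats[0]]
    for _ in range(1, K):
        d = np.min([np.sum((feats - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(feats[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        d = ((feats[:, None, :] - centers[None]) ** 2).sum(axis=2)
        assign = d.argmin(axis=1)
        for k in range(K):
            if np.any(assign == k):
                centers[k] = feats[assign == k].mean(axis=0)
    return assign

# Hypothetical toy setting: 2 clusters of 10 states, strongly "sticky" clusters.
rng = np.random.default_rng(0)
p = np.array([[0.9, 0.1], [0.1, 0.9]])
x, labels = simulate_bmc(p, sizes=[10, 10], T=5000, rng=rng)
N = count_matrix(x, n=labels.size)
assign = spectral_clusters(N, K=2)
# Cluster labels are only defined up to permutation, so check both matchings.
agreement = max(np.mean(assign == labels), np.mean(assign != labels))
print(f"agreement with planted clusters: {agreement:.2f}")
```

On real data the planted labels are of course unavailable, which is why evaluation tools of the kind the abstract mentions are needed; the agreement score here only works because the toy trajectory was generated from known clusters.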
Taxonomy
Topics: Bayesian Methods and Mixture Models · Complex Network Analysis Techniques · Bayesian Modeling and Causal Inference
