Near-Optimal Clustering in Mixture of Markov Chains
Junghyun Lee, Yassir Jedra, Alexandre Proutière, Se-Young Yun

TL;DR
This paper introduces a near-optimal clustering algorithm for trajectories generated by unknown ergodic Markov chains, leveraging a novel spectral embedding and likelihood refinement, with theoretical guarantees and preliminary experimental validation.
Contribution
The paper presents a new spectral embedding method for ergodic Markov chains and demonstrates near-optimal clustering performance with theoretical error bounds.
Findings
Achieves near-optimal clustering error with high probability.
Introduces a new injective Euclidean embedding for Markov chains.
Provides theoretical bounds based on stationary-weighted KL divergence.
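For orientation, one common way to write a stationary-weighted KL divergence between two transition kernels is sketched below. The symbols are assumptions made for illustration: P and Q are transition kernels on the same finite state space, and \pi_P is the stationary distribution of P. The exact quantity used in the paper's bounds may differ, for example in how pairs of chains are combined.

```latex
D_{\pi}(P \,\|\, Q)
  \;=\; \sum_{s} \pi_P(s)\, \mathrm{KL}\bigl(P(s,\cdot)\,\|\,Q(s,\cdot)\bigr)
  \;=\; \sum_{s} \pi_P(s) \sum_{s'} P(s,s') \log \frac{P(s,s')}{Q(s,s')}
```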
Abstract
We study the problem of clustering T trajectories of length H, each generated by one of K unknown ergodic Markov chains over a finite state space of size S. We derive an instance-dependent, high-probability lower bound on the clustering error rate, governed by the stationary-weighted KL divergence between transition kernels. We then propose a two-stage algorithm: Stage I applies spectral clustering via a new injective Euclidean embedding for ergodic Markov chains, a contribution of independent interest enabling sharp concentration results; Stage II refines clusters with a single likelihood-based reassignment step. We prove that our algorithm achieves near-optimal clustering error with high probability under reasonable requirements on T and H. Preliminary experiments support our approach, and we conclude with discussions of its limitations and extensions.
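The two-stage structure described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm. The embedding here is a stand-in (the flattened empirical transition matrix of each trajectory) rather than the paper's injective embedding, and plain k-means replaces spectral clustering. All function names (`simulate`, `embed`, `kmeans`, `refine`, `cluster`) and the toy kernels are hypothetical.

```python
import math
import random

def simulate(P, H, rng):
    """Sample a length-H trajectory from transition matrix P (uniform start)."""
    s = rng.randrange(len(P))
    traj = [s]
    for _ in range(H - 1):
        s = rng.choices(range(len(P)), weights=P[s])[0]
        traj.append(s)
    return traj

def embed(traj, S):
    """Stand-in embedding (assumption): flattened row-normalised empirical kernel."""
    N = [[0] * S for _ in range(S)]
    for a, b in zip(traj, traj[1:]):
        N[a][b] += 1
    vec = []
    for row in N:
        n = sum(row)
        # Unvisited states get a uniform row so the vector is always defined.
        vec.extend(c / n if n else 1.0 / S for c in row)
    return vec

def dist2(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(X, K, iters=20):
    """Lloyd's algorithm with farthest-point initialisation."""
    centers = [X[0]]
    while len(centers) < K:
        centers.append(max(X, key=lambda x: min(dist2(x, c) for c in centers)))
    labels = [0] * len(X)
    for _ in range(iters):
        labels = [min(range(K), key=lambda k: dist2(x, centers[k])) for x in X]
        for k in range(K):
            members = [x for x, l in zip(X, labels) if l == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def refine(trajs, labels, K, S):
    """Stage II sketch: one likelihood-based reassignment step."""
    # Pool transition counts per cluster; add-one smoothing avoids log(0).
    pooled = [[[1.0] * S for _ in range(S)] for _ in range(K)]
    for traj, l in zip(trajs, labels):
        for a, b in zip(traj, traj[1:]):
            pooled[l][a][b] += 1
    logP = [[[math.log(c / sum(row)) for c in row] for row in M] for M in pooled]

    def loglik(traj, k):
        return sum(logP[k][a][b] for a, b in zip(traj, traj[1:]))

    return [max(range(K), key=lambda k: loglik(t, k)) for t in trajs]

def cluster(trajs, K, S):
    stage1 = kmeans([embed(t, S) for t in trajs], K)  # Stage I (stand-in)
    return refine(trajs, stage1, K, S)                # Stage II

# Toy instance: two well-separated 2-state chains, T = 20, H = 200.
rng = random.Random(0)
P = [[[0.9, 0.1], [0.1, 0.9]], [[0.1, 0.9], [0.9, 0.1]]]
truth = [i % 2 for i in range(20)]
trajs = [simulate(P[z], 200, rng) for z in truth]
labels = cluster(trajs, K=2, S=2)
```

On this toy instance the two kernels are far apart, so the initial clustering is already expected to be accurate and the refinement step changes little. The reassignment step matters more when the chains are close and the Stage I labels are noisy.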
Taxonomy
Topics: Advanced Clustering Algorithms Research · Bayesian Methods and Mixture Models
Methods: Spectral Clustering
