Near-Optimal Clustering in Mixture of Markov Chains

Junghyun Lee; Yassir Jedra; Alexandre Proutière; Se-Young Yun

arXiv:2506.01324·stat.ML·March 18, 2026

Open Access

TL;DR

This paper introduces a near-optimal clustering algorithm for trajectories generated by unknown ergodic Markov chains, leveraging a novel spectral embedding and likelihood refinement, with theoretical guarantees and preliminary experimental validation.

Contribution

The paper presents a new injective Euclidean embedding for ergodic Markov chains, uses it for spectral clustering, and proves that the resulting two-stage algorithm achieves near-optimal clustering error.

Findings

01

Achieves near-optimal clustering error with high probability, under reasonable requirements on $T$ and $H$.

02

Introduces a new injective Euclidean embedding for Markov chains.

03

Derives an instance-dependent, high-probability lower bound on the clustering error rate, governed by the stationary-weighted KL divergence between transition kernels.
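
To make the quantity in the third finding concrete, here is a minimal sketch of one natural reading of a stationary-weighted KL divergence between two transition kernels $P$ and $Q$: $\sum_s \pi_P(s)\,\mathrm{KL}(P(s,\cdot)\,\|\,Q(s,\cdot))$, where $\pi_P$ is the stationary distribution of $P$. This form, and the function names `stationary_distribution` and `stationary_weighted_kl`, are assumptions made for illustration; the paper's exact definition may differ (for example in normalization or in how pairs of chains are combined).

```python
import numpy as np

def stationary_distribution(P):
    """Stationary distribution of an ergodic row-stochastic matrix P."""
    eigvals, eigvecs = np.linalg.eig(P.T)
    # Take the left eigenvector of P whose eigenvalue is closest to 1.
    v = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])
    # Dividing by the sum fixes both the scale and the sign.
    return v / v.sum()

def stationary_weighted_kl(P, Q):
    """sum_s pi_P(s) * KL(P(s, .) || Q(s, .)); an assumed form, see lead-in."""
    pi = stationary_distribution(P)
    with np.errstate(divide="ignore", invalid="ignore"):
        # Entries with P(s, s') = 0 contribute 0 by convention.
        terms = np.where(P > 0, P * np.log(P / Q), 0.0)
    return float(pi @ terms.sum(axis=1))

# Two small hypothetical kernels on S = 2 states.
P = np.array([[0.9, 0.1], [0.2, 0.8]])
Q = np.array([[0.5, 0.5], [0.5, 0.5]])
print(stationary_distribution(P))
print(stationary_weighted_kl(P, Q))
```

The divergence is zero when the two kernels coincide and grows as their rows separate on states that $P$ visits often, which is why it is a plausible measure of how hard two chains are to tell apart from trajectories.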

Abstract

We study the problem of clustering $T$ trajectories of length $H$, each generated by one of $K$ unknown ergodic Markov chains over a finite state space of size $S$. We derive an instance-dependent, high-probability lower bound on the clustering error rate, governed by the stationary-weighted KL divergence between transition kernels. We then propose a two-stage algorithm: Stage I applies spectral clustering via a new injective Euclidean embedding for ergodic Markov chains, a contribution of independent interest enabling sharp concentration results; Stage II refines clusters with a single likelihood-based reassignment step. We prove that our algorithm achieves near-optimal clustering error with high probability under reasonable requirements on $T$ and $H$. Preliminary experiments support our approach, and we conclude with discussions of its limitations and extensions.
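
The two-stage structure described in the abstract can be sketched as follows. This is an illustration under loud assumptions, not the authors' method: the abstract does not specify the injective embedding, so the sketch substitutes each trajectory's flattened, normalized transition-count matrix as a stand-in embedding, and it uses a plain SVD projection followed by Lloyd's k-means for the spectral step. All function names and the `smoothing` parameter are hypothetical.

```python
import numpy as np

def simulate_chain(P, H, rng):
    """Sample one trajectory of length H from transition matrix P."""
    traj = np.empty(H, dtype=int)
    traj[0] = rng.integers(P.shape[0])
    for h in range(1, H):
        traj[h] = rng.choice(P.shape[0], p=P[traj[h - 1]])
    return traj

def transition_counts(traj, S):
    """S x S matrix of transition counts observed along one trajectory."""
    N = np.zeros((S, S))
    np.add.at(N, (traj[:-1], traj[1:]), 1.0)
    return N

def kmeans(X, K, iters=50, seed=0):
    """Lloyd's algorithm with farthest-point initialization."""
    rng = np.random.default_rng(seed)
    centers = [X[rng.integers(len(X))]]
    for _ in range(K - 1):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for k in range(K):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    return labels

def cluster_trajectories(trajs, S, K, smoothing=1e-2):
    counts = np.array([transition_counts(t, S) for t in trajs])  # T x S x S
    T = len(trajs)
    # Stage I (stand-in embedding, NOT the paper's): normalized counts,
    # projected onto the top-K singular directions, then k-means.
    X = counts.reshape(T, -1) / counts.sum(axis=(1, 2))[:, None]
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    labels = kmeans(U[:, :K] * s[:K], K)
    # Stage II: estimate one kernel per cluster (with additive smoothing),
    # then reassign each trajectory once to its maximum-likelihood kernel.
    P_hat = np.stack([counts[labels == k].sum(axis=0) + smoothing
                      for k in range(K)])
    P_hat /= P_hat.sum(axis=2, keepdims=True)
    loglik = np.einsum("tij,kij->tk", counts, np.log(P_hat))
    return loglik.argmax(axis=1)
```

A possible usage, with two hypothetical chains on $S = 3$ states: draw trajectories with `simulate_chain(P, H, np.random.default_rng(0))` for each chain, collect them in a list, and call `cluster_trajectories(trajs, S=3, K=2)`. The returned labels identify clusters only up to a permutation of the cluster indices.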

Taxonomy

Topics: Advanced Clustering Algorithms Research · Bayesian Methods and Mixture Models

Methods: Spectral Clustering