Learning Mixtures of Markov Chains with Quality Guarantees
Fabian Spaeh, Charalampos E. Tsourakakis

TL;DR
This paper improves algorithms for learning mixtures of Markov chains from trail data, addressing practical issues such as disconnected chains and an unknown number of chains, and demonstrates superior performance through experiments.
Contribution
It introduces an algebraic criterion for selecting the number of chains and a robust reconstruction algorithm that handles disconnected chains and noise.
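To give some intuition for why an algebraic criterion for the number of chains is plausible, the sketch below uses a standard fact rather than the paper's actual criterion, which is not reproduced in this summary. In a mixture of L chains, the 3-trail probability is p(i, j, k) = sum over l of s_l(i) M_l(i, j) M_l(j, k), so for any fixed middle state j the n x n slice p[:, j, :] is a sum of L rank-one matrices and has rank at most L. The names `middle_slice_ranks`, `starts`, and `chains`, and the tolerance `tol`, are assumptions made for this illustration.

```python
import numpy as np

def middle_slice_ranks(p, tol=1e-8):
    """Numerical rank of each slice p[:, j, :] of a 3-trail tensor.

    A singular value counts toward the rank if it exceeds tol times the
    largest singular value of that slice (an illustrative threshold).
    """
    ranks = []
    for j in range(p.shape[1]):
        s = np.linalg.svd(p[:, j, :], compute_uv=False)
        ranks.append(int(np.sum(s > tol * s[0])))
    return ranks

# Hypothetical noise-free mixture: L = 2 chains on n = 5 states.
rng = np.random.default_rng(0)
L, n = 2, 5
starts = rng.random((L, n))
starts /= starts.sum()                       # joint weights over (chain, start state)
chains = rng.random((L, n, n))
chains /= chains.sum(axis=2, keepdims=True)  # each chains[l] is row-stochastic

# Exact 3-trail distribution p[i, j, k] = sum_l s_l(i) M_l(i, j) M_l(j, k).
p = np.einsum("li,lij,ljk->ijk", starts, chains, chains)
ranks = middle_slice_ranks(p)
print(ranks)
```

With empirical (noisy) 3-trail counts the trailing singular values are no longer exactly zero, so any threshold of this kind becomes a judgment call; handling that case robustly is the problem the paper's criterion addresses.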
Findings
The proposed method outperforms the original GKV-SVD algorithm on synthetic and real data.
Combining the method with expectation-maximization (EM) yields the best practical results.
The approach accurately recovers the true mixture in challenging scenarios.
Abstract
A large number of modern applications, ranging from listening to songs online and browsing the Web to using a navigation app on a smartphone, generate a plethora of user trails. Clustering such trails into groups with a common sequence pattern can reveal significant structure in human behavior that can lead to improving user experience through better recommendations, and even prevent suicides [LMCR14]. One approach to modeling this problem mathematically is as a mixture of Markov chains. Recently, Gupta, Kumar and Vassilvitskii [GKV16] introduced an algorithm (GKV-SVD) based on the singular value decomposition (SVD) that under certain conditions can perfectly recover a mixture of L chains on n states, given only the distribution of trails of length 3 (3-trail). In this work we contribute to the problem of unmixing Markov chains by highlighting and addressing two important constraints of the…
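To make the 3-trail distribution in the abstract concrete, here is a minimal sketch that samples trails (i, j, k) from a small mixture and compares the empirical distribution with the exact one. It does not implement GKV-SVD or the paper's reconstruction algorithm. All function names, the array layout (`starts[l, i]` as the joint probability of choosing chain l and starting in state i, `chains[l]` as the transition matrix of chain l), and the sample sizes are assumptions for illustration.

```python
import numpy as np

def random_stochastic(rng, n):
    """Return a random n x n row-stochastic matrix."""
    m = rng.random((n, n))
    return m / m.sum(axis=1, keepdims=True)

def sample_3trails(rng, starts, chains, num_trails):
    """Sample 3-trails (i, j, k) from a mixture of Markov chains."""
    L, n = starts.shape
    # Draw the (chain, start state) pair jointly from the flattened weights.
    flat = rng.choice(L * n, size=num_trails, p=starts.ravel())
    trails = np.empty((num_trails, 3), dtype=int)
    for t, idx in enumerate(flat):
        l, i = divmod(int(idx), n)
        j = rng.choice(n, p=chains[l][i])  # first transition
        k = rng.choice(n, p=chains[l][j])  # second transition, same chain
        trails[t] = (i, j, k)
    return trails

def empirical_3trail_distribution(trails, n):
    """Normalized counts of each observed triple (i, j, k)."""
    counts = np.zeros((n, n, n))
    np.add.at(counts, (trails[:, 0], trails[:, 1], trails[:, 2]), 1)
    return counts / len(trails)

def exact_3trail_distribution(starts, chains):
    """p[i, j, k] = sum_l s_l(i) M_l(i, j) M_l(j, k)."""
    return np.einsum("li,lij,ljk->ijk", starts, chains, chains)

# Hypothetical mixture: L = 2 chains on n = 3 states.
rng = np.random.default_rng(0)
L, n = 2, 3
starts = rng.random((L, n))
starts /= starts.sum()
chains = np.stack([random_stochastic(rng, n) for _ in range(L)])

trails = sample_3trails(rng, starts, chains, 5000)
gap = np.abs(empirical_3trail_distribution(trails, n)
             - exact_3trail_distribution(starts, chains)).max()
print("largest entrywise gap:", gap)
```

The chain label l is discarded after sampling, so the learner only sees the triples. Recovering `starts` and `chains` from the resulting n x n x n table is the unmixing problem the abstract describes.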
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Bayesian Methods and Mixture Models · Machine Learning and Algorithms
