Learning Mixtures of Markov Chains and MDPs

Chinmaya Kausik; Kevin Tan; Ambuj Tewari

arXiv:2211.09403 · stat.ML · February 7, 2023

TL;DR

This paper introduces a multi-step algorithm for learning mixtures of Markov chains and MDPs from short, unlabeled trajectories. It comes with end-to-end theoretical guarantees and, in experiments, is more accurate than the EM algorithm with random initialization.

Contribution

The paper proposes a multi-step algorithm for learning mixtures of Markov chains and MDPs. It combines subspace estimation, spectral clustering of trajectories with EM refinement, model estimation, and classification of new trajectories.

Findings

01 Achieves 96.6% average accuracy on a mixture of two MDPs in gridworld

02 Provides end-to-end guarantees that require trajectory length only linear in the number of states

03 Outperforms the EM algorithm with random initialization in experiments

Abstract

We present an algorithm for learning mixtures of Markov chains and Markov decision processes (MDPs) from short unlabeled trajectories. Specifically, our method handles mixtures of Markov chains with optional control input by going through a multi-step process, involving (1) a subspace estimation step, (2) spectral clustering of trajectories using "pairwise distance estimators," along with refinement using the EM algorithm, (3) a model estimation step, and (4) a classification step for predicting labels of new trajectories. We provide end-to-end performance guarantees, where we only explicitly require the length of trajectories to be linear in the number of states and the number of trajectories to be linear in a mixing time parameter. Experimental results support these guarantees, where we attain 96.6% average accuracy on a mixture of two MDPs in gridworld, outperforming the EM algorithm…
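
Steps (3) and (4) of the abstract can be illustrated with a minimal sketch. It assumes the clustering step has already assigned a label to every training trajectory (here the true labels stand in for that output), estimates each chain by smoothed transition counts, and labels a new trajectory by maximum log-likelihood. The function names, the smoothing constant, and the two toy chains are illustrative assumptions; they are not taken from the paper or its repository, and the paper's own estimators may differ.

```python
import numpy as np


def estimate_chain(trajectories, n_states, smoothing=1.0):
    """Estimate a transition matrix from trajectories via smoothed counts."""
    # Additive smoothing (an assumption) keeps every probability positive.
    counts = np.full((n_states, n_states), smoothing, dtype=float)
    for traj in trajectories:
        for s, s_next in zip(traj[:-1], traj[1:]):
            counts[s, s_next] += 1.0
    return counts / counts.sum(axis=1, keepdims=True)


def log_likelihood(traj, P):
    """Log-probability of the observed transitions under transition matrix P."""
    return float(sum(np.log(P[s, t]) for s, t in zip(traj[:-1], traj[1:])))


def classify(traj, models):
    """Return the index of the model under which traj is most likely."""
    return int(np.argmax([log_likelihood(traj, P) for P in models]))


def sample_trajectory(P, length, rng):
    """Sample a trajectory of the given length with a uniform start state."""
    n_states = P.shape[0]
    state = int(rng.integers(n_states))
    traj = [state]
    for _ in range(length - 1):
        state = int(rng.choice(n_states, p=P[state]))
        traj.append(state)
    return traj


# Toy mixture (hypothetical): a "sticky" chain and a "cyclic" chain on 3 states.
rng = np.random.default_rng(0)
n_states = 3
sticky = np.full((n_states, n_states), 0.05) + 0.85 * np.eye(n_states)
cyclic = np.full((n_states, n_states), 0.05) + 0.85 * np.roll(np.eye(n_states), 1, axis=1)
true_chains = [sticky, cyclic]

# Training set: labels are assumed to come from the clustering step.
train = [(k, sample_trajectory(true_chains[k], 20, rng))
         for k in (0, 1) for _ in range(50)]
models = [estimate_chain([t for lbl, t in train if lbl == k], n_states)
          for k in (0, 1)]

# Classification step: label fresh, unlabeled trajectories.
test_set = [(k, sample_trajectory(true_chains[k], 20, rng))
            for k in (0, 1) for _ in range(50)]
accuracy = float(np.mean([classify(t, models) == k for k, t in test_set]))
print(f"classification accuracy: {accuracy:.3f}")
```

The two toy chains are deliberately easy to separate, so this only illustrates the mechanics of the last two steps. It says nothing about the harder part of the problem, which is recovering the labels from short trajectories in the first place.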

Code & Models

Repositories

hetankevin/mdpmix (Official)

Videos

Learning Mixtures of Markov Chains and MDPs · SlidesLive

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Time Series Analysis and Forecasting · Bayesian Methods and Mixture Models

Methods: Spectral Clustering