Clustering sequence data with mixture Markov chains with covariates using multiple simplex constrained optimization routine (MSiCOR)
Priyam Das, Deborshee Sen, Debsurya De, Jue Hou, Zahra S. H. Abad, Nicole Kim, Zongqi Xia, Tianxi Cai

TL;DR
This paper introduces a pattern search-based global optimization method for fitting mixture Markov models with covariates. In simulations it outperforms the EM algorithm, and it is applied to clustering treatment sequences of MS patients.
Contribution
Develops a pattern search-based optimization routine for MMM likelihood maximization, enhancing clustering of sequence data with covariates, demonstrated on MS patient treatment sequences.
Findings
Proposed method outperforms EM in simulation studies.
Successfully clusters MS patients into 3 distinct groups.
Cluster-specific covariate summaries reveal patient differences.
Abstract
The mixture Markov model (MMM) is a widely used tool for clustering sequences of events drawn from a finite state space. However, because the MMM likelihood is multi-modal, maximizing it remains challenging. Although the Expectation-Maximization (EM) algorithm remains one of the most popular ways to estimate the MMM parameters, its convergence is not always guaranteed. Given the computational challenges of maximizing the mixture likelihood over the constrained parameter space, we develop a pattern search-based global optimization technique that can optimize any objective function on a collection of simplexes, and we use it to maximize the MMM likelihood. This technique is shown to outperform other related global optimization techniques. In simulation experiments, the proposed method is shown to outperform the EM algorithm in the context of MMM…
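To make the objective concrete, the following is a minimal sketch of the log-likelihood of a plain mixture of first-order Markov chains, not the authors' implementation. It assumes no covariates, strictly positive probabilities, and states coded as integers 0..S-1. The function name `mmm_log_likelihood` and its argument names are hypothetical. Each of the mixture weights, the initial distributions, and the rows of the transition matrices lies on its own simplex, which is the "collection of simplexes" the abstract refers to.

```python
import numpy as np

def mmm_log_likelihood(sequences, weights, initial, transition):
    """Log-likelihood of integer-coded sequences under a K-component
    mixture of first-order Markov chains (illustrative, no covariates).

    weights:    shape (K,),      mixture weights, sum to 1
    initial:    shape (K, S),    initial-state distribution per component
    transition: shape (K, S, S), row-stochastic transition matrix per component
    """
    weights = np.asarray(weights, dtype=float)
    initial = np.asarray(initial, dtype=float)
    transition = np.asarray(transition, dtype=float)

    total = 0.0
    for seq in sequences:
        seq = np.asarray(seq, dtype=int)
        # Log-probability of this sequence under each component, shape (K,)
        log_p = np.log(weights) + np.log(initial[:, seq[0]])
        if len(seq) > 1:
            # Sum the log transition probabilities along the observed path
            log_p = log_p + np.log(transition[:, seq[:-1], seq[1:]]).sum(axis=1)
        # Log-sum-exp over components for numerical stability
        m = log_p.max()
        total += m + np.log(np.exp(log_p - m).sum())
    return total
```

An optimizer, whether EM or a pattern search, would maximize this quantity over all of the simplex-constrained parameters jointly. The multi-modality mentioned in the abstract arises from the sum over components inside the logarithm.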
