Markov chain order estimation with parametric significance tests of conditional mutual information
Maria Papapetrou, Dimitris Kugiumtzis

TL;DR
This paper introduces parametric significance tests of conditional mutual information for estimating the order of a Markov chain from a symbol sequence; the best of them matches the accuracy of the randomization test at lower computational cost.
Contribution
It develops gamma and normal distribution-based parametric tests for conditional mutual information, providing a faster alternative to randomization tests for Markov chain order estimation.
Findings
Gamma distribution test outperforms other parametric tests.
Gamma test matches the accuracy of the randomization test.
Applicable to DNA sequences, showing practical usefulness.
Abstract
Despite the different approaches suggested in the literature, accurate estimation of the order of a Markov chain from a given symbol sequence remains an open issue, especially when the order is moderately large. Here, parametric significance tests of conditional mutual information (CMI) of increasing order m, I_c(m), are conducted on a symbol sequence in order to estimate the true order L of the underlying Markov chain. CMI of order m is the mutual information of two variables in the Markov chain that are m time steps apart, conditioned on the intermediate variables of the chain. The null distribution of CMI is approximated in three ways: with a normal and a gamma distribution whose parameters are derived analytically, and with a gamma distribution whose parameters are derived from the mean and variance of the normal distribution. The accuracy of order estimation is assessed…
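The definition of CMI in the abstract can be illustrated with a minimal sketch. It assumes a plug-in (relative-frequency) estimator with natural logarithms, and it writes the CMI of a given order (called m here) as a combination of block entropies of words of length m+1, m, m and m-1. The function names block_entropy and cmi are hypothetical and are not taken from the paper; the significance tests themselves are not reproduced.

```python
import math
from collections import Counter


def block_entropy(seq, start, length, n_words):
    """Plug-in entropy (in nats) of the words seq[i+start : i+start+length],
    for i = 0 .. n_words-1. A zero-length word has entropy 0."""
    if length == 0:
        return 0.0
    counts = Counter(
        tuple(seq[i + start:i + start + length]) for i in range(n_words)
    )
    return -sum(c / n_words * math.log(c / n_words) for c in counts.values())


def cmi(seq, m):
    """Plug-in estimate of the CMI of order m: the mutual information between
    two symbols m steps apart, given the m-1 symbols between them.

    All four entropies use the same n = len(seq) - m sliding windows, so they
    are marginals of one empirical distribution and the estimate is >= 0
    up to floating-point rounding.
    """
    if m < 1 or len(seq) <= m:
        raise ValueError("need m >= 1 and len(seq) > m")
    n = len(seq) - m
    h_full = block_entropy(seq, 0, m + 1, n)   # both end symbols + intermediates
    h_first = block_entropy(seq, 0, m, n)      # earlier symbol + intermediates
    h_last = block_entropy(seq, 1, m, n)       # intermediates + later symbol
    h_mid = block_entropy(seq, 1, m - 1, n)    # intermediates only
    return h_first + h_last - h_full - h_mid


if __name__ == "__main__":
    # Hypothetical toy input: a strictly alternating sequence, which is a
    # deterministic chain of order 1.
    toy = [0, 1] * 50
    for order in (1, 2, 3):
        print(order, cmi(toy, order))
```

In an order-estimation setting, one would compute this estimate for increasing m and test each value for significance; under this reading, the estimated order is the last m at which the CMI is still significantly greater than zero. The choice of test (parametric or randomization) is what the paper compares.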
