On a Class of Markov Order Estimators Based on PPM and Other Universal Codes
Łukasz Dębowski

TL;DR
This paper introduces a modified class of Markov order estimators based on universal coding, which are almost surely consistent, asymptotically bounded, and useful for quantifying long memory in stationary ergodic processes.
Contribution
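The paper proposes universal Markov orders, a slight modification of earlier Markov order estimators based on universal codes, and establishes properties not shown for the original constructions: almost sure consistency, an asymptotic upper bound, and applicability to quantifying long memory.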
Findings
Universal Markov orders are almost surely consistent.
They are upper bounded asymptotically by the logarithm of the string length divided by the entropy rate.
Using PPM, they bound block mutual information, aiding long memory analysis.
Abstract
We investigate a class of estimators of the Markov order for stationary ergodic processes which form a slight modification of the constructions by Merhav, Gutman, and Ziv in 1989 as well as by Ryabko, Astola, and Malyutov in 2006 and 2016. All the considered estimators compare the estimate of the entropy rate given by a universal code with the empirical conditional entropy of a string and return the order for which the two quantities are approximately equal. However, our modification, which we call universal Markov orders, satisfies a few attractive properties, not shown by the mentioned authors for their original constructions. Firstly, the universal Markov orders are almost surely consistent, without any restrictions. Secondly, they are upper bounded asymptotically by the logarithm of the string length divided by the entropy rate. Thirdly, if we choose the Prediction by Partial…
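The comparison described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's construction: zlib stands in for the universal code (it is not PPM), the function names are hypothetical, and the decision rule (return the smallest order whose empirical conditional entropy does not exceed the code rate) is one plausible reading of "approximately equal"; the paper's exact definition may differ.

```python
import math
import zlib
from collections import Counter
from typing import Optional


def empirical_conditional_entropy(s: str, k: int) -> float:
    """Plug-in estimate, in bits per symbol, of the order-k conditional entropy of s."""
    n = len(s)
    if n <= k:
        return 0.0
    total = n - k
    # Counts of (k+1)-grams and of the length-k contexts that precede a symbol.
    joint = Counter(s[i:i + k + 1] for i in range(total))
    context = Counter(s[i:i + k] for i in range(total))
    return -sum(
        c / total * math.log2(c / context[gram[:k]])
        for gram, c in joint.items()
    )


def code_rate(s: str) -> float:
    """Bits per symbol spent by zlib (an assumed stand-in for a universal code)."""
    if not s:
        raise ValueError("string must be non-empty")
    return 8 * len(zlib.compress(s.encode("utf-8"), 9)) / len(s)


def markov_order_estimate(s: str, max_order: Optional[int] = None) -> int:
    """Smallest k whose empirical conditional entropy is at most the code rate.

    Hypothetical decision rule for illustration only.
    """
    rate = code_rate(s)
    limit = len(s) - 1 if max_order is None else max_order
    for k in range(limit + 1):
        if empirical_conditional_entropy(s, k) <= rate:
            return k
    return limit
```

For the periodic string "ab" * 200, the order-0 empirical entropy is 1 bit per symbol, which exceeds the zlib rate, while the order-1 value is 0, so the sketch returns 1. Because zlib carries a fixed overhead and is a weak universal code, the estimates on short or noisy strings should not be read as indicative of the behaviour proved in the paper.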
Taxonomy
Topics: Algorithms and Data Compression · Cellular Automata and Applications · Fractal and DNA sequence analysis
