Rank distributions of words in additive many-step Markov chains and the Zipf law
K. E. Kechedzhy, O. V. Usatenko, and V. A. Yampol'skii

TL;DR
This paper analyzes the rank distributions of words in additive many-step Markov chains. It shows that, for strong persistent correlations, the Zipf law holds for words no longer than roughly the correlation length, and that the rank distribution is self-similar under decimation.
Contribution
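The paper proves that the envelope of the rank distribution follows a power law with an exponent of the order of unity in Markov chains with strong persistent correlations, and shows that the Zipf law applies to words with lengths about equal to or shorter than the correlation length.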
Findings
Envelope curve obeys a power law with exponent of the order of unity
Zipf law valid for words with lengths about equal to or shorter than the correlation length
Self-similarity observed in rank distribution under decimation
Abstract
The binary many-step Markov chain with the step-like memory function is considered as a model for the analysis of rank distributions of words in stochastic symbolic dynamical systems. We prove that the envelope curve for this distribution obeys the power law with the exponent of the order of unity in the case of rather strong persistent correlations. The Zipf law is shown to be valid for the rank distribution of words with lengths about and shorter than the correlation length in the Markov sequence. A self-similarity in the rank distribution with respect to the decimation procedure is observed.
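The model in the abstract can be illustrated with a short simulation. The sketch below is not the authors' code. It assumes one common form of the step-like memory function, in which the probability of emitting 1 is 1/2 + mu(2k/N - 1), where k is the number of ones among the previous N symbols; the abstract does not state this rule. The names `generate_chain`, `rank_distribution`, and `decimate`, and the parameters `memory` and `mu`, are hypothetical. Decimation is taken here to mean keeping every `step`-th symbol, which is also an assumption.

```python
import random
from collections import Counter


def generate_chain(length, memory, mu, seed=0):
    """Generate a binary additive Markov chain with a step-like memory function.

    Assumed rule (not given in the abstract): P(next symbol = 1) is
    1/2 + mu * (2k/memory - 1), where k is the number of ones among the
    previous `memory` symbols. 0 < mu < 1/2 gives persistent correlations.
    """
    rng = random.Random(seed)
    # Start from an uncorrelated random window of `memory` symbols.
    chain = [rng.randint(0, 1) for _ in range(memory)]
    ones = sum(chain)  # running count of ones in the last `memory` symbols
    for i in range(memory, length):
        p_one = 0.5 + mu * (2.0 * ones / memory - 1.0)
        symbol = 1 if rng.random() < p_one else 0
        chain.append(symbol)
        ones += symbol - chain[i - memory]  # slide the window by one symbol
    return chain


def rank_distribution(chain, word_length):
    """Return relative word frequencies sorted in descending order (rank 1 first)."""
    words = Counter(
        tuple(chain[i:i + word_length])
        for i in range(len(chain) - word_length + 1)
    )
    total = sum(words.values())
    return [count / total for _, count in words.most_common()]


def decimate(chain, step):
    """Keep every `step`-th symbol (assumed meaning of decimation)."""
    return chain[::step]


if __name__ == "__main__":
    chain = generate_chain(length=200_000, memory=20, mu=0.4)
    freqs = rank_distribution(chain, word_length=8)
    for rank, freq in enumerate(freqs[:5], start=1):
        print(rank, round(freq, 5))
```

Plotting `freqs` against rank on log-log axes, for the original chain and for `decimate(chain, 2)`, is one way to look for the power-law envelope and the self-similarity described above. The parameter values in the example are arbitrary.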
Taxonomy
Topics: Protein Structure and Dynamics · Fractal and DNA sequence analysis · Stochastic processes and statistical mechanics
