Asymptotic behavior of some factorizations of random words
Elahe Zohoorian Azad, Philippe Chassaing (IECL)

TL;DR
This paper studies the asymptotic distribution of normalized factor lengths in factorizations of random words. It relates the Lyndon factorization of a random word to a stick-breaking process and a Poisson-Dirichlet distribution, and gives the limit law of the standard right factor of a random Lyndon word.
Contribution
The paper establishes limit laws for the normalized factor lengths in the Lyndon factorization of random words and in the standard factorization of random Lyndon words, linking them to well-known probability distributions.
Findings
The normalized lengths of the smallest Lyndon factors converge in law to a variant of the stick-breaking process.
The distribution of the lengths of the longest factors converges to a Poisson-Dirichlet distribution.
The normalized length of the standard right factor of a random Lyndon word converges to a mixture distribution that involves the probability of the smallest letter.
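For readers who want to experiment with the first two findings, the following is a minimal sketch, not the authors' code, of the Lyndon factorization computed with Duval's algorithm (a standard linear-time method that this summary does not itself name). The function names, the three-letter alphabet, and its weights are hypothetical choices made for illustration.

```python
import random


def lyndon_factorization(word):
    """Split `word` into a non-increasing sequence of Lyndon words (Duval's algorithm)."""
    factors = []
    n = len(word)
    i = 0
    while i < n:
        j, k = i + 1, i
        # Invariant: word[i:j] is a power of a Lyndon word u followed by
        # a (possibly empty) proper prefix of u, where len(u) == j - k.
        while j < n and word[k] <= word[j]:
            k = i if word[k] < word[j] else k + 1
            j += 1
        period = j - k
        # Emit every complete copy of u; the leftover prefix is reprocessed.
        while i <= k:
            factors.append(word[i:i + period])
            i += period
    return factors


def random_word(n, alphabet="abc", weights=(0.5, 0.3, 0.2), seed=None):
    """Draw n independent letters (hypothetical alphabet and weights)."""
    rng = random.Random(seed)
    return "".join(rng.choices(alphabet, weights=weights, k=n))


def normalized_factor_lengths(word):
    """Factor lengths divided by the word length, in factorization order."""
    n = len(word)
    return [len(f) / n for f in lyndon_factorization(word)]
```

For example, `lyndon_factorization("banana")` returns `['b', 'an', 'an', 'a']`. Because the factors are non-increasing, the smallest factors are the last entries of the list, so the tail of `normalized_factor_lengths(random_word(n))` for large `n` is the quantity the first finding describes.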
Abstract
In this paper we consider the normalized lengths of the factors of some factorizations of random words. First, for the \emph{Lyndon factorization} of finite random words with independent letters drawn from a finite or infinite totally ordered alphabet according to a general probability distribution, we prove that the limit law of the normalized lengths of the smallest Lyndon factors is a variant of the stick-breaking process. Convergence of the distribution of the lengths of the longest factors to a Poisson-Dirichlet distribution follows. Secondly, we consider the \emph{standard factorization} of a random \emph{Lyndon word}: we prove that the distribution of the normalized length of the standard right factor of a random $n$-letter-long Lyndon word, derived from such an alphabet, converges, when $n$ is large, to: in…
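To make the second part of the abstract concrete, here is a minimal sketch, under stated assumptions, of the standard factorization of a Lyndon word. It relies on the usual definition, which the abstract does not spell out: for a Lyndon word w of length at least 2, the standard right factor is the lexicographically smallest proper suffix of w. The function names are hypothetical, and the quadratic-time suffix comparison is chosen for readability rather than efficiency.

```python
def is_lyndon(word):
    """A nonempty word is Lyndon iff it is strictly smaller than each proper suffix."""
    return len(word) > 0 and all(word < word[s:] for s in range(1, len(word)))


def standard_factorization(word):
    """Return (left, right), where right is the smallest proper suffix of a Lyndon word."""
    if len(word) < 2 or not is_lyndon(word):
        raise ValueError("expected a Lyndon word of length at least 2")
    # Proper suffixes have distinct lengths, so the minimum is unique.
    split = min(range(1, len(word)), key=lambda s: word[s:])
    return word[:split], word[split:]


def normalized_right_factor_length(word):
    """Length of the standard right factor divided by the word length."""
    _, right = standard_factorization(word)
    return len(right) / len(word)
```

For example, `standard_factorization("aabab")` returns `('aab', 'ab')`, so the normalized length of the standard right factor is 2/5.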
Taxonomy
Topics: Random Matrices and Applications · Semigroups and Automata Theory · Bayesian Methods and Mixture Models
