Long twins in random words
Andrzej Dudek, Jarosław Grytczuk, Andrzej Ruciński

TL;DR
This paper studies the maximum length of identical disjoint subwords (twins) in random words over finite alphabets, providing improved probabilistic bounds that surpass previous deterministic estimates for small alphabet sizes.
Contribution
It establishes new probabilistic lower bounds for the length of twins in random words, improving upon known deterministic bounds especially for small alphabet sizes.
Findings
In a random ternary word, twins of length at least 0.41n exist with high probability.
For alphabets of size k ≥ 3, a random word of length n contains, with high probability, twins of length at least about (1.64/(k+1))n; for k = 3 this gives the 0.41n bound above.
Results extend to multiple twins, showing similar probabilistic bounds.
Abstract
Twins in a finite word are formed by a pair of identical subwords placed at disjoint sets of positions. We investigate the maximum length of twins in a random word over a k-letter alphabet. The obtained lower bounds for small values of k significantly improve the best estimates known in the deterministic case. Bukh and Zhou in 2016 showed that every ternary word of length n contains twins whose length is at least a constant fraction of n. Our main result states that in a random ternary word of length n, with high probability, one can find twins of length at least 0.41n. In the general case of alphabets of size k ≥ 3 we obtain analogous lower bounds of the form (1.64/(k+1))n, which are better than the known deterministic bounds for small k. In addition, we present similar results for multiple twins in random words.
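To make the definition of twins concrete, here is a minimal brute-force sketch in Python. It is an illustration only, not the method used in the paper: the function name max_twin_length is hypothetical, and the search takes time exponential in the word length, so it is usable only for very short words.

```python
from itertools import product


def max_twin_length(word):
    """Return the maximum length of twins in `word` by exhaustive search.

    Each position receives a label: 0 (unused), 1 (belongs to the first
    subword) or 2 (belongs to the second subword). The two position sets
    are disjoint by construction, and the labelling yields twins exactly
    when the two subwords read the same.
    """
    best = 0
    for labels in product((0, 1, 2), repeat=len(word)):
        first = [c for c, t in zip(word, labels) if t == 1]
        second = [c for c, t in zip(word, labels) if t == 2]
        if first == second:
            best = max(best, len(first))
    return best
```

For example, the word "aabb" contains the twins "ab" (positions 1 and 3) and "ab" (positions 2 and 4), so the function returns 2, while for "abba" the longest twins have length 1.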
