Density dichotomy in random words
Joshua Cooper, Danny Rorabaugh

TL;DR
This paper investigates the density of homomorphic images of words within random words, establishing a dichotomy based on whether the word is doubled, and explores convergence and concentration properties.
Contribution
It introduces a density dichotomy for words in random sequences, linking the property of being doubled to the asymptotic behavior of homomorphic image density.
Findings
Doubled words have expected density tending to zero in large random words.
For non-doubled words the expected density does not tend to zero; the paper further explores how it converges.
Concentration results describe the distribution of densities for doubled words.
Abstract
Word $W$ is said to encounter word $V$ provided there is a homomorphism $\phi$ mapping letters to nonempty words so that $\phi(V)$ is a substring of $W$. For example, taking $\phi$ such that $\phi(h) = c$ and $\phi(u) = ien$, we see that "science" encounters "huh" since $cienc = \phi(huh)$. The density of $V$ in $W$, $\delta(V, W)$, is the proportion of substrings of $W$ that are homomorphic images of $V$. So the density of "huh" in "science" is $2/\binom{8}{2}$. A word is doubled if every letter that appears in the word appears at least twice. The dichotomy: Let $V$ be a word over any alphabet, $\Sigma$ a finite alphabet with at least 2 letters, and $W_n \in \Sigma^n$ chosen uniformly at random. Word $V$ is doubled if and only if $\mathbb{E}(\delta(V, W_n)) \to 0$ as $n \to \infty$. We further explore convergence for nondoubled words and concentration of the limit…
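The definitions in the abstract can be checked by brute force on small words. The following is a minimal Python sketch, not the authors' method: the function names (`is_image`, `density`, `is_doubled`, `mean_density`) are hypothetical, and it assumes substrings are counted by position, so a word of length n has C(n+1, 2) nonempty substrings, which matches the 2/C(8, 2) figure given for "huh" in "science".

```python
from collections import Counter
from fractions import Fraction
from math import comb
import random


def is_image(v, s, phi=None):
    """True if s = phi(v) for some homomorphism phi sending each letter
    of v to a nonempty word (found by backtracking)."""
    phi = {} if phi is None else phi
    if not v:
        return not s
    head, rest = v[0], v[1:]
    if head in phi:
        img = phi[head]
        return s.startswith(img) and is_image(rest, s[len(img):], phi)
    # Try each nonempty prefix of s as the image of head, leaving at
    # least one character for every remaining letter of v.
    for k in range(1, len(s) - len(rest) + 1):
        if is_image(rest, s[k:], {**phi, head: s[:k]}):
            return True
    return False


def density(v, w):
    """Proportion of substrings of w (counted by position) that are
    homomorphic images of v. Assumes w is nonempty."""
    n = len(w)
    hits = sum(
        is_image(v, w[i:j]) for i in range(n) for j in range(i + 1, n + 1)
    )
    return Fraction(hits, comb(n + 1, 2))


def is_doubled(v):
    """True if every letter appearing in v appears at least twice."""
    return all(count >= 2 for count in Counter(v).values())


def mean_density(v, alphabet, n, trials, seed=0):
    """Monte Carlo estimate of the expected density of v in a uniformly
    random word of length n over alphabet (illustrative only)."""
    rng = random.Random(seed)
    total = Fraction(0)
    for _ in range(trials):
        w = "".join(rng.choice(alphabet) for _ in range(n))
        total += density(v, w)
    return total / trials


print(density("huh", "science"))  # 1/14, i.e. 2/28
print(is_doubled("huh"), is_doubled("abab"))  # False True
```

The search is exponential in the worst case, so this is only practical for short words. Calling `mean_density` with a doubled word such as "abab" and a non-doubled word such as "huh" at a few increasing values of `n` gives a rough empirical feel for the dichotomy, though small simulations cannot confirm the limit.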
Taxonomy
Topics: semigroups and automata theory · DNA and Biological Computing · Authorship Attribution and Profiling
