A Central Limit Theorem for the Length of the Longest Common Subsequences in Random Words
Christian Houdré, Ümit Işlak

TL;DR
This paper proves a central limit theorem for the length of the longest common subsequences in two independent random words with i.i.d. letters, contrasting with the Tracy-Widom distribution for permutations.
Contribution
It establishes a CLT for $LC_n$, the length of the longest common subsequences of two independent random words of length $n$, under a lower bound assumption on the order of the variance of $LC_n$.
Findings
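$LC_n$, centered at its mean and normalized by its standard deviation, satisfies a central limit theorem, provided a lower bound on the order of its variance holds.
By contrast, for two independent uniform random permutations of $\{1, \dots, n\}$, the limiting distribution of the length of the longest common subsequences is the Tracy-Widom distribution.
The key hypothesis for the CLT is the lower bound on the order of the variance of $LC_n$.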
Abstract
Let $(X_k)_{k \ge 1}$ and $(Y_k)_{k \ge 1}$ be two independent sequences of independent identically distributed random variables taking their values in a common finite alphabet and having the same law. Let $LC_n$ be the length of the longest common subsequences of the two random words $X_1 \cdots X_n$ and $Y_1 \cdots Y_n$. Under a lower bound assumption on the order of its variance, $LC_n$ is shown to satisfy a central limit theorem. This is in contrast to the limiting distribution of the length of the longest common subsequences in two independent uniform random permutations of $\{1, \dots, n\}$, which is shown to be the Tracy-Widom distribution.
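To make the quantity in the abstract concrete, here is a minimal Python sketch that computes the length of a longest common subsequence with the classic dynamic program and draws samples of it for two random words with i.i.d. letters. It is an illustration only, not the paper's method: the function names (`lcs_length`, `sample_lc`, `standardized_samples`), the two-letter alphabet, and the default uniform letter law are all assumptions made for the example, and the simulation does not check the variance lower bound that the theorem requires.

```python
import random


def lcs_length(x, y):
    """Length of a longest common subsequence of x and y (O(len(x) * len(y)) DP)."""
    prev = [0] * (len(y) + 1)  # DP row for the prefix of x processed so far
    for xi in x:
        curr = [0]
        for j, yj in enumerate(y, start=1):
            if xi == yj:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def sample_lc(n, alphabet="ab", weights=None, rng=None):
    """Draw two independent words of length n with i.i.d. letters from the
    same law and return the length of their longest common subsequences.
    `alphabet` and `weights` are illustrative; weights=None means uniform."""
    rng = rng or random.Random()
    x = rng.choices(alphabet, weights=weights, k=n)
    y = rng.choices(alphabet, weights=weights, k=n)
    return lcs_length(x, y)


def standardized_samples(n, trials, alphabet="ab", weights=None, seed=0):
    """Empirically center and scale `trials` samples (trials >= 2),
    using the sample mean and sample standard deviation."""
    rng = random.Random(seed)
    values = [sample_lc(n, alphabet, weights, rng) for _ in range(trials)]
    mean = sum(values) / trials
    var = sum((v - mean) ** 2 for v in values) / (trials - 1)
    if var == 0:
        raise ValueError("all samples are equal; cannot standardize")
    sd = var ** 0.5
    return [(v - mean) / sd for v in values]
```

For example, a histogram of `standardized_samples(200, 500, weights=[0.8, 0.2])` can be compared by eye with a standard normal density. Such a plot is only a sanity check and proves nothing about the limit.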
Taxonomy
Topics: Bayesian Methods and Mixture Models · Algorithms and Data Compression · DNA and Biological Computing
