Longest common subsequences between words of very unequal length
Boris Bukh, Zichao Dong

TL;DR
This paper studies the expected length of the longest common subsequence (LCS) between two random words of very different lengths over a k-symbol alphabet. It shows how this expectation depends on the ratio of the two lengths and gives bounds for large alphabets.
Contribution
It determines the order of the LCS constant γ_{k,ε}, showing that it equals 1 minus a quantity of order ε^2, and gives evidence for its limiting value as k grows large.
Findings
γ_{k,ε} is of order 1 - cε^2 uniformly in k and ε
For large k, there is evidence that γ_{k,ε} approaches 1 - \frac{1}{4}ε^2
A lower bound on γ_{k,ε} matching the large-k value 1 - \frac{1}{4}ε^2 is proved
Abstract
We consider the expected length of the longest common subsequence between two random words of lengths n and (1 - ε)kn over a k-symbol alphabet. It is well-known that this quantity is asymptotic to γ_{k,ε} n for some constant γ_{k,ε}. We show that γ_{k,ε} is of the order 1 - cε^2 uniformly in k and ε. In addition, for large k, we give evidence that γ_{k,ε} approaches 1 - \frac{1}{4}ε^2, and prove a matching lower bound.
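The quantity in the abstract can be explored numerically. The following is a minimal Monte Carlo sketch, not the authors' method: it assumes the two word lengths are n and (1 - ε)kn, draws letters independently and uniformly from the k-symbol alphabet, and averages the LCS length divided by n. The function names `lcs_length` and `estimate_gamma` and all parameter values are hypothetical, chosen for illustration. For finite n the result is only a rough proxy for the limiting constant γ_{k,ε}.

```python
import random


def lcs_length(a, b):
    """Length of the longest common subsequence of a and b (quadratic-time DP)."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer word so each DP row stays short
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def estimate_gamma(n, k, eps, trials=20, seed=0):
    """Average LCS(u, v) / n over random words u, v of lengths n and round((1 - eps) * k * n).

    Assumption: letters are i.i.d. uniform on {0, ..., k-1}. This is a
    finite-n estimate, not the limiting constant itself.
    """
    rng = random.Random(seed)
    m = round((1 - eps) * k * n)
    total = 0
    for _ in range(trials):
        u = [rng.randrange(k) for _ in range(n)]
        v = [rng.randrange(k) for _ in range(m)]
        total += lcs_length(u, v)
    return total / (trials * n)


if __name__ == "__main__":
    # Compare a finite-n estimate against the large-k expression 1 - eps^2 / 4.
    eps = 0.5
    print(estimate_gamma(n=200, k=8, eps=eps, trials=5), 1 - eps**2 / 4)
```

Because the LCS can never be longer than the shorter word, the estimate always lies in [0, 1]. Increasing n and k should bring it closer to the regime the abstract describes, at a cost quadratic in n per trial.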
Taxonomy
Topics: Random Matrices and Applications · Advanced Combinatorial Mathematics · Coding theory and cryptography
