On an alternative sequence comparison statistic of Steele
Ümit Işlak, Alperen Y. Özdemir

TL;DR
This paper analyzes Steele's string comparison statistic, providing asymptotic results for its moments and distribution in random words and permutations, offering an alternative to the longest common subsequence measure.
Contribution
It offers new asymptotic analyses of Steele's statistic and a variation of it. The statistic was proposed as an alternative to the length of the longest common subsequence, whose variance problem remains open.
Findings
Derived moment asymptotics for Steele's statistic
Established distributional asymptotics in random models
Studied Steele's 1982 statistic as an alternative to the longest common subsequence measure
Abstract
The purpose of this paper is to study a statistic for comparing the similarity of two strings, first introduced by Michael Steele in 1982. It was proposed as an alternative to the length of the longest common subsequence, for which the variance problem is still open. Our results include moment asymptotics and distributional asymptotics for Steele's statistic and a variation of it in random words and random permutations.
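To make the two quantities in the abstract concrete, here is a minimal Python sketch. The abstract does not state the definition of Steele's statistic, so the second function rests on an assumption: that the statistic counts, for a fixed length k, the pairs of increasing index tuples (i_1 < ... < i_k) and (j_1 < ... < j_k) with x[i_t] == y[j_t] for every t. The function names `lcs_length` and `count_matching_index_pairs` are hypothetical and chosen only for illustration.

```python
def lcs_length(x, y):
    """Length of the longest common subsequence of x and y (standard DP)."""
    prev = [0] * (len(y) + 1)
    for a in x:
        cur = [0]
        for j, b in enumerate(y, start=1):
            if a == b:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def count_matching_index_pairs(x, y, k):
    """Count pairs of increasing index tuples of length k that match letter by letter.

    ASSUMPTION: this is taken as the form of Steele's statistic; the
    definition is not given in the text above.
    """
    n, m = len(x), len(y)
    # prev[i][j] = count for the previous tuple length within x[:i], y[:j].
    # For length 0 the empty pair of tuples is counted exactly once.
    prev = [[1] * (m + 1) for _ in range(n + 1)]
    for _ in range(k):
        cur = [[0] * (m + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                # Inclusion-exclusion over pairs that skip the last index of x or of y.
                cur[i][j] = cur[i - 1][j] + cur[i][j - 1] - cur[i - 1][j - 1]
                # Pairs that use both last indices must match them to each other.
                if x[i - 1] == y[j - 1]:
                    cur[i][j] += prev[i - 1][j - 1]
        prev = cur
    return prev[n][m]
```

Under this assumed definition the count is positive exactly when the longest common subsequence has length at least k, which is one way to see how the two measures relate.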
Taxonomy
Topics: Bayesian Methods and Mixture Models · Algorithms and Data Compression · Stochastic processes and statistical mechanics
