Longest Common Subsequence: Tabular vs. Closed-Form Equation Computation of Subsequence Probability
Alireza Abdi, Mohsen Hooshmand

TL;DR
This paper introduces a novel closed-form equation for probabilistic table calculation in the Longest Common Subsequence problem, enabling improved analysis and heuristics that outperform existing methods.
Contribution
It presents the first closed-form equation for probabilistic table computation in LCS, enhancing analysis and heuristic development.
Findings
Proposed methods outperform state-of-the-art LCS algorithms.
Introduced a new heuristic based on the Coefficient of Variation.
Developed an analytic approach for estimating remaining subsequence length.
Abstract
The Longest Common Subsequence (LCS) problem asks for the longest subsequence common to all strings in a given set. LCS is NP-hard, which has motivated a great deal of work on heuristic methods that find better solutions. The baseline for most well-known heuristic functions is a tabular probabilistic approach, which approximates the length of the LCS in each iteration. Combining beam search with heuristics based on this probability table has produced a large number of proposals and advances in algorithms for solving the LCS problem. In this work, we introduce a closed-form equation for the probabilistic table calculation for the first time. Moreover, we present other equivalent forms of the closed-form equation and prove all of them. The closed-form equation opens new ways for analysis and further approximations. Using the theorems and…
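To make the "tabular vs. closed-form" contrast concrete, here is a minimal sketch. It is not taken from the paper. It assumes the table follows the recurrence commonly used for probabilistic LCS heuristics, in which `P[k][q]` is the probability that a fixed string of length `k` is a subsequence of a uniformly random string of length `q` over an alphabet of size `sigma`. The function names `prob_table` and `prob_closed` are hypothetical. The closed form shown is the standard binomial tail, which satisfies that recurrence; whether it matches the equation derived in the paper is an assumption.

```python
from math import comb


def prob_table(max_k, max_q, sigma):
    """Fill the probability table with the assumed recurrence:
    P[0][q] = 1, P[k][q] = 0 for k > q, and otherwise
    P[k][q] = (1/sigma) * P[k-1][q-1] + (1 - 1/sigma) * P[k][q-1].
    """
    p = 1.0 / sigma
    # Rows index the subsequence length k, columns the random string length q.
    P = [[0.0] * (max_q + 1) for _ in range(max_k + 1)]
    for q in range(max_q + 1):
        P[0][q] = 1.0  # the empty string is a subsequence of every string
    for k in range(1, max_k + 1):
        for q in range(k, max_q + 1):  # entries with k > q stay 0
            P[k][q] = p * P[k - 1][q - 1] + (1.0 - p) * P[k][q - 1]
    return P


def prob_closed(k, q, sigma):
    """Binomial-tail form (assumed, not quoted from the paper):
    the sum over i = k..q of C(q, i) * p^i * (1 - p)^(q - i), with p = 1/sigma.
    """
    if k > q:
        return 0.0
    p = 1.0 / sigma
    return sum(comb(q, i) * p**i * (1.0 - p) ** (q - i) for i in range(k, q + 1))


if __name__ == "__main__":
    table = prob_table(5, 10, sigma=4)
    # Both routes should agree up to floating-point rounding.
    print(table[3][10], prob_closed(3, 10, sigma=4))
```

Under these assumptions, the table costs time proportional to `max_k * max_q` to fill once and then gives constant-time lookups, while the closed form evaluates a single entry directly without building the table.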
Taxonomy
Topics: Algorithms and Data Compression · Data Management and Algorithms · Web Data Mining and Analysis
