Does Preprocessing help in Fast Sequence Comparisons?
Elazar Goldenberg, Aviad Rubinstein, Barna Saha

TL;DR
This paper investigates how preprocessing each string separately can speed up exact and approximate edit distance computation, especially when many pairs from the same pool of strings are compared, and gives new, faster algorithms in this model.
Contribution
It introduces novel preprocessing-based algorithms for exact and approximate edit distance computation, outperforming previous methods and enabling faster comparisons in large string pools.
Findings
Exact permutation-LCS computation with $O(n \,\log n)$ preprocessing
Exact edit distance for small $k$ with $O(n \,\log n)$ preprocessing
Approximate edit distance within factor $(7+o(1))$ in subquadratic query time
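For context on the permutation-LCS finding, the following is a minimal sketch of the classical reduction from LCS of two permutations to longest increasing subsequence (LIS). It is not the paper's algorithm: the query here still costs $O(n \log n)$, and the function names `preprocess` and `permutation_lcs_length` are hypothetical, chosen only for illustration. It does show the shape of the model, where one string is preprocessed on its own and the result is reused across queries.

```python
from bisect import bisect_left


def preprocess(perm):
    """Per-string preprocessing: map each symbol to its position in perm."""
    return {sym: i for i, sym in enumerate(perm)}


def permutation_lcs_length(pos_a, perm_b):
    """LCS length of two permutations, via LIS on relabelled positions.

    pos_a is the output of preprocess() for the first permutation.
    A common subsequence corresponds to an increasing run of positions.
    """
    seq = [pos_a[s] for s in perm_b if s in pos_a]
    tails = []  # tails[i] = smallest tail of an increasing run of length i + 1
    for x in seq:
        i = bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)
        else:
            tails[i] = x
    return len(tails)


pos = preprocess([3, 1, 4, 2, 5])
print(permutation_lcs_length(pos, [1, 2, 3, 4, 5]))
```

The position map is built once per permutation and can be reused for every pair it takes part in, which is the scenario the preprocessing model targets.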
Abstract
We study edit distance computation with preprocessing: the preprocessing algorithm acts on each string separately, and then the query algorithm takes as input the two preprocessed strings. This model is inspired by scenarios where we would like to compute edit distance between many pairs in the same pool of strings. Our results include: Permutation-LCS: If the LCS between two permutations has length $n-k$, we can compute it exactly with $O(n \log n)$ preprocessing and $O(k \log n)$ query time. Small edit distance: For general strings, if their edit distance is at most $k$, we can compute it exactly with $O(n \log n)$ preprocessing and $O(k^2 \log n)$ query time. Approximate edit distance: For the most general input, we can approximate the edit distance to within factor $(7+o(1))$ with preprocessing time $\tilde{O}(n^2)$ and query time $\tilde{O}(n^{1.5+o(1)})$…
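As a point of comparison for the small-edit-distance setting, here is a minimal sketch of the classical banded dynamic program, which decides whether the edit distance is at most $k$ in $O(nk)$ time without any preprocessing. This is a standard baseline, not the paper's query algorithm, and the name `bounded_edit_distance` is hypothetical.

```python
def bounded_edit_distance(a, b, k):
    """Return the edit distance of a and b if it is at most k, else None."""
    n, m = len(a), len(b)
    if abs(n - m) > k:
        return None  # the length gap alone already exceeds k
    INF = k + 1  # any value above k means "too far"
    width = 2 * k + 1
    # Band index d encodes the diagonal: j = i + d - k.
    prev = [INF] * width
    for d in range(width):
        j = d - k
        if 0 <= j <= m:
            prev[d] = min(j, INF)  # row 0: distance from "" to b[:j]
    for i in range(1, n + 1):
        cur = [INF] * width
        for d in range(width):
            j = i + d - k
            if j < 0 or j > m:
                continue
            if j == 0:
                cur[d] = min(i, INF)  # distance from a[:i] to ""
                continue
            best = prev[d] + (a[i - 1] != b[j - 1])  # match or substitute
            if d + 1 < width:
                best = min(best, prev[d + 1] + 1)  # delete a[i-1]
            if d >= 1:
                best = min(best, cur[d - 1] + 1)  # insert b[j-1]
            cur[d] = min(best, INF)
        prev = cur
    result = prev[m - n + k]
    return result if result <= k else None


print(bounded_edit_distance("kitten", "sitting", 3))
```

Only the $2k+1$ diagonals around the main diagonal are filled in, because a cell further away cannot lie on an alignment of cost at most $k$.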
Taxonomy
Topics: Algorithms and Data Compression · Complexity and Algorithms in Graphs · Semigroups and Automata Theory
