A Simple Sublinear Algorithm for Gap Edit Distance
Joshua Brakensiek, Moses Charikar, Aviad Rubinstein

TL;DR
This paper presents a simple sublinear-time algorithm for the gap edit distance problem, which asks to distinguish pairs of strings at small edit distance from pairs at large edit distance ($k$ versus $k^2$).
Contribution
The paper introduces a very simple sublinear-time algorithm for the gap edit distance problem, settling the open problem of obtaining truly sublinear time for the entire range of relevant $k$.
Findings
Runs in time $\tilde O(n/\sqrt{k})$ for the gap problem
Provides a $k$-vs-$k^2$ algorithm with $\tilde O(n)$ preprocessing and $\tilde O(n/k)$ query time
Improves on the query time of the recent [GRS'20] algorithm for the same one-sided preprocessing problem
Abstract
We study the problem of estimating the edit distance between two $n$-character strings. While exact computation in the worst case is believed to require near-quadratic time, previous work showed that in certain regimes it is possible to solve the following gap edit distance problem in sub-linear time: distinguish between inputs of distance $\le k$ and $> k^2$. Our main result is a very simple algorithm for this benchmark that runs in time $\tilde O(n/\sqrt{k})$, and in particular settles the open problem of obtaining a truly sublinear time for the entire range of relevant $k$. Building on the same framework, we also obtain a $k$-vs-$k^2$ algorithm for the one-sided preprocessing model with $\tilde O(n)$ preprocessing time and $\tilde O(n/k)$ query time (improving over a recent $\tilde O(n/k + k^2)$-query time algorithm for the same problem [GRS'20]).
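To make the problem statement concrete, here is a minimal Python sketch of a reference checker for the $k$-vs-$k^2$ gap problem. It is not the paper's algorithm and it is not sublinear: it reads both strings in full and uses a standard banded dynamic program to decide whether the edit distance is at most a threshold. The function names `edit_distance_at_most` and `gap_edit_distance_reference` are hypothetical, chosen for this illustration only.

```python
def edit_distance_at_most(a: str, b: str, k: int) -> bool:
    """Return True iff the edit distance between a and b is at most k.

    Banded dynamic programming: a cell (i, j) with |i - j| > k cannot lie
    on an alignment of cost <= k, so only O(k) cells per row are updated.
    """
    n, m = len(a), len(b)
    if abs(n - m) > k:
        return False
    inf = k + 1  # every value above k is treated as "too far"
    # prev[j] holds the (capped) distance between a[:i-1] and b[:j].
    prev = [j if j <= k else inf for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        lo = max(0, i - k)
        hi = min(m, i + k)
        if lo == 0:
            cur[0] = i  # delete the first i characters of a (i <= k here)
        for j in range(max(1, lo), hi + 1):
            substitute = prev[j - 1] + (0 if a[i - 1] == b[j - 1] else 1)
            delete = prev[j] + 1
            insert = cur[j - 1] + 1
            cur[j] = min(substitute, delete, insert, inf)
        prev = cur
    return prev[m] <= k


def gap_edit_distance_reference(a: str, b: str, k: int) -> str:
    """Classify (a, b) for the k-vs-k^2 gap problem.

    "close"  : distance <= k      (a gap algorithm must say "close")
    "far"    : distance >  k^2    (a gap algorithm must say "far")
    "either" : in between         (any answer is acceptable)
    """
    if edit_distance_at_most(a, b, k):
        return "close"
    if not edit_distance_at_most(a, b, k * k):
        return "far"
    return "either"
```

A checker like this is mainly useful as ground truth when testing a sublinear gap algorithm on small inputs: the sublinear algorithm only has to agree with it on the "close" and "far" cases.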
Taxonomy
Topics: Algorithms and Data Compression · Machine Learning and Algorithms · Network Packet Processing and Optimization
