TL;DR
This paper studies ED_a, a generalized edit distance in which substitutions are cheaper than insertions and deletions (indels), so that it interpolates between Hamming distance and classical edit distance, and gives exact and approximation algorithms for computing it.
Contribution
It presents a new parameterized edit distance model, together with an exact algorithm that is near-optimal under the Orthogonal Vectors Conjecture and a randomized approximation algorithm that runs in sublinear time for many parameter settings.
Findings
A simple deterministic exact algorithm for ED_a is near-optimal under the Orthogonal Vectors Conjecture.
A randomized $(1+\epsilon)$-approximation algorithm for ED_a runs in sublinear time for many parameter settings.
An algorithm for the $(k_I,k_S)$-alignment problem offers a bicriteria approximation with improved runtime.
Abstract
The edit distance between strings classically assigns unit cost to every character insertion, deletion, and substitution, whereas the Hamming distance only allows substitutions. In many real-life scenarios, insertions and deletions (abbreviated indels) appear frequently but significantly less so than substitutions. To model this, we consider substitutions being cheaper than indels, with cost $1/a$ for a parameter $a \ge 1$. This basic variant, denoted $\mathrm{ED}_a$, bridges classical edit distance ($a = 1$) with Hamming distance ($a \to \infty$), leading to interesting algorithmic challenges: Does the time complexity of computing $\mathrm{ED}_a$ interpolate between that of Hamming distance (linear time) and edit distance (quadratic time)? What about approximating $\mathrm{ED}_a$? We first present a simple deterministic exact algorithm for $\mathrm{ED}_a$ and further prove that it is near-optimal assuming the Orthogonal…
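To make the cost model concrete, here is a minimal sketch of the textbook quadratic-time dynamic program adapted to it, assuming indels cost 1 and substitutions cost $1/a$. It is not the paper's exact or approximation algorithm; the function name `ed_a` and the use of `Fraction` for exact arithmetic are illustrative choices, not taken from the paper.

```python
from fractions import Fraction


def ed_a(x: str, y: str, a: int = 1) -> Fraction:
    """Weighted edit distance: indels cost 1, substitutions cost 1/a.

    Textbook O(|x| * |y|) dynamic program, shown only to illustrate
    the cost model (hypothetical helper, not the paper's algorithm).
    """
    if a < 1:
        raise ValueError("a must be at least 1")
    sub = Fraction(1, a)  # substitution cost under the assumed model
    # prev[j] = distance between the current prefix of x and y[:j];
    # initially the empty prefix of x, so j insertions are needed.
    prev = [Fraction(j) for j in range(len(y) + 1)]
    for i, cx in enumerate(x, 1):
        cur = [Fraction(i)]  # x[:i] versus empty string: i deletions
        for j, cy in enumerate(y, 1):
            cur.append(min(
                prev[j] + 1,                               # delete cx
                cur[j - 1] + 1,                            # insert cy
                prev[j - 1] + (0 if cx == cy else sub),    # match or substitute
            ))
        prev = cur
    return prev[-1]
```

With `a = 1` this is ordinary Levenshtein distance. For equal-length strings and large `a`, the cheapest alignment uses substitutions only, so `a * ed_a(x, y, a)` equals the Hamming distance, which is the sense in which the model bridges the two.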
Videos
An Algorithmic Bridge Between Hamming and Levenshtein Distances (YouTube)
