Lower bounds for text indexing with mismatches and differences
Vincent Cohen-Addad (LIP6), Laurent Feuilloley (IRIF), Tatiana Starikovskaya (DI-ENS)

TL;DR
This paper establishes theoretical lower bounds for text indexing problems with mismatches and differences, showing inherent computational hardness and limitations for efficient data structures, especially as the mismatch parameter grows.
Contribution
It provides the first conditional and pointer-machine lower bounds for text indexing with mismatches, especially for small and moderate values of k, explaining the difficulty of achieving efficient solutions.
Findings
Lower bounds for k = Θ(log n) under the Strong Exponential Time Hypothesis (SETH)
Data structures constructible in polynomial time cannot have strongly sublinear, O(n^(1−δ)), query time for k = Θ(log n)
Exponential dependency on k is unavoidable for small k in current models
Abstract
In this paper we study lower bounds for the fundamental problem of text indexing with mismatches and differences. In this problem we are given a long string of length $n$, the "text", and the task is to preprocess it into a data structure such that given a query string $Q$, one can quickly identify substrings that are within Hamming or edit distance at most $k$ from $Q$. This problem is at the core of various problems arising in biology and text processing. While exact text indexing allows linear-size data structures with linear query time, text indexing with mismatches (or differences) seems to be much harder: All known data structures have exponential dependency on $k$ either in the space, or in the time bound. We provide conditional and pointer-machine lower bounds that make a step toward explaining this phenomenon. We start by demonstrating lower bounds for $k = \Theta(\log…
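To make the problem statement concrete, here is a minimal brute-force sketch of the query for the Hamming-distance (mismatches) variant. It is not a method from the paper: it does no preprocessing and simply scans every window of the text, which is the baseline an index is meant to beat. The function names `hamming_distance` and `find_with_mismatches` are hypothetical, chosen for illustration only.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_with_mismatches(text: str, query: str, k: int) -> list[int]:
    """Return every start position i such that the substring
    text[i : i + len(query)] is within Hamming distance k of query.

    Brute-force reference only: each query costs O(len(text) * len(query))
    time, with no index built over the text.
    """
    m = len(query)
    return [
        i
        for i in range(len(text) - m + 1)
        if hamming_distance(text[i : i + m], query) <= k
    ]


# Usage: with k = 0 only the exact occurrence is reported;
# with k = 1 the window "ban" also qualifies (one mismatch against "nan").
print(find_with_mismatches("banana", "nan", 0))  # [2]
print(find_with_mismatches("banana", "nan", 1))  # [0, 2]
```

The edit-distance (differences) variant would replace `hamming_distance` with an edit-distance computation and would have to consider substrings of varying length; it is omitted here to keep the sketch short.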
Taxonomy
Topics: Algorithms and Data Compression · Complexity and Algorithms in Graphs · Computability, Logic, AI Algorithms
