On Hardness of Jumbled Indexing
Amihood Amir, Timothy Chan, Moshe Lewenstein, Noa Lewenstein

TL;DR
This paper shows that, under a 3SUM-hardness assumption, jumbled indexing is unlikely to admit both truly subquadratic preprocessing time and truly sublinear query time, which explains why the naive algorithms have proved so hard to improve.
Contribution
The paper establishes conditional lower bounds, based on 3SUM-hardness assumptions, on the trade-off between preprocessing time and query time for jumbled indexing.
Findings
Under 3SUM-hardness, preprocessing must be nearly quadratic or query time nearly linear.
For fixed constant-size alphabets, analogous lower bounds apply under a stronger 3SUM-hardness assumption.
Provides theoretical evidence explaining the stagnation in improving jumbled indexing algorithms.
Abstract
Jumbled indexing is the problem of indexing a text T for queries that ask whether there is a substring of T matching a pattern represented as a Parikh vector, i.e., the vector of frequency counts for each character. Jumbled indexing has garnered a lot of interest in the last four years. There is a naive algorithm that preprocesses all answers in O(n^2 |Σ|) time allowing quick queries afterwards, and there is another naive algorithm that requires no preprocessing but has O(n log |Σ|) query time. Despite a tremendous amount of effort there has been little improvement over these running times. In this paper we provide good reason for this. We show that, under a 3SUM-hardness assumption, jumbled indexing for alphabets of size ω(1) requires Ω(n^(2-ε)) preprocessing time or Ω(n^(1-δ)) query time for any ε, δ > 0. In fact, under a…
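To make the two naive baselines mentioned in the abstract concrete, here is a minimal Python sketch. It is illustrative only and is not taken from the paper: the function names (`parikh_vector`, `jumbled_query_scan`, `build_jumbled_index`, `jumbled_query_indexed`) are hypothetical, the pattern is passed as a string rather than as an explicit vector, and the simple `Counter` comparison costs time proportional to the alphabet size per window, so this version does not match the tighter bounds usually quoted for the naive algorithms.

```python
from collections import Counter


def parikh_vector(s):
    """Return the Parikh vector of s as a hashable, order-independent key."""
    return frozenset(Counter(s).items())


def jumbled_query_scan(text, pattern):
    """No preprocessing: slide a window of len(pattern) across text."""
    m = len(pattern)
    if m == 0:
        return True
    if m > len(text):
        return False
    target = Counter(pattern)
    window = Counter(text[:m])
    if window == target:
        return True
    for i in range(m, len(text)):
        incoming, outgoing = text[i], text[i - m]
        window[incoming] += 1
        window[outgoing] -= 1
        if window[outgoing] == 0:
            # Drop zero entries so the equality check is exact.
            del window[outgoing]
        if window == target:
            return True
    return False


def build_jumbled_index(text):
    """Preprocess all answers: store the Parikh vector of every substring.

    There are quadratically many substrings, so this takes at least
    quadratic time in len(text).
    """
    index = set()
    n = len(text)
    for i in range(n):
        counts = Counter()
        for j in range(i, n):
            counts[text[j]] += 1
            index.add(frozenset(counts.items()))
    return index


def jumbled_query_indexed(index, pattern):
    """Answer a query with a single set lookup after preprocessing."""
    return len(pattern) == 0 or parikh_vector(pattern) in index
```

For example, with the text `"abcab"`, a query for `"bca"` succeeds under both approaches because the substring `"abc"` has the same character counts, while a query for `"aab"` fails because no length-3 window contains two `a`s. The conditional lower bounds in the paper suggest that substantially beating both of these baselines at once is unlikely.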
Taxonomy
Topics: Algorithms and Data Compression · Music and Audio Processing · Machine Learning and Algorithms
