Range Non-Overlapping Indexing
Hagai Cohen, Ely Porat

TL;DR
This paper introduces new indexing solutions for reporting a maximal set of non-overlapping occurrences of a pattern in a text, along with a generalized range version, using O(n) space for the basic problem and O(n log^ε n) space for the range version.
Contribution
It presents novel indexing algorithms for non-overlapping and range non-overlapping pattern occurrence queries with improved space and query time bounds.
Findings
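O(n) space and O(m + occ_{NO}) query time for non-overlapping indexing, where n is the text length, m is the pattern length, and occ_{NO} is the number of occurrences reported
O(n log^ε n) space, for some 0 < ε < 1, and O(m + log log n + occ_{ij,NO}) query time for range non-overlapping indexing
Both query times are linear in the pattern length and the output size, apart from an additive log log n term in the range version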
Abstract
We study the non-overlapping indexing problem: given a text T, preprocess it so that queries of the following form can be answered: given a pattern P, report the maximal set of non-overlapping occurrences of P in T. A generalization of this problem is range non-overlapping indexing, where in addition we are given two indexes i, j and must report the maximal set of non-overlapping occurrences between these two indexes. We suggest new solutions for these problems. For the non-overlapping problem, our solution uses O(n) space with query time O(m + occ_{NO}). For the range non-overlapping problem, we propose a solution with O(n \log^\epsilon n) space for some 0 < \epsilon < 1 and O(m + \log\log n + occ_{ij,NO}) query time.
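To make the two query types concrete, here is a minimal Python sketch of what a query is expected to return. It is a naive scan over the text, not the paper's index, and it achieves none of the stated bounds. The function names are hypothetical, positions are 0-indexed, and the range convention (an occurrence must lie entirely within positions i..j, inclusive) is an assumption, since the abstract only says "between these two indexes". Because all occurrences have the same length, greedily taking the leftmost occurrence that does not overlap the previous choice yields a largest possible set.

```python
def find_occurrences(text, pattern):
    """Return all (possibly overlapping) start positions of pattern in text."""
    if not pattern:
        return []
    starts = []
    pos = text.find(pattern)
    while pos != -1:
        starts.append(pos)
        # Advance by one so that overlapping occurrences are also found.
        pos = text.find(pattern, pos + 1)
    return starts


def non_overlapping(text, pattern):
    """Greedy left-to-right selection of non-overlapping occurrences."""
    chosen = []
    next_free = 0  # first position not covered by a chosen occurrence
    for start in find_occurrences(text, pattern):
        if start >= next_free:
            chosen.append(start)
            next_free = start + len(pattern)
    return chosen


def range_non_overlapping(text, pattern, i, j):
    """Same greedy selection, restricted to occurrences inside text[i..j].

    Assumption: i and j are 0-indexed and inclusive, and an occurrence
    counts only if it lies entirely within the range.
    """
    m = len(pattern)
    chosen = []
    next_free = i
    for start in find_occurrences(text, pattern):
        if start >= next_free and start + m - 1 <= j:
            chosen.append(start)
            next_free = start + m
    return chosen


if __name__ == "__main__":
    print(non_overlapping("abababa", "aba"))           # [0, 4]
    print(range_non_overlapping("aaaaa", "aa", 1, 4))  # [1, 3]
```

In the first call, the occurrence at position 2 is skipped because it overlaps the one chosen at position 0. The range version shows why the generalization is not a simple filter of the unrestricted answer: on "aaaaa" with pattern "aa", the unrestricted answer is [0, 2], while the answer for the range 1..4 is [1, 3].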
Taxonomy
Topics: Algorithms and Data Compression · Semigroups and Automata Theory · DNA and Biological Computing
