On Stabbing Queries for Generalized Longest Repeat
Bojian Xu

TL;DR
This paper extends longest repeat queries from point queries to interval queries. The proposed index retrieves all longest repeats covering any position interval of a string, with optimal construction and query time, which makes the query more usable in applications such as computational biology.
Contribution
It introduces an indexing structure, built in optimal time and space, that answers interval longest repeat queries in optimal time, whereas previous methods supported only point queries.
Findings
Supports interval queries in O(1) time after O(n) preprocessing
Finds all longest repeats covering an interval in optimal O(occ) time, where occ is the number of longest repeats covering that interval
Experiments show competitive performance with prior methods
Abstract
A longest repeat query on a string, motivated by its applications in many subfields including computational biology, asks for the longest repetitive substring(s) covering a particular string position (point query). In this paper, we extend the longest repeat query from point query to interval query, allowing the search for longest repeat(s) covering any position interval, and thus significantly improve the usability of the solution. Our method for interval query takes a different approach using the insight from a recent work on shortest unique substrings [1], as the prior work's approach for point query becomes infeasible in the setting of interval query. Using the critical insight from [1], we propose an indexing structure, which can be constructed in the optimal O(n) time and space for a string of size n, such that any future interval query can be answered in …
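To make the query definition concrete, the following is a minimal brute-force sketch in Python of an interval longest repeat query. It is not the paper's indexing structure and comes nowhere near its time bounds; it only illustrates what the query returns. The function names, the 0-indexed inclusive positions, and the choice to count overlapping occurrences as repeats are assumptions made for this illustration.

```python
def count_occurrences(s, sub):
    """Count occurrences of sub in s, allowing overlaps."""
    count = 0
    pos = s.find(sub)
    while pos != -1:
        count += 1
        pos = s.find(sub, pos + 1)
    return count


def longest_repeats_covering(s, i, j):
    """Return all (start, end) pairs, 0-indexed and inclusive, such that
    s[start:end + 1] occurs at least twice in s, covers positions i..j,
    and has the maximum length among such substrings.
    Returns an empty list if no repeat covers the interval."""
    n = len(s)
    if not (0 <= i <= j < n):
        raise ValueError("interval must satisfy 0 <= i <= j < len(s)")
    # Try candidate lengths from longest to shortest; a covering
    # substring must have length at least j - i + 1.
    for length in range(n, j - i, -1):
        hits = []
        # A covering substring must start at or before i and end at or after j.
        first_start = max(0, j - length + 1)
        last_start = min(i, n - length)
        for start in range(first_start, last_start + 1):
            if count_occurrences(s, s[start:start + length]) >= 2:
                hits.append((start, start + length - 1))
        if hits:
            return hits  # all longest repeats covering [i, j]
    return []
```

For example, in the string "banana" a point query at position 3 (the interval [3, 3]) is covered by two occurrences of the repeat "ana", so `longest_repeats_covering("banana", 3, 3)` reports both, while position 0 is covered by no repeat at all because "b" occurs only once.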
Taxonomy
Topics: Algorithms and Data Compression · DNA and Biological Computing · Network Packet Processing and Optimization
