Hardness of Regular Expression Matching with Extensions
Taisei Nogami, Yoshiki Nakamura, Tachio Terauchi

TL;DR
This paper establishes new computational lower bounds for regular expression matching with extensions, showing that many such problems cannot be solved significantly faster than existing algorithms under common complexity hypotheses.
Contribution
It proves novel time complexity lower bounds for regex extensions like backreference, intersection, and complement, explaining the optimality of existing algorithms.
Findings
Under OVC, none of the matching problems with these extensions can be solved in time sub-quadratic in the string length $n$.
For regex matching with complement, the new lower bounds show that existing algorithms are nearly optimal.
The classical $O(n^3 m)$ algorithm for extended regex matching is essentially optimal.
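For readers unfamiliar with OVC, the following is a minimal sketch of the Orthogonal Vectors problem that the conjecture concerns, not code from the paper. The function name `has_orthogonal_pair` and the tuple-based input format are assumptions made for illustration. Given two sets of $n$ Boolean vectors of dimension $d$, the problem asks whether some pair has inner product zero. The brute-force check below takes $O(n^2 d)$ time, and OVC asserts that no algorithm runs in $O(n^{2-\varepsilon} \mathrm{poly}(d))$ time for any constant $\varepsilon > 0$.

```python
from itertools import product


def has_orthogonal_pair(A, B):
    """Brute-force Orthogonal Vectors check (illustrative name and signature).

    A and B are sequences of equal-length 0/1 tuples. Returns True if some
    a in A and b in B have no coordinate where both are 1, i.e. their
    inner product is zero.
    """
    for a, b in product(A, B):
        # a and b are orthogonal when every coordinate has at least one zero.
        if all(x == 0 or y == 0 for x, y in zip(a, b)):
            return True
    return False


if __name__ == "__main__":
    A = [(1, 0, 1), (0, 1, 1)]
    B = [(0, 1, 0), (1, 1, 1)]
    print(has_orthogonal_pair(A, B))
```

Lower bounds of the kind listed above are typically obtained by encoding such an instance as a matching instance, so that a faster matching algorithm would yield a faster Orthogonal Vectors algorithm.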
Abstract
The regular expression matching problem asks whether a given regular expression of length $m$ matches a given string of length $n$. As is well known, the problem can be solved in $O(nm)$ time using Thompson's algorithm. Moreover, recent studies have shown that regular expression matching extended with a practical extension called lookaround can be solved in the same time complexity. In this work, we consider four well-known extensions to regular expressions called backreference, squaring, intersection and complement. We prove a number of novel time complexity lower bounds for regular expression matching with these extensions under the Orthogonal Vectors Conjecture (OVC), $k$-OVC, $k$-Clique Hypothesis, and Combinatorial $k$-Clique Hypothesis. Some highlights of our results include the fact that none of the matching problems with the extensions can be solved in $n^{2-\varepsilon}…
