Membership Testing for Semantic Regular Expressions
Yifei Huang, Matin Amini, Alexis Le Glaunec, Konstantinos Mamouras, Mukund Raghothaman

TL;DR
This paper introduces an efficient NFA-based algorithm for membership testing of semantic regular expressions involving external oracles, with theoretical analysis, experimental validation, and insights into computational complexity and oracle query costs.
Contribution
It presents the first practical algorithm for SemRE membership testing, analyzes its complexity, and explores the connection to graph theory and oracle query minimization.
Findings
The algorithm runs in quadratic time relative to expression and string size.
Experimental results show the algorithm outperforms baseline methods.
A lower bound on oracle queries necessary for membership testing is established.
Abstract
SMORE (Chen et al., 2023) recently proposed the concept of semantic regular expressions that extend the classical formalism with a primitive to query external oracles such as databases and large language models (LLMs). Such patterns can be used to identify lines of text containing references to semantic concepts such as cities, celebrities, political entities, etc. The focus in their paper was on automatically synthesizing semantic regular expressions from positive and negative examples. In this paper, we study the membership testing problem: First, we present a two-pass NFA-based algorithm to determine whether a string matches a semantic regular expression (SemRE) in time, assuming the oracle responds to each query in unit time. In common situations, where oracle queries are not nested, we show that this procedure runs in time.…
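To make the membership problem concrete, here is a minimal Python sketch of what it means for a string to match a SemRE. It is a naive memoized matcher over substrings, not the two-pass NFA-based algorithm the abstract describes, and it makes no attempt to minimize oracle queries. The AST classes (Char, AnyChar, Concat, Union, Star, Query), the helper lit, and the set-backed oracle are all hypothetical names introduced for illustration; a Query node is assumed to accept a substring when its inner pattern matches and the oracle accepts that substring.

```python
from dataclasses import dataclass
from functools import lru_cache, reduce


# Hypothetical SemRE syntax tree; frozen dataclasses are hashable,
# which lets the matcher memoize on (node, start, end).
@dataclass(frozen=True)
class Char:
    c: str


@dataclass(frozen=True)
class AnyChar:
    pass


@dataclass(frozen=True)
class Concat:
    left: object
    right: object


@dataclass(frozen=True)
class Union:
    left: object
    right: object


@dataclass(frozen=True)
class Star:
    inner: object


@dataclass(frozen=True)
class Query:
    inner: object  # classical part the substring must match
    query: str     # question sent to the oracle about the substring


def lit(s):
    """Build a pattern matching the non-empty literal string s."""
    return reduce(Concat, [Char(ch) for ch in s])


def matches(pattern, w, oracle):
    """Return True if the whole string w matches pattern (naive baseline)."""

    @lru_cache(maxsize=None)
    def m(node, i, j):
        # Does w[i:j] match node?
        if isinstance(node, Char):
            return j == i + 1 and w[i] == node.c
        if isinstance(node, AnyChar):
            return j == i + 1
        if isinstance(node, Union):
            return m(node.left, i, j) or m(node.right, i, j)
        if isinstance(node, Concat):
            return any(m(node.left, i, k) and m(node.right, k, j)
                       for k in range(i, j + 1))
        if isinstance(node, Star):
            if i == j:
                return True
            # Force the first iteration to consume at least one character.
            return any(m(node.inner, i, k) and m(node, k, j)
                       for k in range(i + 1, j + 1))
        if isinstance(node, Query):
            return m(node.inner, i, j) and oracle(node.query, w[i:j])
        raise TypeError(f"unknown node: {node!r}")

    return m(pattern, 0, len(w))


# Assumed oracle: a tiny in-memory set standing in for a database or LLM.
CITIES = {"Paris", "Rome"}


def oracle(query, text):
    return query == "city" and text in CITIES


# "to " followed by any substring that the oracle recognizes as a city.
pattern = Concat(lit("to "), Query(Star(AnyChar()), "city"))
print(matches(pattern, "to Paris", oracle))
print(matches(pattern, "to Pluto", oracle))
```

The sketch consults the oracle for every candidate substring the classical part accepts, which is the cost the paper's analysis of oracle queries is concerned with.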
