Neuro-Symbolic Regex Synthesis Framework via Neural Example Splitting
Su-Hyeon Kim, Hyunjoon Cheon, Yo-Sub Han, Sang-Ki Ko

TL;DR
This paper introduces SplitRegex, a neural-guided divide-and-conquer framework for faster and more accurate regex synthesis from string examples, outperforming previous methods on benchmark datasets.
Contribution
The paper proposes a novel neural example splitting approach and a regex synthesis framework that improves speed and accuracy over existing methods.
Findings
SplitRegex improves significantly over previous regex synthesis methods in both speed and accuracy.
Effective division of positive strings enhances learning accuracy.
Framework successfully handles negative string constraints.
Abstract
Due to the practical importance of regular expressions (regexes, for short), there has been much research on automatically generating regexes from positive and negative string examples. We tackle the problem of learning regexes faster from positive and negative strings by relying on a novel approach called "neural example splitting". Our approach essentially splits each example string into multiple parts using a neural network trained to group similar substrings from positive strings. This helps to learn a regex faster and, thus, more accurately, since we now learn from several short strings. We propose an effective regex synthesis framework called "SplitRegex" that synthesizes subregexes from "split" positive substrings and produces the final regex by concatenating the synthesized subregexes. For the negative samples, we exploit pre-generated subregexes during the subregex…
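The split-then-concatenate idea in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation. The splits are supplied by hand instead of by a neural network, `synthesize_subregex` is a hypothetical, naive character-class generalizer, and negative strings are only checked against the final regex, which is an assumption and not the paper's handling of negatives. All function names are made up for illustration.

```python
import re


def char_class(ch):
    """Map one character to a coarse regex class (an illustrative choice)."""
    if ch.isdigit():
        return r"\d"
    if ch.isalpha():
        return "[A-Za-z]"
    return re.escape(ch)


def synthesize_subregex(parts):
    """Hypothetical stand-in synthesizer for one group of aligned substrings.

    Returns a literal if all parts are identical; otherwise a repeated
    character class (or alternation of classes) bounded by the part lengths.
    """
    if len(set(parts)) == 1:
        return re.escape(parts[0])
    classes = {char_class(c) for p in parts for c in p}
    if len(classes) == 1:
        body = classes.pop()
    else:
        body = "(?:" + "|".join(sorted(classes)) + ")"
    lo, hi = min(map(len, parts)), max(map(len, parts))
    quant = f"{{{lo}}}" if lo == hi else f"{{{lo},{hi}}}"
    return body + quant


def split_regex(split_positives, negatives):
    """Concatenate per-part subregexes, then validate against the examples.

    split_positives: one tuple of substrings per positive string, all with
    the same number of parts (assumed here to come from a splitter).
    Returns the regex and whether it accepts all positives and no negatives.
    """
    columns = list(zip(*split_positives))  # the i-th parts of every string
    regex = "".join(synthesize_subregex(col) for col in columns)
    pattern = re.compile(regex)
    positives = ["".join(parts) for parts in split_positives]
    ok = all(pattern.fullmatch(s) for s in positives) and not any(
        pattern.fullmatch(s) for s in negatives
    )
    return regex, ok


# Hand-written splits standing in for the neural splitter's output.
regex, ok = split_regex(
    [("ab", "-", "123"), ("xyz", "-", "45")],
    negatives=["ab123", "12-ab"],
)
print(regex, ok)
```

Each subproblem sees only short substrings, which is the source of the speed-up the abstract describes. In this sketch a failed negative check is only reported, not repaired.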
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Handwritten Text Recognition Techniques
