TransRegex: Multi-modal Regular Expression Synthesis by Generate-and-Repair
Yeting Li, Shuaimin Li, Zhiwu Xu, Jialun Cao, Zixuan Chen, Yun Hu, Haiming Chen, and Shing-Chi Cheung

TL;DR
TransRegex is a novel approach that combines natural language and example inputs to automatically generate and repair regular expressions, significantly improving accuracy over existing methods.
Contribution
It treats NLP-and-example-based regex synthesis as NLP-based synthesis with regex repair, presents new algorithms for both the synthesis and the repair steps, and outperforms the state-of-the-art tools it is evaluated against.
Findings
TransRegex achieves 17.4%, 35.8%, and 38.9% higher accuracy than NLP-based approaches on the three evaluation datasets, respectively.
It outperforms existing multi-modal techniques by 10% to 30%.
Making more effective use of both natural language descriptions and examples improves regex synthesis.
Abstract
Since regular expressions (abbrev. regexes) are difficult to understand and compose, automatically generating regexes has been an important research problem. This paper introduces TransRegex, for automatically constructing regexes from both natural language descriptions and examples. To the best of our knowledge, TransRegex is the first to treat the NLP-and-example-based regex synthesis problem as the problem of NLP-based synthesis with regex repair. For this purpose, we present novel algorithms for both NLP-based synthesis and regex repair. We evaluate TransRegex with ten relevant state-of-the-art tools on three publicly available datasets. The evaluation results demonstrate that the accuracy of our TransRegex is 17.4%, 35.8% and 38.9% higher than that of NLP-based approaches on the three datasets, respectively. Furthermore, TransRegex can achieve higher accuracy than the…
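The abstract frames the task as NLP-based synthesis plus regex repair. The sketch below is not the paper's algorithm. It only illustrates the decision point such a pipeline plausibly needs: checking whether a candidate regex (for example, one produced from a natural language description) is consistent with the positive and negative examples, and flagging it for repair when it is not. The function names, the candidate list, and the full-match convention are all assumptions made for illustration; the repair step itself is deliberately left out.

```python
import re
from typing import List, Optional, Sequence, Tuple


def is_consistent(pattern: str,
                  positives: Sequence[str],
                  negatives: Sequence[str]) -> bool:
    """True if `pattern` fully matches every positive and no negative example."""
    try:
        compiled = re.compile(pattern)
    except re.error:
        # A candidate that does not even compile cannot be consistent.
        return False
    accepts_all_positives = all(compiled.fullmatch(s) for s in positives)
    rejects_all_negatives = not any(compiled.fullmatch(s) for s in negatives)
    return accepts_all_positives and rejects_all_negatives


def select_or_flag(candidates: List[str],
                   positives: Sequence[str],
                   negatives: Sequence[str]) -> Tuple[Optional[str], bool]:
    """Return (pattern, needs_repair).

    Hypothetical helper: picks the first candidate consistent with the
    examples. If none is consistent, returns the first candidate with
    needs_repair=True so a separate repair step (not shown) can take over.
    """
    for pattern in candidates:
        if is_consistent(pattern, positives, negatives):
            return pattern, False
    first = candidates[0] if candidates else None
    return first, True


# Usage: the description "one or more digits" might yield these candidates.
positives = ["123", "7"]
negatives = ["abc", "12a"]
chosen, needs_repair = select_or_flag([r"[a-z]+", r"[0-9]+"], positives, negatives)
```

In this usage example the second candidate is consistent with the examples, so no repair would be requested. With only `[a-z]+` as a candidate, the helper would return that pattern flagged for repair.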
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Software Engineering Research
