On the Uniform Distribution of Regular Expressions
Sabine Broda, António Machiavelo, Nelma Moreira, Rogério Reis

TL;DR
This paper studies a subset of regular expressions that avoid a given absorbing pattern and shows that, although this subset is significantly smaller than the full set, the asymptotic average estimates of automaton size are the same for both.
Contribution
It introduces a refined set of regular expressions that avoid a given absorbing pattern and shows that the asymptotic automaton size estimates for this set match those of the standard set.
Findings
The subset is significantly smaller than the full set.
Asymptotic estimates for automaton size are unchanged.
Refined sets can preserve average-case properties.
Abstract
Although regular expressions do not correspond univocally to regular languages, it is still worthwhile to study their properties and algorithms. For the average case analysis one often relies on the uniform random generation using a specific grammar for regular expressions, that can represent regular languages with more or less redundancy. Generators that are uniform on the set of expressions are not necessarily uniform on the set of regular languages. Nevertheless, it is not straightforward that asymptotic estimates obtained by considering the whole set of regular expressions are different from those obtained using a more refined set that avoids some large class of equivalent expressions. In this paper we study a set of expressions that avoid a given absorbing pattern. It is shown that, although this set is significantly smaller than the standard one, the asymptotic average estimates…
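To make the notion of uniform random generation from a grammar concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' construction: the grammar (ε, letters, union, concatenation, star), the two-letter alphabet, the size measure (number of nodes in the syntax tree), and the names `count` and `sample` are all hypothetical choices. It counts expressions of each size and then draws one uniformly with the standard recursive method. Uniformity here is over expressions, not languages: for example, `(a+a)` and `a` are distinct expressions that denote the same language.

```python
import random
from functools import lru_cache

ALPHABET = ("a", "b")  # assumed alphabet, for illustration only


@lru_cache(maxsize=None)
def count(n: int) -> int:
    """Number of expression trees with exactly n nodes (assumed grammar)."""
    if n <= 0:
        return 0
    if n == 1:
        return len(ALPHABET) + 1  # one leaf per letter, plus epsilon
    total = count(n - 1)  # star applied to a tree with n-1 nodes
    for i in range(1, n - 1):  # left subtree has i nodes, right has n-1-i
        total += 2 * count(i) * count(n - 1 - i)  # union and concatenation
    return total


def sample(n: int, rng: random.Random) -> str:
    """Draw one expression with n >= 1 nodes uniformly at random."""
    if n == 1:
        return rng.choice(ALPHABET + ("ε",))
    r = rng.randrange(count(n))  # index into all expressions of size n
    if r < count(n - 1):
        return "(" + sample(n - 1, rng) + ")*"
    r -= count(n - 1)
    for i in range(1, n - 1):
        block = count(i) * count(n - 1 - i)  # trees with this split, per operator
        for op in ("+", "."):
            if r < block:
                return "(" + sample(i, rng) + op + sample(n - 1 - i, rng) + ")"
            r -= block
    raise AssertionError("unreachable: r always falls in some block")
```

Calling `sample(7, random.Random(0))` returns a fully parenthesized expression with seven nodes. Each root shape is chosen with probability proportional to the number of expressions that have it, which is what makes the draw uniform over the set of expressions of that size.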
