Regular expression length via arithmetic formula complexity
Ehud Cseresnyes, Hannes Seiwert

TL;DR
This paper establishes new lower bounds on the length of regular expressions for finite languages by leveraging techniques from arithmetic circuit complexity, connecting regular expression size to arithmetic formula complexity.
Contribution
It introduces a reduction from regular expression length to monotone arithmetic formula size and adapts a lower bound method for multilinear arithmetic formulas to regular expressions, yielding almost tight bounds for specific languages.
Findings
Lower bounds for regular expressions of binomial and Dyck languages.
Analysis of language operation blow-ups (intersection and shuffle).
Almost tight bounds for languages of binary numbers divisible by p, and permutations.
Abstract
We prove lower bounds on the length of regular expressions for finite languages by methods from arithmetic circuit complexity. First, we show a reduction: the length of a regular expression for a language L is bounded from below by the minimum size of a monotone arithmetic formula computing a polynomial that has L as its set of exponent vectors, viewing words as vectors. This result yields lower bounds for the binomial language of all words with exactly k ones and n − k zeros and for the language of all Dyck words of length 2n. We also determine the blow-up of language operations (intersection and shuffle) of regular expressions for finite languages. Second, we adapt a lower bound method for multilinear arithmetic formulas by so-called log-product polynomials to regular expressions. With this method we show almost tight lower bounds for the language of all…
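To make the phrase "viewing words as vectors" concrete, here is a minimal Python sketch for the binomial language. It assumes the simplest encoding, in which the letter at position i of a binary word becomes the exponent of variable x_i; the paper's exact encoding may differ. The function names are hypothetical and chosen for illustration only. Under this assumption, the exponent vectors of the elementary symmetric polynomial of degree k in n variables are exactly the words with k ones and n − k zeros.

```python
from itertools import combinations, product


def binomial_language(n, k):
    """All binary words of length n with exactly k ones (brute-force enumeration)."""
    return {
        "".join(bits)
        for bits in product("01", repeat=n)
        if bits.count("1") == k
    }


def word_to_exponent_vector(word):
    """Assumed encoding: letter at position i is the exponent of variable x_i."""
    return tuple(int(letter) for letter in word)


def elementary_symmetric_exponents(n, k):
    """Exponent vectors of e_k(x_1, ..., x_n): one monomial per k-subset of variables."""
    return {
        tuple(1 if i in subset else 0 for i in range(n))
        for subset in combinations(range(n), k)
    }


if __name__ == "__main__":
    n, k = 4, 2
    words = binomial_language(n, k)
    vectors = {word_to_exponent_vector(w) for w in words}
    # The two sets coincide under the assumed encoding.
    print(sorted(words))
    print(vectors == elementary_symmetric_exponents(n, k))
```

The sketch only checks that the language and the polynomial's exponent set coincide for small parameters. It says nothing about formula size, which is the quantity the lower bounds are about.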
Taxonomy
Topics: Semigroups and automata theory · Coding theory and cryptography · Advanced Combinatorial Mathematics
