Parikh Image of Pushdown Automata

Pierre Ganty; Elena Gutiérrez

arXiv:1706.08315·cs.FL·June 27, 2017


TL;DR

This paper analyzes the complexity of converting pushdown automata to context-free grammars and finite automata, establishing optimal bounds and exploring Parikh equivalence, especially for unary automata and deterministic cases.

Contribution

It proves the optimality of classical PDA-to-grammar conversion algorithms for unary automata and introduces a new efficient conversion for unary deterministic PDAs.

Findings

01. Conversion algorithm is optimal for unary PDAs.

02. Parikh equivalence simplifies automata comparison.

03. New bounds for unary deterministic PDA to grammar conversion.



Pierre Ganty (1) and Elena Gutiérrez (1, 2)

(1) IMDEA Software Institute, Madrid, Spain

(2) Universidad Politécnica de Madrid, Spain

Email: {pierre.ganty,elena.gutierrez}@imdea.org

Pierre Ganty has been supported by the Madrid Regional Government project S2013/ICE-2731, N-Greens Software - Next-GeneRation Energy-EfficieNt Secure Software, and the Spanish Ministry of Economy and Competitiveness project No. TIN2015-71819-P, RISCO - RIgorous analysis of Sophisticated COncurrent and distributed systems.

Elena Gutiérrez is partially supported by BES-2016-077136 grant from the Spanish Ministry of Economy, Industry and Competitiveness.

Abstract

We compare pushdown automata (PDAs for short) against other representations. First, we show that there is a family of PDAs over a unary alphabet with $n$ states and $p\geq 2n+4$ stack symbols that accepts one single long word for which every equivalent context-free grammar needs $\Omega(n^{2}(p-2n-4))$ variables. This family shows that the classical algorithm for converting a PDA to an equivalent context-free grammar is optimal even when the alphabet is unary. Moreover, we observe that language equivalence and Parikh equivalence, which ignores the ordering between symbols, coincide for this family. We conclude that, when assuming this weaker equivalence, the conversion algorithm is also optimal. Second, Parikh’s theorem motivates the comparison of PDAs against finite state automata. In particular, the same family of unary PDAs gives a lower bound on the number of states of every Parikh-equivalent finite state automaton. Finally, we look into the case of unary deterministic PDAs. We show a new construction converting a unary deterministic PDA into an equivalent context-free grammar that achieves best known bounds.

1 Introduction

Given a context-free language, which representation, pushdown automata or context-free grammars, is more concise? This was the main question studied by Goldstine et al. [8] in a paper where they introduced an infinite family of context-free languages whose representation by a pushdown automaton is more concise than by context-free grammars. In particular, they showed that each language of the family is accepted by a pushdown automaton with $n$ states and $p$ stack symbols, but every context-free grammar needs at least $n^{2}p+1$ variables if $n>1$ ($p$ if $n=1$). Incidentally, the family shows that the translation of a pushdown automaton into an equivalent context-free grammar used in textbooks [9], which uses the same large number of $n^{2}p+1$ variables if $n>1$ ($p$ if $n=1$), is optimal in the sense that there is no other algorithm that always produces fewer grammar variables.

Today we revisit these questions but this time we turn our attention to the unary case. We define an infinite family of context-free languages as Goldstine et al. did, but our family differs drastically from theirs. Given $n\geq 1$ and $k\geq 1$, each member of our family is given by a PDA with $n$ states, $p=k+2n+4$ stack symbols and a single input symbol (their family has an alphabet of non-constant size). We show that, for each PDA of the family, every equivalent context-free grammar has $\Omega({n^{2}(p-2n-4)})$ variables. Therefore, this family shows that the textbook translation of a PDA into a language-equivalent context-free grammar is optimal (in a sense made precise in Section 4, Remark 1) even when the alphabet is unary. Note that if the alphabet is a singleton, equality over words (two words are equal if the same symbols appear at the same positions) coincides with Parikh equivalence (two words are Parikh-equivalent if each symbol occurs equally often in both words, but not necessarily at the same positions; e.g. $ab$ and $ba$ are Parikh-equivalent). Thus, we conclude that the conversion algorithm is also optimal for Parikh equivalence. We also investigate the special case of deterministic PDAs over a singleton alphabet for which equivalent context-free grammar representations of small size had been defined [3, 10]. We give a new definition for an equivalent context-free grammar given a unary deterministic PDA. Our definition is constructive (as far as we could tell the result of Pighizzini [10] is not) and achieves the best known bounds [3] by combining two known constructions.

Parikh’s theorem [11] states that every context-free language has the same Parikh image as some regular language. This allows us to compare PDAs against finite state automata (FSAs for short) for Parikh-equivalent languages. First, we use the same family of PDAs to derive a lower bound on the number of states of every Parikh-equivalent FSA. The comparison becomes simple as its alphabet is unary and it accepts one single word. Second, using this lower bound we show that the 2-step procedure chaining existing constructions:

(i) translate the PDA into a language-equivalent context-free grammar [9]; and

(ii) translate the context-free grammar into a Parikh-equivalent FSA [4]

yields optimal (in a sense made precise in Section 5, Remark 2) results in the number of states of the resulting FSA.

As a side contribution, we introduce a semantics of PDA runs as trees that we call actrees. The richer tree structure (compared to a sequence) makes it simpler to compare each PDA of the family with its smallest grammar representation.

Structure of the paper.

After preliminaries in Section 2 we introduce the tree-based semantics in Section 3. In Section 4 we compare PDAs and context-free grammars when they represent Parikh-equivalent languages. We will define the infinite family of PDAs and establish their main properties. We dedicate Section 4.2 to the special case of deterministic PDAs over a unary alphabet. Finally, Section 5 focuses on the comparison of PDAs against finite state automata for Parikh-equivalent languages.

2 Preliminaries

A pushdown automaton (or PDA) is a 6-tuple $(Q,\Sigma,\Gamma,\delta,q_{0},Z_{0})$ where $Q$ is a finite nonempty set of states including $q_{0}$ , the initial state; $\Sigma$ is the input alphabet; $\Gamma$ is the stack alphabet including $Z_{0}$ , the initial stack symbol; and $\delta$ is a finite subset of $Q\times\Gamma\times(\Sigma\cup\{\varepsilon\})\times Q\times\Gamma^{*}$ called the actions. We write $(q,X)\hookrightarrow_{b}(q^{\prime},\beta)$ to denote an action $(q,X,b,q^{\prime},\beta)\in\delta$ . We sometimes omit the subscript to the arrow.

An instantaneous description (or ID) of a PDA is a pair $(q,\beta)$ where $q\in Q$ and $\beta\in\Gamma^{*}$. We call the first component of an ID the state and the second the stack content. The initial ID consists of the initial state and the initial stack symbol for the stack content. When reasoning formally, we use the functions $\mathit{state}$ and $\mathit{stack}$ which, given an ID, return its state and stack content, respectively.

An action $(q,X)\hookrightarrow_{b}(q^{\prime},\beta)$ is enabled at ID $I$ if $\mathit{state}(I)=q$ and $(\mathit{stack}(I))_{1}=X$, where $(w)_{i}$ denotes the $i$-th symbol of $w$ if $1\leq i\leq|w|$ and $(w)_{i}=\varepsilon$ otherwise, and $|w|$ is the length of $w$. Given an ID $(q,X\gamma)$ enabling $(q,X)\hookrightarrow_{b}(q^{\prime},\beta)$, define the successor ID to be $(q^{\prime},\beta\gamma)$. We denote this fact as $(q,X\gamma)\vdash_{b}(q^{\prime},\beta\gamma)$, and call it a move that consumes $b$ from the input (when $b=\varepsilon$ the move does not consume input). We sometimes omit the subscript of $\vdash$ when the input consumed (if any) is not important. Given $n\geq 0$, a move sequence, denoted $I_{0}\vdash_{b_{1}}\cdots\vdash_{b_{n}}I_{n}$, is a finite sequence of IDs $I_{0}I_{1}\ldots I_{n}$ such that $I_{i}\vdash_{b_{i+1}}I_{i+1}$ for all $0\leq i<n$. The move sequence consumes $w$ (from the input) when $b_{1}\cdots b_{n}=w$. We concisely denote this fact as $I_{0}\vdash\stackrel{w}{\ldots}\vdash I_{n}$. A move sequence $I\vdash\cdots\vdash I^{\prime}$ is a quasi-run when $|\mathit{stack}(I)|=1$ and $|\mathit{stack}(I^{\prime})|=0$; and a run when, furthermore, $I$ is the initial ID. Define the language of a PDA $P$ as $L(P)=\{w\in\Sigma^{*}\mid P\text{ has a run consuming }w\}$.

The Parikh image of a word $w$ over an alphabet $\{b_{1},\ldots,b_{n}\}$ , denoted by $\lbag w\rbag$ , is the vector $(x_{1},\ldots,x_{n})\in\mathbb{N}^{n}$ such that $x_{i}$ is the number of occurrences of $b_{i}$ in $w$ . The Parikh image of a language $L$ , denoted by $\lbag L\rbag$ , is the set of Parikh images of its words. When $\lbag L_{1}\rbag=\lbag L_{2}\rbag$ , we say $L_{1}$ and $L_{2}$ are Parikh-equivalent.
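As a small illustration of the definition above, the following sketch computes Parikh images for words and finite languages. The function names and the representation of an alphabet as an ordered tuple are assumptions made for this example only.

```python
def parikh_image(word, alphabet):
    """Vector of occurrence counts of each alphabet symbol in `word`.

    `alphabet` is an ordered tuple (b_1, ..., b_n); this ordering is an
    assumption of the sketch and fixes the coordinates of the vector.
    """
    return tuple(word.count(symbol) for symbol in alphabet)


def parikh_image_of_language(language, alphabet):
    """Set of Parikh images of the words of a finite language."""
    return {parikh_image(word, alphabet) for word in language}


def parikh_equivalent(lang1, lang2, alphabet):
    """Two finite languages are Parikh-equivalent when their images coincide."""
    return parikh_image_of_language(lang1, alphabet) == parikh_image_of_language(lang2, alphabet)
```

Over a unary alphabet the image of a word is just its length, so Parikh equivalence and language equivalence coincide there, which is the observation the paper relies on.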

We assume the reader is familiar with the basics of finite state automata (or FSA for short) and context-free grammars (or CFG). Nevertheless we fix their notation as follows. We denote a FSA as a tuple $(Q,\Sigma,\delta,q_{0},F)$ where $Q$ is a finite set of states including the initial state $q_{0}$ and the final states $F$ ; $\Sigma$ is the input alphabet and $\delta\subseteq Q\times(\Sigma\cup\{\varepsilon\})\times Q$ is the set of transitions. We denote a CFG as a tuple $(V,\Sigma,S,R)$ where $V$ is a finite set of variables including $S$ the start variable, $\Sigma$ is the alphabet or set of terminals and $R\subseteq V\times(V\cup\Sigma)^{*}$ is a finite set of rules. Rules are conveniently denoted $X\rightarrow\alpha$ . Given a FSA $A$ and a CFG $G$ we denote their languages as $L(A)$ and $L(G)$ , respectively.

Finally, let us recall the translation of a PDA into an equivalent CFG.

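Given a PDA $P=(Q,\Sigma,\Gamma,\delta,q_{0},Z_{0})$, define the CFG $G=(V,\Sigma,S,R)$ where

• The set $V$ of variables (often called the triples) is given by

$$V=\{[qXq^{\prime}]\mid q,q^{\prime}\in Q,\ X\in\Gamma\}\cup\{S\}. \tag{1}$$

• The set $R$ of production rules is given by

$$R=\{S\rightarrow[q_{0}Z_{0}q]\mid q\in Q\}\ \cup\ \{[qXr_{d}]\rightarrow b\,[q^{\prime}(\beta)_{1}r_{1}]\cdots[r_{d-1}(\beta)_{d}r_{d}]\mid(q,X)\hookrightarrow_{b}(q^{\prime},\beta),\ d=|\beta|,\ r_{1},\ldots,r_{d}\in Q\}. \tag{2}$$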

For a proof of correctness, see the textbook of Ullman et al. [9]. The previous definition easily translates into a conversion algorithm. Observe that the runtime of such an algorithm depends polynomially on $|Q|$ and $|\Gamma|$, but exponentially on $|\beta|$.
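The triple construction recalled above can be sketched as follows. This is a minimal illustration under stated assumptions, not the textbook's code: the tuple encoding of actions, the use of `""` for $\varepsilon$, and the function name `pda_to_cfg` are all hypothetical, and an action with empty $\beta$ is read in the standard way as the rule $[qXq^{\prime}]\rightarrow b$.

```python
from itertools import product


def pda_to_cfg(states, stack_symbols, actions, q0, z0):
    """Triple construction for a PDA accepting by empty stack.

    `actions` holds tuples (q, X, b, q_next, beta), with b == "" standing for
    epsilon and beta a tuple of stack symbols (hypothetical encoding).
    Returns (variables, rules); a rule is a pair (head, body).
    """
    # Variables: the start symbol plus every triple [q X q'].
    variables = {"S"} | set(product(states, stack_symbols, states))
    rules = [("S", [(q0, z0, q)]) for q in states]
    for q, X, b, q_next, beta in actions:
        terminal = [b] if b else []
        if not beta:
            # d = 0: the action pops X outright, read as [q X q_next] -> b.
            rules.append(((q, X, q_next), terminal))
            continue
        # d >= 1: one rule per choice of intermediate states r_1, ..., r_d,
        # hence |Q|^d rules for this action (the exponential dependence on |beta|).
        for rs in product(states, repeat=len(beta)):
            sources = (q_next,) + rs[:-1]
            body = terminal + [(src, sym, r) for src, sym, r in zip(sources, beta, rs)]
            rules.append(((q, X, rs[-1]), body))
    return variables, rules
```

For a PDA with $n$ states and $p$ stack symbols the returned variable set has $n^{2}p+1$ elements, which matches the count quoted in the introduction.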

3 A Tree-Based Semantics for Pushdown Automata

In this section we introduce a tree-based semantics for PDAs. Using trees instead of sequences sheds light on key properties needed to present our main results.

Given an action $a$ denoted by $(q,X)\hookrightarrow_{b}(q^{\prime},\beta)$ , $q$ is the source state of $a$ , $q^{\prime}$ the target state of $a$ , $X$ the symbol $a$ pops and $\beta$ the (possibly empty) sequence of symbols $a$ pushes.

A labeled tree $c(t_{1},\ldots,t_{k})$ $(k\geq 0)$ is a finite tree whose nodes are labeled, where $c$ is the label of the root and $t_{1},\ldots,t_{k}$ are labeled trees, the children of the root. When $k=0$ we prefer to write $c$ instead of $c()$ . Each labeled tree $t$ defines a sequence, denoted ${\overline{t}}$ , obtained by removing the symbols ‘(’, ‘)’ or ‘,’ when interpreting $t$ as a string, e.g. ${\overline{c(c_{1},c_{2}(c_{21}))}}=c\,c_{1}\,c_{2}\,c_{21}$ . The size of a labeled tree $t$ , denoted ${|{t}|}$ , is given by ${|{{\overline{t}}}|}$ . It coincides with the number of nodes in $t$ .

Definition 1

Given a PDA $P$, an action-tree (or actree for short) is a labeled tree $a(a_{1}(\ldots),\ldots,a_{d}(\ldots))$ where $a$ is an action of $P$ pushing $\beta$ with $|\beta|=d$ and each child $a_{i}(\ldots)$ is an actree such that $a_{i}$ pops $(\beta)_{i}$ for all $i$. Furthermore, an actree $t$ must satisfy that the source state of $(\overline{t})_{i+1}$ and the target state of $(\overline{t})_{i}$ coincide for every $i$.

An actree $t$ consumes an input resulting from replacing each action in the sequence ${\overline{t}}$ by the symbol it consumes (or $\varepsilon$ , if the action does not consume any). An actree $a(\ldots)$ is accepting if the initial ID enables $a$ .

Example 1

Consider a PDA $P$ with actions $a_{1}$ to $a_{5}$ respectively given by $(q_{0},X_{1})\hookrightarrow_{\varepsilon}(q_{0},X_{0}\,X_{0})$, $(q_{0},X_{0})\hookrightarrow_{\varepsilon}(q_{1},X_{1}\,\star)$, $(q_{1},X_{1})\hookrightarrow_{\varepsilon}(q_{1},X_{0}\,X_{0})$, $(q_{1},X_{0})\hookrightarrow_{b}(q_{1},\varepsilon)$ and $(q_{1},\star)\hookrightarrow_{\varepsilon}(q_{0},\varepsilon)$. The reader can check that the actree $t=a_{1}(a_{2}(a_{3}(a_{4},a_{4}),a_{5}),a_{2}(a_{3}(a_{4},a_{4}),a_{5}))$, depicted in Figure 1, satisfies the conditions of Definition 1 where $\overline{t}=a_{1}\,a_{2}\,a_{3}\,a_{4}\,a_{4}\,a_{5}\,a_{2}\,a_{3}\,a_{4}\,a_{4}\,a_{5}$, $|t|=11$ and the input consumed is $b^{4}$.
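The check suggested in Example 1 can be mechanised. The sketch below encodes the five actions and the tree $t$, then tests the two conditions of Definition 1. The encoding (nested `(label, children)` pairs, `"*"` for $\star$, `""` for $\varepsilon$) and all function names are assumptions made for illustration.

```python
# Actions of Example 1 as (source, popped, consumed, target, pushed).
ACTIONS = {
    "a1": ("q0", "X1", "", "q0", ("X0", "X0")),
    "a2": ("q0", "X0", "", "q1", ("X1", "*")),
    "a3": ("q1", "X1", "", "q1", ("X0", "X0")),
    "a4": ("q1", "X0", "b", "q1", ()),
    "a5": ("q1", "*", "", "q0", ()),
}

# The actree t of Example 1: a node is a pair (label, list of children).
LEFT = ("a2", [("a3", [("a4", []), ("a4", [])]), ("a5", [])])
T = ("a1", [LEFT, LEFT])


def flatten(tree):
    """Sequence of labels obtained by dropping parentheses and commas."""
    label, children = tree
    sequence = [label]
    for child in children:
        sequence.extend(flatten(child))
    return sequence


def is_actree(tree, actions):
    """Check both conditions of Definition 1."""
    def children_match(node):
        label, children = node
        pushed = actions[label][4]
        return (len(children) == len(pushed)
                and all(actions[child[0]][1] == symbol
                        for child, symbol in zip(children, pushed))
                and all(children_match(child) for child in children))

    sequence = flatten(tree)
    # Target state of each action must equal the source state of the next one.
    chained = all(actions[a][3] == actions[b][0]
                  for a, b in zip(sequence, sequence[1:]))
    return children_match(tree) and chained


def consumed_input(tree, actions):
    """Word obtained by replacing each action by the symbol it consumes."""
    return "".join(actions[label][2] for label in flatten(tree))
```

On this encoding the flattened sequence has 11 labels and the consumed input is `"bbbb"`, in line with the values stated in the example.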

We recall the notion of dimension of a labeled tree [5] and we relate dimension and size of labeled trees in Lemma 1.

Definition 2

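The dimension of a labeled tree $t$, denoted as $d(t)$, is inductively defined as follows. $d(t)=0$ if $t=c$, otherwise we have $t=c(t_{1},\ldots,t_{k})$ for some $k>0$ and

$$d(t)=\begin{cases}\max_{i\in\{1,\ldots,k\}}d(t_{i})&\text{if there is a unique maximum,}\\ \max_{i\in\{1,\ldots,k\}}d(t_{i})+1&\text{otherwise.}\end{cases}$$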

Example 2

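The annotation $\stackrel{d(t)}{t}(\ldots)$ shows that the actree of Example 1 has dimension $2$:

$$\stackrel{2}{a_{1}}(\stackrel{1}{a_{2}}(\stackrel{1}{a_{3}}(\stackrel{0}{a_{4}},\stackrel{0}{a_{4}}),\stackrel{0}{a_{5}}),\ \stackrel{1}{a_{2}}(\stackrel{1}{a_{3}}(\stackrel{0}{a_{4}},\stackrel{0}{a_{4}}),\stackrel{0}{a_{5}})).$$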

Lemma 1

$|t|\geq 2^{d(t)}$ for every labeled tree $t$.

The proof of the lemma is given in the Appendix. The actrees and the quasi-runs of a PDA are in one-to-one correspondence as reflected in Theorem 3.1 whose proof is in the Appendix.

Theorem 3.1

Given a PDA, its actrees and quasi-runs are in a one-to-one correspondence.
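To make Definition 2 and Lemma 1 concrete, here is a short sketch that computes the dimension and size of a labeled tree and evaluates them on the tree shape of Example 1. The nested `(label, children)` encoding and the function names are assumptions of this illustration.

```python
def dimension(tree):
    """Dimension of a labeled tree given as (label, list of children)."""
    _, children = tree
    if not children:
        return 0
    dims = [dimension(child) for child in children]
    top = max(dims)
    # Unique maximum: keep it; maximum attained at least twice: add one.
    return top if dims.count(top) == 1 else top + 1


def size(tree):
    """Number of nodes of the tree."""
    _, children = tree
    return 1 + sum(size(child) for child in children)


# Tree shape of Example 1.
LEFT = ("a2", [("a3", [("a4", []), ("a4", [])]), ("a5", [])])
T = ("a1", [LEFT, LEFT])

print(dimension(T), size(T))
```

The printed pair is the dimension and size of that tree; the inequality of Lemma 1 then reads `size(T) >= 2 ** dimension(T)`. A tree in which every node has at most one child has dimension 0 regardless of its size, which shows that the bound only runs in one direction.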

4 Parikh-Equivalent Context-free Grammars

In this section we compare PDAs against CFGs when they describe Parikh-equivalent languages. We first study the general class of (nondeterministic) PDAs and, in Section 4.2, we look into the special case of unary deterministic PDAs.

We prove that, for every $n\geq 1$ and $p\geq 2n+4$ , there exists a PDA with $n$ states and $p$ stack symbols for which every Parikh-equivalent CFG has $\Omega({n^{2}(p-2n-4)})$ variables. To this aim, we present a family of PDAs $P(n,k)$ where $n\geq 1$ and $k\geq 1$ . Each member of the family has $n$ states and $k+2n+4$ stack symbols, and accepts one single word over a unary input alphabet.

4.1 The Family $P(n,k)$ of PDAs

Definition 3

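Given natural values $n\geq 1$ and $k\geq 1$, define the PDA $P(n,k)$ with states $Q=\{q_{i}\mid 0\leq i\leq n-1\}$, input alphabet $\Sigma=\{b\}$, stack alphabet $\Gamma=\{S,\,\star,\,\$\}\cup\{X_{i}\mid 0\leq i\leq k\}\cup\{s_{i}\mid 0\leq i\leq n-1\}\cup\{r_{i}\mid 0\leq i\leq n-1\}$, initial state $q_{0}$, initial stack symbol $S$ and actions $\delta$:

$$\begin{array}[t]{r@{\;}c@{\;}lr}(q_{0},S)&\hookrightarrow_{b}&(q_{0},X_{k}\,r_{0})&\\ (q_{i},X_{j})&\hookrightarrow_{b}&(q_{i},X_{j-1}\,r_{m}\,s_{i}\,X_{j-1}\,r_{m})&\forall\,i,m\in\{0,\ldots,n-1\},\forall\,j\in\{1,\ldots,k\},\\ (q_{j},s_{i})&\hookrightarrow_{b}&(q_{i},\varepsilon)&\forall i,j\in\{0,\ldots,n-1\},\\ (q_{i},r_{i})&\hookrightarrow_{b}&(q_{i},\varepsilon)&\forall i\in\{0,\ldots,n-1\},\\ (q_{i},X_{0})&\hookrightarrow_{b}&(q_{i},X_{k}\,\star)&\forall i\in\{0,\ldots,n-1\},\\ (q_{i},X_{0})&\hookrightarrow_{b}&(q_{i+1},X_{k}\,\$)&\forall i\in\{0,\ldots,n-2\},\\ (q_{i},\star)&\hookrightarrow_{b}&(q_{i-1},\varepsilon)&\forall i\in\{1,\ldots,n-1\},\\ (q_{0},\$)&\hookrightarrow_{b}&(q_{n-1},\varepsilon)&\\ (q_{n-1},X_{0})&\hookrightarrow_{b}&(q_{n-1},\varepsilon)\end{array}$$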

Lemma 2

Given $n\geq 1$ and $k\geq 1$ , $P(n,k)$ has a single accepting actree consuming input $b^{N}$ where $N\geq 2^{n^{2}\,k}$ .

Proof

Fix values $n$ and $k$ and refer to the member of the family $P(n,k)$ as $P$ . We show that $P$ has exactly one accepting actree. We define a witness labeled tree $t$ inductively on the structure of the tree. Later we will prove that the induction is finite. First, we show how to construct the root and its children subtrees. This corresponds to case 1 below. Then, each non-leaf subtree is defined inductively in cases 2 to 5. Note that each non-leaf subtree of $t$ falls into one (and only one) of the cases. In fact, all cases are disjoint, in particular 2, 4 and 5. The reverse is also true: all cases describe a non-leaf subtree that does occur in $t$ . Finally, we show that each case describes uniquely how to build the next layer of children subtrees of a given non-leaf subtree.

1. $t=a(a_{1}(\ldots),a_{2})$ where $a=(q_{0},S)\hookrightarrow_{b}(q_{0},X_{k}\,r_{0})$ and $a_{1}(\ldots)$ and $a_{2}$ are of the form:

[TABLE]

Note that the initial ID $(q_{0},S)$ enables $a$, which is the only action of $P$ with this property. Note also that $\stackrel{d}{a}(\stackrel{d}{a_{1}}(\ldots),\stackrel{0}{a_{2}})$ holds, where $d>0$.

2. Each subtree whose root is labeled $a=(q_{i},X_{j})\hookrightarrow_{b}(q_{i},X_{j-1}\,r_{m}\,s_{i}\,X_{j-1}\,r_{m})$ with $i,m\in\{0,\ldots,n-1\}$ and $j\in\{2,\ldots,k\}$ has the form $a(a_{1}(\ldots),a_{2},a_{3},a_{1}(\ldots),a_{2})$ where

[TABLE]

Assume for now that $t$ is unique. Therefore, as the 1st and 4th child of $a$ share the same label $a_{1}$ , they also root the same subtree. Thus, it holds ( $d>0$ )

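$$\stackrel{d+1}{a}(\stackrel{d}{a_{1}}(\ldots),\stackrel{0}{a_{2}},\stackrel{0}{a_{3}},\stackrel{d}{a_{1}}(\ldots),\stackrel{0}{a_{2}}).$$

3. Each subtree whose root is labeled $a=(q_{i},X_{0})\hookrightarrow_{b}(q_{i+1},X_{k}\,\$)$ with $i\in\{0,\dots,n-2\}$ has the form $a(a_{1}(\ldots),a_{2})$ where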

[TABLE]

Note that $\stackrel{d}{a}(\stackrel{d}{a_{1}}(\ldots),\stackrel{0}{a_{2}})$ holds, where $d>0$.

4. Each subtree whose root is labeled $a=(q_{i},X_{1})\hookrightarrow_{b}(q_{i},X_{0}\,r_{m}\,s_{i}\,X_{0}\,r_{m})$ with $i\in\{0,\dots,n-1\}$ and $m\in\{0,\ldots,n-2\}$ has the form

$$a(a_{1}(a_{11}(\ldots),a_{12}),a_{2},a_{3},a_{1}(a_{11}(\ldots),a_{12}),a_{2}).$$

where

[TABLE]

Assume $a_{1}$ is given by the action $(q_{i},X_{0})\hookrightarrow_{b}(q_{i+1},X_{k}\,\$)$ instead. Then, following the action popping $\$$, we would end up in the state $q_{n-1}$, not enabling $a_{2}$ since $m<n-1$.

Again, assume for now that $t$ is unique. Hence, as the 1st and 4th child of $a$ are both labeled by $a_{1}$ , they root the same subtree. Thus, it holds ( $d>0$ )

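$$\stackrel{d+1}{a}(\stackrel{d}{a_{1}}(\stackrel{d}{a_{11}}(\ldots),\stackrel{0}{a_{12}}),\stackrel{0}{a_{2}},\stackrel{0}{a_{3}},\stackrel{d}{a_{1}}(\stackrel{d}{a_{11}}(\ldots),\stackrel{0}{a_{12}}),\stackrel{0}{a_{2}}).$$

5. Each subtree whose root is labeled $a=(q_{i},X_{1})\hookrightarrow_{b}(q_{i},X_{0}\,r_{n-1}\,s_{i}\,X_{0}\,r_{n-1})$ with $i\in\{0,\ldots,n-1\}$ has the form $a(a_{1}(\ldots),a_{2},a_{3},a_{1}(\ldots),a_{2})$ where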

[TABLE]

For both cases ( $i<n-1$ and $i=n-1$ ), assume $a_{1}$ is given by $(q_{i},X_{0})\hookrightarrow_{b}(q_{i},X_{k}\,\star)$ instead. Then, the action popping $\star$ must end up in the state $q_{n-1}$ in order to enable $a_{2}$ , i.e., it must be of the form $(q_{n},\star)\hookrightarrow_{b}(q_{n-1},\varepsilon)$ . Hence the action popping $X_{k}$ must be of the form $(q_{i},X_{k})\hookrightarrow_{b}(q_{i},X_{k-1}\,r_{m}\,s_{i}\,X_{k-1}\,r_{m})$ where necessarily $m=n$ , a contradiction (the stack symbol $r_{n}$ is not defined in $P$ ).

Assume for now that $t$ is unique. Then, as the 1st and 4th child of $a$ are labeled by $a_{1}$ , they root the same subtree (possibly a leaf). Thus, it holds ( $d\geq 0$ )

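$$\stackrel{d+1}{a}(\stackrel{d}{a_{1}}(\ldots),\stackrel{0}{a_{2}},\stackrel{0}{a_{3}},\stackrel{d}{a_{1}}(\ldots),\stackrel{0}{a_{2}}).$$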

We now prove that $t$ is finite by contradiction. Suppose $t$ is an infinite tree. König’s Lemma shows that $t$ has thus at least one infinite path, say $p$ , from the root. As the set of labels of $t$ is finite then some label must repeat infinitely often along $p$ . Let us define a strict partial order between the labels of the non-leaf subtrees of $t$ . We restrict to the non-leaf subtrees because no infinite path contains a leaf subtree. Let $a_{1}(\ldots)$ and $a_{2}(\ldots)$ be two non-leaf subtrees of $t$ . Let $q_{i_{1}}$ be the source state of $a_{1}$ and $q_{f_{1}}$ be the target state of the last action in the sequence ${\overline{a_{1}(\ldots)}}$ . Define $q_{i_{2}},q_{f_{2}}$ similarly for $a_{2}(\ldots)$ . Let $X_{j_{1}}$ be the symbol that $a_{1}$ pops and $X_{j_{2}}$ be the symbol that $a_{2}$ pops. Define $a_{1}\prec a_{2}$ iff

(a) either $i_{1}<i_{2}$ ,

(b) or $i_{1}=i_{2}$ and $f_{1}<f_{2}$ ,

(c) or $i_{1}=i_{2},f_{1}=f_{2}$ and $j_{1}>j_{2}$ .

First, note that the label $a$ of the root of $t$ (case 1) only occurs in the root as there is no action of $P$ pushing $S$. Second, relying on cases 2 to 5, we observe that every pair of non-leaf subtrees $a_{1}(\ldots)$ and $a_{2}(\ldots)$ (excluding the root) such that $a_{1}(\ldots)$ is the parent node of $a_{2}(\ldots)$ verifies $a_{1}(\ldots)\prec a_{2}(\ldots)$. Using the transitive property of the strict partial order $\prec$, we conclude that every pair of subtrees $a_{1}(\ldots)$ and $a_{2}(\ldots)$ in $p$ such that $a_{1}(\ldots a_{2}(\ldots)\ldots)$ verifies $a_{1}(\ldots)\prec a_{2}(\ldots)$. Therefore, no repeated label can occur in $p$ (contradiction). We conclude that $t$ is finite.

The reader can observe that $t=a(\ldots)$ verifies all conditions of the definition of actree (Definition 1) and the initial ID enables $a$ , thus it is an accepting actree of $P$ . Since we also showed that no other tree can be defined using the actions of $P$ , $t$ is unique.

Finally, we give a lower bound on the length of the word consumed by $t$ . To this aim, we prove that $d(t)=n^{2}\,k$ . Then since all actions consume input symbol $b$ , Lemma 1 shows that the word $b^{N}$ consumed is such that $N\geq 2^{n^{2}\,k}$ .

Note that, if a subtree of $t$ verifies case $1$ or $3$, its dimension remains the same w.r.t. its children subtrees. Otherwise, the dimension always grows. Recall that all cases from 1 to 5 describe a set of labels that does occur in $t$. Also, as $t$ is unique, no path from the root to a leaf repeats a label. Thus, to compute the dimension of $t$ it is enough to count the number of distinct labels of $t$ that are included in cases $2$, $4$ and $5$, which is equivalent to computing the size of the set

$$D=\{(q_{i},X_{j})\hookrightarrow(q_{i},X_{j-1}\,r_{m}\,s_{i}\,X_{j-1}\,r_{m})\mid 1\leq j\leq k,\ 0\leq i,m\leq n-1\}.$$

Clearly ${|{D}|}=n^{2}\,k$ from which we conclude that $d(t)=n^{2}\,k$ . Hence, ${|{t}|}\geq 2^{n^{2}\,k}$ and therefore $t$ consumes a word $b^{N}$ where $N\geq 2^{n^{2}\,k}$ since each action of $t$ consumes a $b$ . ∎

The reader can find in the Appendix a depiction of the accepting actree corresponding to $P(2,1)$ .
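For small parameters, the statement of Lemma 2 can be explored numerically. The sketch below builds the actions of $P(n,k)$ from Definition 3 and computes the length of a shortest accepted word by a least-fixpoint iteration over triples $(q,X,q^{\prime})$. Since Lemma 2 says that $P(n,k)$ accepts a single word, this shortest length is $N$. The encoding (`"*"` for $\star$, 4-tuples for actions) and the function names are assumptions of this illustration, and the fixpoint is a generic shortest-derivation computation rather than anything taken from the paper.

```python
import math
from itertools import product


def build_family_pda(n, k):
    """Actions of P(n, k) as (source, popped, target, pushed); each one reads b."""
    q = lambda i: f"q{i}"
    X = lambda j: f"X{j}"
    actions = [(q(0), "S", q(0), (X(k), "r0"))]
    for i, m, j in product(range(n), range(n), range(1, k + 1)):
        actions.append((q(i), X(j), q(i), (X(j - 1), f"r{m}", f"s{i}", X(j - 1), f"r{m}")))
    for i, j in product(range(n), range(n)):
        actions.append((q(j), f"s{i}", q(i), ()))
    for i in range(n):
        actions.append((q(i), f"r{i}", q(i), ()))
        actions.append((q(i), X(0), q(i), (X(k), "*")))
    for i in range(n - 1):
        actions.append((q(i), X(0), q(i + 1), (X(k), "$")))
    for i in range(1, n):
        actions.append((q(i), "*", q(i - 1), ()))
    actions.append((q(0), "$", q(n - 1), ()))
    actions.append((q(n - 1), X(0), q(n - 1), ()))
    return actions


def accepted_word_length(n, k):
    """Length of a shortest word accepted by P(n, k) (empty-stack acceptance)."""
    actions = build_family_pda(n, k)
    states = [f"q{i}" for i in range(n)]
    best = {}  # (q, X, q') -> length of a shortest quasi-run from (q, X) to (q', empty)
    changed = True
    while changed:
        changed = False
        for source, popped, target, pushed in actions:
            frontier = {target: 1}  # the action itself consumes one b
            for symbol in pushed:
                following = {}
                for state, cost in frontier.items():
                    for end in states:
                        step = best.get((state, symbol, end))
                        if step is not None and cost + step < following.get(end, math.inf):
                            following[end] = cost + step
                frontier = following
            for end, cost in frontier.items():
                if cost < best.get((source, popped, end), math.inf):
                    best[(source, popped, end)] = cost
                    changed = True
    return min(best[("q0", "S", end)] for end in states if ("q0", "S", end) in best)
```

Working the recurrences by hand gives $N=8$ for $P(1,1)$ and $N=106$ for $P(2,1)$, both above the bound $2^{n^{2}k}$ of Lemma 2 (that is, $2$ and $16$ respectively).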

Theorem 4.1

For each $n\geq 1$ and $p>2n+4$ , there is a PDA with $n$ states and $p$ stack symbols for which every Parikh-equivalent CFG has $\Omega(n^{2}(p-2n-4))$ variables.

Proof

Consider the family of PDAs $P(n,k)$ with $n\geq 1$ and $k\geq 1$ described in Definition 3. Fix $n$ and $k$ and refer to the corresponding member of the family as $P$ .

First, Lemma 2 shows that $L(P)$ consists of a single word $b^{N}$ with $N\geq 2^{n^{2}\,k}$ . It follows that a language $L$ is Parikh-equivalent to $L(P)$ iff $L$ is language-equivalent to $L(P)$ .

Let $G$ be a CFG such that $L(G)=L(P)$. The smallest CFG that generates exactly one word of length $\ell$ has size $\Omega(\log(\ell))$ [2, Lemma 1], where the size of a grammar is the sum of the lengths of all its rules. It follows that $G$ is of size $\Omega(\log(2^{n^{2}k}))=\Omega({n^{2}k})$. As $k=p-2n-4$, $G$ has size $\Omega({n^{2}(p-2n-4)})$. We conclude that $G$ has $\Omega({n^{2}\,(p-2n-4)})$ variables. ∎

Remark 1

The classical conversion algorithm, applied to $P(n,k)$, yields an equivalent CFG whose number of variables is in $\mathcal{O}(n^{2}(k+2n+4))=\mathcal{O}(n^{2}k+n^{3})$. On the other hand, Theorem 4.1 shows that a lower bound for the number of variables is $\Omega({n^{2}k})$. We observe that, as long as $n\leq Ck$ for some positive constant $C$, the family $P(n,k)$ shows that the conversion algorithm is optimal in the number of variables when assuming both language and Parikh equivalence: in that case $n^{3}\leq Cn^{2}k$, so the $n^{3}$ addend in $\mathcal{O}(n^{2}k+n^{3})$ is dominated by $n^{2}k$ and the lower and upper bounds coincide. Otherwise, there is a gap between the lower bound and the upper bound, so the family does not establish optimality. For instance, if $n=k^{2}$ then the upper bound is $\mathcal{O}(k^{5}+k^{6})=\mathcal{O}(k^{6})$ while the lower bound is $\Omega(k^{5})$.

4.2 The Case of Unary Deterministic Pushdown Automata

We have seen that the classical translation from PDA to CFG is optimal in the number of grammar variables for the family of unary nondeterministic PDA $P(n,k)$ when $n$ is in linear relation with respect to $k$ (see Remark 1). However, for unary deterministic PDA (UDPDA for short) the situation is different. Pighizzini [10] shows that for every UDPDA with $n$ states and $p$ stack symbols, there exists an equivalent CFG with at most $2np$ variables. Although he gives a definition of such a grammar, we were not able to extract an algorithm from it. On the other hand, Chistikov and Majumdar [3] give a polynomial time algorithm that transforms a UDPDA into an equivalent CFG going through the construction of a pair of straight-line programs. The size of the resulting CFG is linear in that of the UDPDA.

We propose a new polynomial time algorithm that converts a UDPDA with $n$ states and $p$ stack symbols into an equivalent CFG with $\mathcal{O}(np)$ variables. Our algorithm is based on the observation that the conversion algorithm from PDAs to CFGs need not consider all the triples in (1). We discard unnecessary triples using the saturation procedure [1, 6] that computes the set of reachable IDs.

For a given PDA $P$ with $q\in Q$ and $X\in\Gamma$, define the set of reachable IDs $R_{P}(q,X)$ as follows:

$$R_{P}(q,X)=\{(q^{\prime},\beta)\mid\exists\,(q,X)\vdash\cdots\vdash(q^{\prime},\beta)\}.$$

Lemma 3

If $P$ is a UDPDA then the set $\{I\in R_{P}(q,X)\mid\mathit{stack}(I)=\varepsilon\}$ has at most one element for every state $q$ and stack symbol $X$ .

Proof

Let $P$ be a UDPDA with $\Sigma=\{a\}$ . Since $P$ is deterministic we have that $(i)$ for every $q\in Q,X\in\Gamma$ and $b\in\Sigma\cup\{\varepsilon\}$ , ${|{\delta(q,b,X)}|}\leq 1$ and, $(ii)$ for every $q\in Q$ and $X\in\Gamma$ , if $\delta(q,\varepsilon,X)\neq\emptyset$ then $\delta(q,b,X)=\emptyset$ for every $b\in\Sigma$ .

The proof goes by contradiction. Assume that for some state $q$ and stack symbol $X$ , there are two IDs $I_{1}$ and $I_{2}$ in $R_{P}(q,X)$ such that $\mathit{stack}(I_{1})=\mathit{stack}(I_{2})=\varepsilon$ and $\mathit{state}(I_{1})\neq\mathit{state}(I_{2})$ .

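Necessarily, there exist three IDs $J$, $J_{1}$ and $J_{2}$ with $J_{1}\neq J_{2}$ such that the following holds:

$$(q,X)\vdash\cdots\vdash J\vdash_{a}J_{1}\vdash\cdots\vdash I_{1}\qquad\text{and}\qquad(q,X)\vdash\cdots\vdash J\vdash_{b}J_{2}\vdash\cdots\vdash I_{2}.$$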

It is routine to check that if $a=b$ then $P$ is not deterministic, a contradiction. Next, we consider the case $a\neq b$ . When $a$ and $b$ are symbols, because $P$ is a unary DPDA, then they are the same, a contradiction. Else if either $a$ or $b$ is $\varepsilon$ then $P$ is not deterministic, a contradiction. We conclude from the previous that when $\mathit{stack}(I_{1})=\mathit{stack}(I_{2})=\varepsilon$ , then necessarily $\mathit{state}(I_{1})=\mathit{state}(I_{2})$ and therefore that the set $\{I\in R_{P}(q,X)\mid\mathit{stack}(I)=\varepsilon\}$ has at most one element. ∎

Intuitively, Lemma 3 shows that, when fixing $q$ and $X$ , there is at most one $q^{\prime}$ such that the triple $[qXq^{\prime}]$ generates a string of terminals. We use this fact to prove the following theorem.

Theorem 4.2

For every UDPDA with $n$ states and $p$ stack symbols, there is a polynomial time algorithm that computes an equivalent CFG with at most $np$ variables.

Proof

The conversion algorithm translating a PDA $P$ to a CFG $G$ computes the set of grammar variables $\{[qXq^{\prime}]\mid q,q^{\prime}\in Q,X\in\Gamma\}$ . By Lemma 3, for each $q$ and $X$ there is at most one variable $[qXq^{\prime}]$ in the previous set generating a string of terminals. The consequence of the lemma is twofold:

(i) For the triples it suffices to compute the subset $T$ of the aforementioned generating variables. Clearly, ${|{T}|}\leq np$ .

(ii) Each action of $P$ now yields a single rule in $G$ . This is because in (2) there is at most one choice for $r_{1}$ to $r_{d}$ , hence we avoid the exponential blowup of the runtime in the conversion algorithm.

To compute $T$ given $P$ , we use the polynomial time saturation procedure [1, 6] which given $(q,X)$ computes a FSA for the set $R_{P}(q,X)$ . Then we compute from this set the unique state $q^{\prime}$ (if any) such that $(q^{\prime},\varepsilon)\in R_{P}(q,X)$ , hence $T$ . From the above we find that, given $P$ , we compute $G$ in polynomial time. ∎
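The construction in this proof can be illustrated with a small sketch. It is not the saturation procedure of [1, 6]: as a simplifying assumption, it swaps in a naive fixpoint that computes, for each pair $(q,X)$, the states $q^{\prime}$ with $(q^{\prime},\varepsilon)\in R_{P}(q,X)$, and then emits one rule per action by following the unique such state. The tuple encoding of actions and the function names are hypothetical.

```python
def popping_states(actions):
    """Map (q, X) to the set of states q' such that (q, X) reaches (q', empty stack).

    Each action is a tuple (q, b, X, q2, beta): in state q with X on top of
    the stack, read b ("" for epsilon), go to q2 and replace X by the tuple
    beta (top of stack first). Naive fixpoint, assumed here for simplicity.
    """
    pop = {}
    changed = True
    while changed:
        changed = False
        for q, _b, X, q2, beta in actions:
            current = {q2}
            for Y in beta:  # pop the pushed symbols one after the other
                current = {r for s in current for r in pop.get((s, Y), ())}
            known = pop.setdefault((q, X), set())
            if not current <= known:
                known |= current
                changed = True
    return pop


def udpda_rules(actions, pop):
    """Emit at most one rule per action, assuming the input is a UDPDA."""
    rules = []
    for q, b, X, q2, beta in actions:
        state, body = q2, ([b] if b else [])
        for Y in beta:
            targets = pop.get((state, Y), set())
            if len(targets) != 1:  # no generating triple: the action is useless
                break
            (nxt,) = targets
            body.append((state, Y, nxt))
            state = nxt
        else:
            rules.append(((q, X, state), body))
    return rules


actions = [
    ("q0", "a", "Z", "q0", ("X", "X")),
    ("q0", "a", "X", "q1", ()),
    ("q1", "", "X", "q1", ()),
]
pop = popping_states(actions)
rules = udpda_rules(actions, pop)
```

On this toy automaton every set computed by `popping_states` is a singleton, as Lemma 3 predicts, and the three actions yield exactly three rules.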

Up to this point, we have assumed the empty stack as the acceptance condition. For general PDAs, assuming final states or empty stack as the acceptance condition induces no loss of generality. The situation is different for deterministic PDAs, where accepting by final states is more general than accepting by empty stack. For this reason, we also consider the case where the UDPDA accepts by final states. Theorem 4.3 shows how our previous construction can be modified to accommodate acceptance by final states.

Theorem 4.3

For every UDPDA with $n$ states and $p$ stack symbols that accepts by final states, there is a polynomial time algorithm that computes an equivalent CFG with $\mathcal{O}(np)$ variables.

Proof

Let $P$ be a UDPDA with $n$ states and $p$ stack symbols that accepts by final states. We first translate $P=(Q,\Sigma,\Gamma,\delta,q_{0},Z_{0},F)$ , where $F\subseteq Q$ is the set of final states, into a (possibly nondeterministic) unary pushdown automaton $P^{\prime}=(Q^{\prime},\Sigma,\Gamma^{\prime},\delta^{\prime},q^{\prime}_{0},Z^{\prime}_{0})$ with an empty stack acceptance condition. In particular, $Q^{\prime}=Q\cup\{q^{\prime}_{0},\mathit{sink}\}$ ; $\Gamma^{\prime}=\Gamma\cup\{Z^{\prime}_{0}\}$ ; and $\delta^{\prime}$ is given by

[TABLE]

The new stack symbol $Z^{\prime}_{0}$ prevents $P^{\prime}$ from incorrectly accepting when $P$ is in a nonfinal state with an empty stack. The state $\mathit{sink}$ serves to empty the stack once $P$ enters a final state. Observe that $P^{\prime}$ need not be deterministic. Also, it is routine to check that $L(P^{\prime})=L(P)$ and that $P^{\prime}$ is computable in time linear in the size of $P$ . Now let us turn to $R_{P^{\prime}}(q,X)$ . For $P^{\prime}$ a weaker version of Lemma 3 holds: the set $H=\{I\in R_{P^{\prime}}(q,X)\mid\mathit{stack}(I)=\varepsilon\}$ has at most two elements for every state $q\in Q^{\prime}$ and stack symbol $X\in\Gamma^{\prime}$ . This is because if $H$ contains two IDs then necessarily one of them has $\mathit{sink}$ as its state.

Based on this result, we construct $T$ as in Theorem 4.2, but this time we have that ${|{T}|}$ is $\mathcal{O}(np)$ .

Now we turn to the set of production rules as defined in (2) (see Section 2). We show that each action $(q,X)\hookrightarrow_{b}(q^{\prime},\beta)$ of $P^{\prime}$ yields at most $d$ production rules in $G$ where $d={|{\beta}|}$ . For each state $r_{i}$ in (2) we have two choices, one of which is $\mathit{sink}$ . We also know that once a move sequence enters $\mathit{sink}$ it cannot leave it. Therefore, if $r_{i}=\mathit{sink}$ then $r_{i+1}=\cdots=r_{d}=\mathit{sink}$ . An action thus yields $d$ production rules: one where $r_{1}=\cdots=r_{d}=\mathit{sink}$ , another where $r_{2}=\cdots=r_{d}=\mathit{sink}$ , and so on. Hence, we avoid the exponential blowup of the runtime in the conversion algorithm.

The remainder of the proof follows that of Theorem 4.2. ∎

5 Parikh-Equivalent Finite State Automata

Parikh’s theorem [11] shows that every context-free language is Parikh-equivalent to a regular language. Using this result, we can compare PDAs against FSAs under Parikh equivalence. We start by deriving some lower bound using the family $P(n,k)$ . Because its alphabet is unary and it accepts a single long word, the comparison becomes straightforward.

Theorem 5.1

For each $n\geq 1$ and $p>2n+4$ , there is a PDA with $n$ states and $p$ stack symbols for which every Parikh-equivalent FSA has at least $2^{n^{2}(p-2n-4)}+1$ states.

Proof

Consider the family of PDAs $P(n,k)$ with $n\geq 1$ and $k\geq 1$ described in Definition 3. Fix $n$ and $k$ and refer to the corresponding member of the family as $P$ . By Lemma 2, $L(P)=\{b^{N}\}$ with $N\geq 2^{n^{2}k}$ . Then, the smallest FSA that is Parikh-equivalent to $L(P)$ needs $N+1$ states. As $k=p-2n-4$ , we conclude that the smallest Parikh-equivalent FSA has at least $2^{n^{2}(p-2n-4)}+1$ states. ∎
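As a quick sanity check on the arithmetic of this bound, the sketch below evaluates $2^{n^{2}(p-2n-4)}+1$ for concrete parameters. The function name is hypothetical, and the parameter check simply mirrors the constraints $n\geq 1$ and $k=p-2n-4\geq 1$ used in the proof.

```python
def parikh_fsa_lower_bound(n, p):
    """Evaluate the state lower bound 2^(n^2 (p - 2n - 4)) + 1 of Theorem 5.1."""
    k = p - 2 * n - 4
    if n < 1 or k < 1:
        raise ValueError("requires n >= 1 and p > 2n + 4")
    return 2 ** (n * n * k) + 1


# P(2, 1) has n = 2 states and p = 9 stack symbols, so k = 1.
print(parikh_fsa_lower_bound(2, 9))  # 17
```

Increasing $p$ by one with $n$ fixed multiplies the bound (minus one) by $2^{n^{2}}$, which shows how quickly the gap with the PDA size grows.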

Let us now turn to upper bounds. We give a 2-step procedure computing, given a PDA, a Parikh-equivalent FSA. The steps are:

(i) translate the PDA into a language-equivalent context-free grammar [9]; and

(ii) translate the context-free grammar into a Parikh-equivalent finite state automaton [4].

Let us introduce the following definition. A grammar is in 2-1 normal form (2-1-NF for short) if each rule $(X,\alpha)\in R$ is such that $\alpha$ consists of at most one terminal and at most two variables. It is worth pointing out that, when the grammar is in 2-1-NF, the Parikh-equivalent FSA resulting from step (ii) has $\mathcal{O}(4^{n})$ states where $n$ is the number of grammar variables [4]. For the sake of simplicity, we will assume that grammars are in 2-1-NF, which holds when PDAs are in reduced form: every move is of the form $(q,X)\hookrightarrow_{b}(q^{\prime},\beta)$ with ${|{\beta}|}\leq 2$ and $b\in\Sigma\cup\{\varepsilon\}$ .
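The 2-1-NF condition is easy to test on a concrete grammar. The following is a minimal sketch, assuming a hypothetical encoding in which a rule is a pair (head, body) and every body symbol that is not a variable counts as a terminal.

```python
def is_two_one_nf(rules, variables):
    """Return True if every rule body has at most one terminal and two variables.

    rules: iterable of (head, body) pairs, body being a sequence of symbols.
    variables: set of grammar variables; other symbols are treated as terminals.
    """
    for _head, body in rules:
        n_vars = sum(1 for symbol in body if symbol in variables)
        n_terms = len(body) - n_vars
        if n_vars > 2 or n_terms > 1:
            return False
    return True


variables = {"S", "A"}
good = [("S", ["a", "A", "A"]), ("A", ["a"]), ("A", [])]
bad = [("S", ["a", "a"])]
print(is_two_one_nf(good, variables), is_two_one_nf(bad, variables))
```

A rule obtained from a move $(q,X)\hookrightarrow_{b}(q^{\prime},\beta)$ with ${|{\beta}|}\leq 2$ has at most one terminal ( $b$ ) and at most two triples in its body, which is why reduced-form PDAs give 2-1-NF grammars.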

Theorem 5.2

Given a PDA in reduced form with $n\geq 1$ states and $p\geq 1$ stack symbols, there is a Parikh-equivalent FSA with $\mathcal{O}(4^{n^{2}p})$ states.

Proof

The algorithm to convert a PDA with $n\geq 1$ states and $p\geq 1$ stack symbols into a CFG that generates the same language [9] uses at most $n^{2}p+1$ variables if $n>1$ (or $p$ if $n=1$ ). Given a CFG of $n$ variables in 2-1-NF, one can construct a Parikh-equivalent FSA with $\mathcal{O}(4^{n})$ states [4].

Given a PDA $P$ with $n\geq 1$ states and $p\geq 1$ stack symbols the conversion algorithm returns a language-equivalent CFG $G$ . Note that if $P$ is in reduced form, then the conversion algorithm returns a CFG in 2-1-NF. Then, apply to $G$ the known construction that builds a Parikh-equivalent FSA [4]. The resulting FSA has $\mathcal{O}(4^{n^{2}p})$ states. ∎
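Step (i) relies on the classical triple construction recalled in (1) and (2). The sketch below enumerates its variables and rules for a PDA given as a list of actions; the tuple encoding and the function name are assumptions made for illustration. Note that an action pushing $d$ symbols contributes $n^{d}$ rules, which is where the runtime blowup mentioned in Section 4.2 comes from.

```python
from itertools import product


def triple_construction(states, stack_symbols, actions, q0, Z0):
    """Enumerate variables and rules of the triple construction.

    Each action is a tuple (q, b, X, q2, beta): in state q with X on top of
    the stack, read b ("" for epsilon), go to q2 and push the tuple beta.
    A variable [q X r] is encoded as the Python tuple (q, X, r).
    """
    variables = {"S"} | {
        (q, X, r) for q in states for X in stack_symbols for r in states
    }
    rules = [("S", [(q0, Z0, q)]) for q in states]
    for q, b, X, q2, beta in actions:
        d = len(beta)
        for rs in product(states, repeat=d):  # all choices of r_1, ..., r_d
            sources = (q2,) + rs[:-1]         # q', r_1, ..., r_{d-1}
            head = (q, X, rs[-1]) if d else (q, X, q2)
            body = ([b] if b else []) + list(zip(sources, beta, rs))
            rules.append((head, body))
    return variables, rules


states = ["q0", "q1"]
actions = [
    ("q0", "a", "Z", "q0", ("X", "X")),
    ("q0", "a", "X", "q1", ()),
    ("q1", "", "X", "q1", ()),
]
variables, rules = triple_construction(states, ["Z", "X"], actions, "q0", "Z")
print(len(variables), len(rules))  # 9 8
```

With $n=2$ and $p=2$ the sketch produces $n^{2}p+1=9$ variables, matching the count used in the proof.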

Remark 2

Theorem 5.1 shows that every FSA that is Parikh-equivalent to $P(n,k)$ needs $\Omega({2^{n^{2}k}})$ states. On the other hand, Theorem 5.2 yields a Parikh-equivalent FSA with $O(4^{n^{2}(k+2n+4)})$ states. Thus, our construction is close to optimal when $n$ is in linear relation with respect to $k$ (see Remark 1), in the following sense: the upper bound is $O(4^{n^{2}(k+2n+4)})=O(2^{2n^{2}(k+2n+4)})$ for a lower bound of $2^{n^{2}k}$ , and $2n^{2}(k+2n+4)\in\Theta(n^{2}k)$ holds in that case.
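The comparison of exponents can be spelled out as follows. This is a sketch under the assumption that the linear relation is read as $n\leq c\,k$ for some constant $c\geq 1$ and that $k\geq 1$; the constant $c$ is introduced here for illustration only.

```latex
\begin{align*}
4^{n^{2}(k+2n+4)} &= 2^{2n^{2}(k+2n+4)}, \\
2n^{2}k \;\leq\; 2n^{2}(k+2n+4) &\leq 2n^{2}(k+2ck+4k) = 2(5+2c)\,n^{2}k .
\end{align*}
```

Hence the exponent of the upper bound lies within a constant factor of the exponent $n^{2}k$ of the lower bound.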

We conclude by discussing the reduced form assumption. Its role is to simplify the exposition and, indeed, it is not needed to prove correctness of the 2-step procedure. The assumption can be relaxed and bounds can be inferred. They will contain an additional parameter related to the length of the longest sequence of symbols pushed on the stack.

5.0.1 Acknowledgements

We thank Pedro Valero for pointing out the reference on smallest grammar problems [2]. We also thank the anonymous referees for their insightful comments and suggestions.

Appendix 0.A Appendix

0.A.1 Proof of Lemma 1

Proof

By induction on ${|{t}|}$ .

Base case. Since ${|{t}|}=1$ necessarily $t=a$ and $d(t)=0$ . Hence $1\geq 2^{0}$ .

Inductive case. Let $t={a}(a_{1}(\ldots),\ldots,a_{r}(\dots))$ with $r\geq 1$ . We study two cases. Suppose there is a unique subtree $t_{x}=a_{x}(\ldots)$ of $t$ with $x\in\{1,\ldots,r\}$ such that $d(t_{x})=d(t)$ . As ${|{t_{x}}|}<{|{t}|}$ , the induction hypothesis shows that ${|{t_{x}}|}\geq 2^{d(t_{x})}=2^{d(t)}$ , hence ${|{t}|}\geq 2^{d(t)}$ .

Next, let $r\geq 2$ and suppose there are at least two subtrees $t_{x}=a_{x}(\ldots)$ and $t_{y}=a_{y}(\ldots)$ of $t$ with $x,y\in\{1,\ldots,r\}$ and $x\neq y$ such that $d(t_{x})=d(t_{y})=d(t)-1$ . As ${|{t_{x}}|}<{|{t}|}$ , the induction hypothesis shows that ${|{t_{x}}|}\geq 2^{d(t_{x})}$ . Applying the same reasoning to $t_{y}$ we conclude from ${|{t}|}\geq{|{t_{x}}|}+{|{t_{y}}|}$ that ${|{t}|}\geq 2^{d(t_{x})}+2^{d(t_{y})}=2\cdot 2^{d(t)-1}=2^{d(t)}$ . ∎
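The inequality ${|{t}|}\geq 2^{d(t)}$ can be checked on small trees. The sketch below assumes that $d(t)$ is the dimension used in the two cases of the proof: a leaf has dimension $0$, and an inner node takes the maximum dimension of its children, plus one when that maximum is reached by at least two children. The nested-tuple encoding of trees is hypothetical.

```python
def size(t):
    """Number of nodes of a tree encoded as (label, list_of_children)."""
    _label, children = t
    return 1 + sum(size(c) for c in children)


def dimension(t):
    """Dimension d(t), under the definition assumed in the lead-in."""
    _label, children = t
    if not children:
        return 0
    dims = sorted((dimension(c) for c in children), reverse=True)
    if len(dims) >= 2 and dims[0] == dims[1]:
        return dims[0] + 1  # maximum reached by at least two children
    return dims[0]          # maximum reached by a unique child


leaf = ("a", [])
chain = ("a", [("a", [leaf])])
binary = ("a", [leaf, leaf])
full = ("a", [binary, binary])
for t in (leaf, chain, binary, full):
    assert size(t) >= 2 ** dimension(t)
print(size(full), dimension(full))  # 7 2
```

The chain shows that the bound can be far from tight: its size grows while its dimension stays $0$.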

0.A.2 Disassembly of Quasi-runs

A quasi-run with more than one move can be disassembled into its first move and subsequent quasi-runs. To this end, we need to introduce a few auxiliary definitions. Given a word $w\in\Sigma^{*}$ and an integer $i$ , define $w_{\mathit{sh}(i)}=(w)_{i+1}\cdots(w)_{i+{|{w}|}}$ . Intuitively, $w$ is shifted $i$ positions to the left if $i\geq 0$ and to the right otherwise. So given $i\geq 0$ , we will conveniently write $w_{\ll_{i}}$ for $w_{\mathit{sh}(i)}$ and $w_{\gg_{i}}$ for $w_{\mathit{sh}(-i)}$ . Moreover, set $w_{\ll}=w_{\ll_{1}}$ . For example, $a_{\ll_{1}}=a_{\gg_{1}}=\varepsilon$ , $abcde_{\ll_{3}}=de$ , $abcde_{\gg_{3}}=ab$ , $w=(w)_{1}\cdots(w)_{i}\,w_{\ll_{i}}$ and $w=w_{\gg_{i}}\,(w)_{{|{w}|}-i+1}\cdots(w)_{{|{w}|}}$ for $i>0$ .
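The shift notation corresponds to plain string slicing. The following sketch, with hypothetical function names, reproduces the examples given above for $i\geq 0$.

```python
def shift_left(w, i):
    """w shifted i >= 0 positions to the left: drops the first i symbols."""
    return w[i:]


def shift_right(w, i):
    """w shifted i >= 0 positions to the right: drops the last i symbols."""
    return w[:max(len(w) - i, 0)]


print(shift_left("abcde", 3), shift_right("abcde", 3))  # de ab
```

The two decompositions of $w$ stated above become `w == w[:i] + shift_left(w, i)` and `w == shift_right(w, i) + w[len(w) - i:]` for $0<i\leq{|{w}|}$ .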

Given an ID $I$ and $i>0$ define $I_{\gg_{i}}=(\mathit{state}(I),\mathit{tape}(I),\mathit{stack}(I)_{\gg_{i}})$ which, intuitively, removes from $I$ the $i$ bottom stack symbols.

Lemma 4 (from [7])

Let $r=I_{0}\vdash{\cdots}\vdash I_{n}$ be a quasi-run. Then we can disassemble $r$ into its first move $I_{0}\vdash I_{1}$ and $d={|{\mathit{stack}(I_{1})}|}$ quasi-runs $r_{1},\ldots,r_{d}$ each of which is such that

[TABLE]

where $p_{0}\leq p_{1}\leq\cdots\leq p_{d}$ are defined to be the least positions such that $p_{0}=1$ and $\mathit{stack}(I_{p_{i}})=\mathit{stack}(I_{p_{i-1}})_{\ll}$ for all $i$ . Also $n_{i}={|{\mathit{stack}(I_{p_{i}})}|}$ for all $i$ , that is $r_{i}$ is a quasi-run obtained by removing from the move sequence $I_{p_{i-1}}\vdash{\cdots}\vdash I_{p_{i}}$ the $n_{i}$ bottom stack symbols leaving the stack of $I_{p_{i}}$ empty and that of $I_{p_{i-1}}$ with one symbol only. Necessarily, $p_{d}=n$ and each quasi-run $r_{i}$ starts with $(\mathit{stack}(I_{1}))_{i}$ as its initial content.
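The positions $p_{0},\ldots,p_{d}$ and the quasi-runs $r_{1},\ldots,r_{d}$ can be computed directly from this description. The sketch below is illustrative only: it assumes that an ID is encoded as a pair (state, stack) with the stack a tuple whose first element is the top, it ignores the tape component, and it assumes the input is a quasi-run that ends with an empty stack. The example run is made up for the sketch and is not the one of Example 3.

```python
def disassemble(run):
    """Return the positions p_0..p_d and the quasi-runs r_1..r_d of a quasi-run.

    run: list of IDs (state, stack), stack being a tuple with the top first,
    such that run[0] |- run[1] |- ... |- run[n] and run[n] has an empty stack.
    """
    positions = [1]                              # p_0 = 1
    while run[positions[-1]][1]:                 # stack at p_{i-1} not empty
        prev = positions[-1]
        target = run[prev][1][1:]                # that stack shifted left by one
        positions.append(
            next(j for j in range(prev, len(run)) if run[j][1] == target)
        )
    pieces = []
    for lo, hi in zip(positions, positions[1:]):
        n_i = len(run[hi][1])                    # bottom symbols to remove
        pieces.append(
            [(state, stack[:len(stack) - n_i]) for state, stack in run[lo:hi + 1]]
        )
    return positions, pieces


run = [
    ("q0", ("Z",)),
    ("q0", ("X", "X")),
    ("q1", ("X",)),
    ("q1", ()),
]
positions, pieces = disassemble(run)
print(positions)  # [1, 2, 3]
```

Here $d=2$ and $p_{d}=3=n$ , and each piece starts with a single stack symbol and ends with an empty stack, as the lemma states.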

Example 3

Recall the PDA $P$ described in Example 1. Consider the quasi-run:

[TABLE]

We can disassemble $r$ into its first move $I_{0}\vdash I_{1}=(q_{0},X_{1})\vdash(q_{0},X_{0}\,X_{0})$ and $d=2$ quasi-runs $r_{1},r_{2}$ such that

[TABLE]

Note that for each quasi-run $r_{i}\,(i=1,2)$ , the stack of $(I_{p_{i}})_{\gg_{n_{i}}}$ is empty and that of $(I_{p_{i-1}})_{\gg_{n_{i}}}$ contains one symbol only. Also, $p_{d}=p_{2}=n=11$ and each $r_{i}$ starts with $(\mathit{stack}(I_{1}))_{i}$ as its initial content.

0.A.3 Assembly of Quasi-runs

Now we show how to assemble a quasi-run from a given action and a list of quasi-runs. We need the following notation: given $I$ and $w\in\Gamma^{*}$ , define $I\bullet w=(\mathit{state}(I),\mathit{stack}(I)\,w)$ .

Lemma 5

Let $a=(q,X)\hookrightarrow(q^{\prime},\beta_{1}\ldots\beta_{d})$ be an action and $r_{1},\ldots,r_{d}$ be $d\geq 0$ quasi-runs with $r_{i}=I^{i}_{0}\vdash I^{i}_{1}\vdash{\cdots}\vdash I^{i}_{n_{i}}$ for all $i$ , such that

• the first action of $r_{i}$ pops $\beta_{i}$ for every $i$ ;

• the target state of the last action of $r_{i}$ ( $a$ when $i=0$ ) is the source state of the first action of $r_{i+1}$ for all $i\in\{0,\ldots,d-1\}$ .

Then there exists a quasi-run $r$ given by

[TABLE]

0.A.4 Proof of Theorem 3.1

Proof

To prove the existence of a one-to-one correspondence we show that:

1. Each quasi-run must be paired with at least one actree, and vice versa.

2. No quasi-run may be paired with more than one actree, and vice versa.

First, given a quasi-run $r$ of $P$ , define a tree $t$ inductively on the length of $r$ . We prove at the same time that (4) holds for $t$ which we show is an actree.

[TABLE]

For the base case, necessarily $r=I_{0}\vdash I_{1}$ and we define $t$ as the leaf labeled by the action $I_{0}\hookrightarrow I_{1}$ . Clearly, $t$ satisfies (4), hence $t$ is an actree ( $t$ trivially verifies Definition 1).

Now consider the case $r=I_{0}\vdash I_{1}\vdash{\cdots}\vdash I_{n}$ where $n>1$ . We define $t$ as follows. Lemma 4 shows that $r$ disassembles into its first action $a$ and $d={|{\mathit{stack}(I_{1})}|}\geq 1$ quasi-runs $r_{1},\ldots,r_{d}$ . The action $a$ labels the root of $t$ which has $d$ children $t_{1}$ to $t_{d}$ . The subtrees $t_{1}$ to $t_{d}$ are defined by applying the induction hypothesis to the quasi-runs $r_{1}$ to $r_{d}$ , respectively. From the induction hypothesis, $t_{1}$ to $t_{d}$ are actrees and each sequence ${\overline{t_{i}}}$ coincides with the sequence of actions of the quasi-run $r_{i}$ . Moreover, Lemma 4 shows that the state of the last ID of $r_{i}$ coincides with the state of the first ID of $r_{i+1}$ for all $i\in\{1,\ldots,d-1\}$ . Also, the state of the first ID of $r_{1}$ coincides with the state of $I_{1}$ . We conclude from the above that $t$ satisfies (4) and that $t$ is an actree since it verifies Definition 1.

Second, given an actree $t$ of $P$ , we define a move sequence $r$ inductively on the height of $t$ . We prove at the same time that (4) holds for $r$ which we show is a quasi-run. For the base case, we assume $h(t)=0$ . Then, the root of $t$ is a leaf labeled by an action $a=I_{0}\hookrightarrow I_{1}$ and we define $r=I_{0}\vdash I_{1}$ . Clearly, $r$ satisfies (4) and is a quasi-run.

Now, assume that $t$ has $d$ children $t_{1}$ to $t_{d}$ , we define $r$ as follows. By the induction hypothesis, each subtree $t_{i}$ for all $i\in\{1,\dots,d\}$ defines a quasi-run $r_{i}$ verifying (4). The definition of actree shows that the root of $t$ pushes $\beta_{1}$ to $\beta_{d}$ which are popped by its $d$ children. By induction hypothesis each $r_{i}$ for all $i\in\{1,\dots,d\}$ thus starts by popping $\beta_{i}$ . Next it follows from the induction hypothesis and the definition of actree that the target state of the action given by the last move of $r_{i}$ coincides with the source state of the action given by the first move of $r_{i+1}$ for all $i\in\{1,\ldots,d-1\}$ . Moreover, the target state of $a$ coincides with the source state of the action given by the first move of $r_{1}$ . Thus, applying Lemma 5 to the action given by the root of $t$ and $r_{1},\ldots,r_{d}$ yields the quasi-run $r$ that satisfies (4) following our previous remarks.

First, we prove that no quasi-run may be paired with more than one actree. The proof goes by contradiction. Given a move sequence $I_{0}\vdash\ldots\vdash I_{n}$ , define its sequence of actions $a_{1}\ldots a_{n}$ such that the move $I_{i}\vdash I_{i+1}$ is given by the action $a_{i+1}$ , for all $i$ . Note that two quasi-runs $r=I_{0}\vdash\ldots\vdash I_{n}$ and $r^{\prime}=I^{\prime}_{0}\vdash\ldots\vdash I^{\prime}_{m}$ are equal iff their sequences of actions coincide.

Suppose that given the actrees $t$ and $t^{\prime}$ with $t\neq t^{\prime}$ , there exist two quasi-runs $r$ and $r^{\prime}$ such that $r$ is paired with $t$ and $r^{\prime}$ is paired with $t^{\prime}$ , under the relation we described in part 1 of this proof, and $r=r^{\prime}$ . Let ${\overline{t}}=a_{1},\ldots,a_{n}$ and ${\overline{t^{\prime}}}=a^{\prime}_{1},\ldots,a^{\prime}_{m}$ . Let $p\in\{1,\dots,\min(n,m)\}$ be the least position in both sequences such that $a_{p}\neq a^{\prime}_{p}$ . By (4), the sequences of actions of $r$ and $r^{\prime}$ also differ at position $p$ (at least). Thus, $r\neq r^{\prime}$ (contradiction).

Second, we prove that no actree may be paired with more than one quasi-run. Again, we give a proof by contradiction.

Suppose that given the quasi-runs $r$ and $r^{\prime}$ with $r\neq r^{\prime}$ , there exist two actrees $t$ and $t^{\prime}$ such that $t$ is paired with $r$ and $t^{\prime}$ is paired with $r^{\prime}$ , under the relation we described in part 1. of the proof, and $t=t^{\prime}$ . We rely on the standard definition of equality between labeled trees.

Suppose $a_{1}\ldots a_{n}$ is the sequence of actions of $r$ and $a^{\prime}_{1}\ldots a^{\prime}_{m}$ is the sequence of actions of $r^{\prime}$ . Let $p\in\{1,\dots,\min(n,m)\}$ be the least position such that $a_{p}\neq a^{\prime}_{p}$ . By (4), ${\overline{t}}$ and ${\overline{t^{\prime}}}$ also differ at position $p$ (at least). Then, $t\neq t^{\prime}$ (contradiction).

∎

0.A.5 Example: Accepting Actree of $P(2,1)$

We give a graphical depiction of the accepting actree $t$ of $P(2,1)$ . Recall that $P(2,1)$ corresponds to the member of the family $P(n,k)$ that has $2$ states $q_{0}$ and $q_{1}$ , and $9$ stack symbols $S,X_{0},X_{1},s_{0},s_{1},r_{0},r_{1},\star$ and $\$$ . Figure 2 represents $t$ , which has been split for layout reasons.

Bibliography

[1] A. Bouajjani, J. Esparza, and O. Maler. Reachability analysis of pushdown automata: Application to model-checking. In CONCUR, pages 135–150. Springer, 1997.
[2] M. Charikar, E. Lehman, D. Liu, R. Panigrahy, M. Prabhakaran, A. Sahai, and A. Shelat. The smallest grammar problem. IEEE Transactions on Information Theory, 51(7):2554–2576, 2005.
[3] D. Chistikov and R. Majumdar. Unary pushdown automata and straight-line programs. In ICALP, pages 146–157. Springer, 2014.
[4] J. Esparza, P. Ganty, S. Kiefer, and M. Luttenberger. Parikh’s theorem: A simple and direct automaton construction. IPL, pages 614–619, 2011.
[5] J. Esparza, M. Luttenberger, and M. Schlund. A brief history of Strahler numbers. In LATA, pages 1–13. Springer, 2014.
[6] A. Finkel, B. Willems, and P. Wolper. A direct symbolic approach to model checking pushdown systems (extended abstract). Electronic Notes in Theoretical Computer Science, 9:27–37, 1997.
[7] P. Ganty and D. Valput. Bounded-oscillation pushdown automata. EPTCS, pages 178–197, 2016. GandALF.
[8] J. Goldstine, J. K. Price, and D. Wotschke. A pushdown automaton or a context-free grammar: Which is more economical? Theoretical Computer Science, pages 33–40, 1982.
