Parikh Image of Pushdown Automata
Pierre Ganty, Elena Guti\'errez

TL;DR
This paper analyzes the complexity of converting pushdown automata to context-free grammars and finite automata, establishing optimal bounds and exploring Parikh equivalence, especially for unary automata and deterministic cases.
Contribution
It proves the optimality of classical PDA-to-grammar conversion algorithms for unary automata and introduces a new efficient conversion for unary deterministic PDAs.
Findings
Conversion algorithm is optimal for unary PDAs.
Parikh equivalence simplifies automata comparison.
New bounds for unary deterministic PDA to grammar conversion.
Abstract
We compare pushdown automata (PDAs for short) against other representations. First, we show that there is a family of PDAs over a unary alphabet with states and stack symbols that accepts one single long word for which every equivalent context-free grammar needs variables. This family shows that the classical algorithm for converting a PDA to an equivalent context-free grammar is optimal even when the alphabet is unary. Moreover, we observe that language equivalence and Parikh equivalence, which ignores the ordering between symbols, coincide for this family. We conclude that, when assuming this weaker equivalence, the conversion algorithm is also optimal. Second, Parikh's theorem motivates the comparison of PDAs against finite state automata. In particular, the same family of unary PDAs gives a lower bound on the number of states of every…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
11institutetext: IMDEA Software Institute, Madrid, Spain 22institutetext: Universidad Politécnica de Madrid, Spain 22email: {pierre.ganty,elena.gutierrez}@imdea.org
Parikh Image of Pushdown Automata
Pierre Ganty*,* Pierre Ganty has been supported by the Madrid Regional Government project S2013/ICE-2731, N-Greens Software - Next-GeneRation Energy-EfficieNt Secure Software, and the Spanish Ministry of Economy and Competitiveness project No. TIN2015-71819-P, RISCO - RIgorous analysis of Sophisticated COncurrent and distributed systems.11
Elena Gutiérrez*,* Elena Gutiérrez is partially supported by BES-2016-077136 grant from the Spanish Ministry of Economy, Industry and Competitiveness.1122
Abstract
We compare pushdown automata (PDAs for short) against other representations. First, we show that there is a family of PDAs over a unary alphabet with states and stack symbols that accepts one single long word for which every equivalent context-free grammar needs variables. This family shows that the classical algorithm for converting a PDA to an equivalent context-free grammar is optimal even when the alphabet is unary. Moreover, we observe that language equivalence and Parikh equivalence, which ignores the ordering between symbols, coincide for this family. We conclude that, when assuming this weaker equivalence, the conversion algorithm is also optimal. Second, Parikh’s theorem motivates the comparison of PDAs against finite state automata. In particular, the same family of unary PDAs gives a lower bound on the number of states of every Parikh-equivalent finite state automaton. Finally, we look into the case of unary deterministic PDAs. We show a new construction converting a unary deterministic PDA into an equivalent context-free grammar that achieves best known bounds.
1 Introduction
Given a context-free language which representation, pushdown automata or context-free grammars, is more concise? This was the main question studied by Goldstine et al. [8] in a paper where they introduced an infinite family of context-free languages whose representation by a pushdown automaton is more concise than by context-free grammars. In particular, they showed that each language of the family is accepted by a pushdown automaton with states and stack symbols, but every context-free grammar needs at least variables if ( if ). Incidentally, the family shows that the translation of a pushdown automaton into an equivalent context-free grammar used in textbooks [9], which uses the same large number of variables if ( if ), is optimal in the sense that there is no other algorithm that always produces fewer grammar variables.
Today we revisit these questions but this time we turn our attention to the unary case. We define an infinite family of context-free languages as Goldstine et al. did but our family differs drastically from theirs. Given and , each member of our family is given by a PDA with states, stack symbols and a single input symbol.111Their family has an alphabet of non-constant size. We show that, for each PDA of the family, every equivalent context-free grammar has variables. Therefore, this family shows that the textbook translation of a PDA into a language-equivalent context-free grammar is optimal222In a sense that we will precise in Section 4 (Remark 1). even when the alphabet is unary. Note that if the alphabet is a singleton, equality over words (two words are equal if the same symbols appear at the same positions) coincides with Parikh equivalence (two words are Parikh-equivalent if each symbol occurs equally often in both words333But not necessarily at the same positions, e.g. and are Parikh-equivalent.). Thus, we conclude that the conversion algorithm is also optimal for Parikh equivalence. We also investigate the special case of deterministic PDAs over a singleton alphabet for which equivalent context-free grammar representations of small size had been defined [3, 10]. We give a new definition for an equivalent context-free grammar given a unary deterministic PDA. Our definition is constructive (as far as we could tell the result of Pighizzini [10] is not) and achieves the best known bounds [3] by combining two known constructions.
Parikh’s theorem [11] states that every context-free language has the same Parikh image as some regular language. This allows us to compare PDAs against finite state automata (FSAs for short) for Parikh-equivalent languages. First, we use the same family of PDAs to derive a lower bound on the number of states of every Parikh-equivalent FSA. The comparison becomes simple as its alphabet is unary and it accepts one single word. Second, using this lower bound we show that the 2-step procedure chaining existing constructions:
(i) translate the PDA into a language-equivalent context-free grammar [9]; and
(ii) translate the context-free grammar into a Parikh-equivalent FSA [4]
yields optimal444In a sense that we will precise in Section 5 (Remark 2). results in the number of states of the resulting FSA.
As a side contribution, we introduce a semantics of PDA runs as trees that we call actrees. The richer tree structure (compared to a sequence) makes simpler to compare each PDA of the family with its smallest grammar representation.
Structure of the paper.
After preliminaries in Section 2 we introduce the tree-based semantics in 3. In Section 4 we compare PDAs and context-free grammars when they represent Parikh-equivalent languages. We will define the infinite family of PDAs and establish their main properties. We dedicate Section 4.2 to the special case of deterministic PDAs over a unary alphabet. Finally, Section 5 focuses on the comparison of PDAs against finite state automata for Parikh-equivalent languages.
2 Preliminaries
A pushdown automaton (or PDA) is a 6-tuple where is a finite nonempty set of states including , the initial state; is the input alphabet; is the stack alphabet including , the initial stack symbol; and is a finite subset of called the actions. We write to denote an action . We sometimes omit the subscript to the arrow.
An instantaneous description (or ID) of a PDA is a pair where and . We call the first component of an ID the state and the second the stack content. The initial ID consists of the initial state and the initial stack symbol for the stack content. When reasoning formally, we use the functions and which, given an ID, returns its state and stack content, respectively.
An action is enabled at ID if and .555 is the -th symbol of if ; else . is the length of . Given an ID enabling , define the successor ID to be . We denote this fact as , and call it a move that consumes from the input.666When the move does not consume input. We sometimes omit the subscript of when the input consumed (if any) is not important. Given , a move sequence, denoted , is a finite sequence of IDs such that for all . The move sequence consumes (from the input) when . We concisely denote this fact as . A move sequence is a quasi-run when and ; and a run when, furthermore, is the initial ID. Define the language of a PDA as .
The Parikh image of a word over an alphabet , denoted by , is the vector such that is the number of occurrences of in . The Parikh image of a language , denoted by , is the set of Parikh images of its words. When , we say and are Parikh-equivalent.
We assume the reader is familiar with the basics of finite state automata (or FSA for short) and context-free grammars (or CFG). Nevertheless we fix their notation as follows. We denote a FSA as a tuple where is a finite set of states including the initial state and the final states ; is the input alphabet and is the set of transitions. We denote a CFG as a tuple where is a finite set of variables including the start variable, is the alphabet or set of terminals and is a finite set of rules. Rules are conveniently denoted . Given a FSA and a CFG we denote their languages as and , respectively.
Finally, let us recall the translation of a PDA into an equivalent CFG.
Given a PDA , define the CFG where
- •
The set of variables — often called the triples — is given by
[TABLE]
- •
The set of production rules is given by
[TABLE]
For a proof of correctness, see the textbook of Ullman et al. [9]. The previous definition easily translates into a conversion algorithm. Observe that the runtime of such algorithm depends polynomially on and , but exponentially on .
3 A Tree-Based Semantics for Pushdown Automata
In this section we introduce a tree-based semantics for PDA. Using trees instead of sequences sheds the light on key properties needed to present our main results.
Given an action denoted by , is the source state of , the target state of , the symbol pops and the (possibly empty) sequence of symbols pushes.
A labeled tree is a finite tree whose nodes are labeled, where is the label of the root and are labeled trees, the children of the root. When we prefer to write instead of . Each labeled tree defines a sequence, denoted , obtained by removing the symbols ‘(’, ‘)’ or ‘,’ when interpreting as a string, e.g. . The size of a labeled tree , denoted , is given by . It coincides with the number of nodes in .
Definition 1
Given a PDA , an action-tree (or actree for short) is a labeled tree where is an action of pushing with and each children is an actree such that pops for all . Furthermore, an actree must satisfy that the source state of and the target state of coincide for every .
An actree consumes an input resulting from replacing each action in the sequence by the symbol it consumes (or , if the action does not consume any). An actree is accepting if the initial ID enables .
Example 1
Consider a PDA with actions to respectively given by , , ,
and . The reader can check that the actree , depicted in Figure 1, satisfies the conditions of Definition 1 where , and the input consumed is .
We recall the notion of dimension of a labeled tree [5] and we relate dimension and size of labeled trees in Lemma 1.
Definition 2
The dimension of a labeled tree , denoted as , is inductively defined as follows. if , otherwise we have for some and
[TABLE]
Example 2
The annotation shows the actree of Example 1 has dimension
[TABLE]
Lemma 1
* for every labeled tree .*
The proof of the lemma is given in the Appendix. The actrees and the quasi-runs of a PDAs are in one-to-one correspondence as reflected in Theorem 3.1 whose proof is in the Appendix.
Theorem 3.1
Given a PDA, its actrees and quasi-runs are in a one-to-one correspondence.
4 Parikh-Equivalent Context-free Grammars
In this section we compare PDAs against CFGs when they describe Parikh-equivalent languages. We first study the general class of (nondeterministic) PDAs and, in Section 4.2, we look into the special case of unary deterministic PDAs.
We prove that, for every and , there exists a PDA with states and stack symbols for which every Parikh-equivalent CFG has variables. To this aim, we present a family of PDAs where and . Each member of the family has states and stack symbols, and accepts one single word over a unary input alphabet.
4.1 The Family of PDAs
Definition 3
Given natural values and , define the PDA with states , input alphabet , stack alphabet \Gamma=\{S,\,\star,\,\}\cup{X_{i}\mid 0\leq i\leq k}\cup{s_{i}\mid 0\leq i\leq n-1}\cup{r_{i}\mid 0\leq i\leq n-1}q_{0}S\delta$
[TABLE]
Lemma 2
Given and , has a single accepting actree consuming input where .
Proof
Fix values and and refer to the member of the family as . We show that has exactly one accepting actree. We define a witness labeled tree inductively on the structure of the tree. Later we will prove that the induction is finite. First, we show how to construct the root and its children subtrees. This corresponds to case 1 below. Then, each non-leaf subtree is defined inductively in cases 2 to 5. Note that each non-leaf subtree of falls into one (and only one) of the cases. In fact, all cases are disjoint, in particular 2, 4 and 5. The reverse is also true: all cases describe a non-leaf subtree that does occur in . Finally, we show that each case describes uniquely how to build the next layer of children subtrees of a given non-leaf subtree.
where and and are of the form:
[TABLE]
Note that the initial ID enables which is the only action of with this property. Note also that holds, where . 2. 2.
Each subtree whose root is labeled with and has the form where
[TABLE]
Assume for now that is unique. Therefore, as the 1st and 4th child of share the same label , they also root the same subtree. Thus, it holds ()
[TABLE] 3. 3.
Each subtree whose root is labeled a=(q_{i},X_{0})\hookrightarrow_{b}(q_{i+1},X_{k}\,\)i\in{0,\dots,n-2}a(a_{1}(\ldots),a_{2})$ where
[TABLE]
Note that holds, where . 4. 4.
Each subtree whose root is labeled with and has the form
[TABLE]
where
[TABLE]
Assume is given by the action (q_{i},X_{0})\hookrightarrow_{b}(q_{i+1},X_{k}\,\)$q_{n-1}a_{2}m<n-1$.
Again, assume for now that is unique. Hence, as the 1st and 4th child of are both labeled by , they root the same subtree. Thus, it holds ()
[TABLE] 5. 5.
Each subtree whose root is labeled with has the form where
[TABLE]
For both cases ( and ), assume is given by instead. Then, the action popping must end up in the state in order to enable , i.e., it must be of the form . Hence the action popping must be of the form where necessarily , a contradiction (the stack symbol is not defined in ).
Assume for now that is unique. Then, as the 1st and 4th child of are labeled by , they root the same subtree (possibly a leaf). Thus, it holds ()
[TABLE]
We now prove that is finite by contradiction. Suppose is an infinite tree. König’s Lemma shows that has thus at least one infinite path, say , from the root. As the set of labels of is finite then some label must repeat infinitely often along . Let us define a strict partial order between the labels of the non-leaf subtrees of . We restrict to the non-leaf subtrees because no infinite path contains a leaf subtree. Let and be two non-leaf subtrees of . Let be the source state of and be the target state of the last action in the sequence . Define similarly for . Let be the symbol that pops and be the symbol that pops. Define iff
(a) either ,
(b) or and ,
(c) or and .
First, note that the label of the root of (case 1) only occurs in the root as there is no action of pushing . Second, relying on cases 2 to 5, we observe that every pair of non-leaf subtrees and (excluding the root) such that is the parent node of verifies . Using the transitive property of the strict partial order , we conclude that everypair of subtrees and in such that verifies . Therefore, no repeated variable can occur in (contradiction). We conclude that is finite.
The reader can observe that verifies all conditions of the definition of actree (Definition 1) and the initial ID enables , thus it is an accepting actree of . Since we also showed that no other tree can be defined using the actions of , is unique.
Finally, we give a lower bound on the length of the word consumed by . To this aim, we prove that . Then since all actions consume input symbol , Lemma 1 shows that the word consumed is such that .
Note that, if a subtree of verifies case or , its dimension remains the same w.r.t. its children subtrees. Otherwise, the dimension always grows. Recall that all cases from 1 to 5 describe a set of labels that does occur in . Also, as is unique, no path from the root to a leaf repeats a label. Thus, to compute the dimension of is enough to count the number of distinct labels of that are included in cases , and , which is equivalent to compute the size of the set
[TABLE]
Clearly from which we conclude that . Hence, and therefore consumes a word where since each action of consumes a . ∎
The reader can find in the Appendix a depiction of the accepting actree corresponding to .
Theorem 4.1
For each and , there is a PDA with states and stack symbols for which every Parikh-equivalent CFG has variables.
Proof
Consider the family of PDAs with and described in Definition 3. Fix and and refer to the corresponding member of the family as .
First, Lemma 2 shows that consists of a single word with . It follows that a language is Parikh-equivalent to iff is language-equivalent to .
Let be a CFG such that . The smallest CFG that generates exactly one word of length has size [2, Lemma 1], where the size of a grammar is the sum of the length of all the rules. It follows that is of size . As , then has size . We conclude that has variables. ∎
Remark 1
According to the classical conversion algorithm, every CFG that is equivalent to needs at least variables. On the other hand, Theorem 4.1 shows that a lower bound for the number of variables is . We observe that, as long as for some positive constant , the family shows that the conversion algorithm is optimal 777Note that if for some then the addend in becomes negligible compared to , and the lower and upper bound coincide. in the number of variables when assuming both language and Parikh equivalence. Otherwise, the algorithm is not optimal as there exists a gap between the lower bound and the upper bound. For instance, if then the upper bound is while the lower bound is .
4.2 The Case of Unary Deterministic Pushdown Automata
We have seen that the classical translation from PDA to CFG is optimal in the number of grammar variables for the family of unary nondeterministic PDA when is in linear relation with respect to (see Remark 1). However, for unary deterministic PDA (UDPDA for short) the situation is different. Pighizzini [10] shows that for every UDPDA with states and stack symbols, there exists an equivalent CFG with at most variables. Although he gives a definition of such a grammar, we were not able to extract an algorithm from it. On the other hand, Chistikov and Majumdar [3] give a polynomial time algorithm that transforms a UDPDA into an equivalent CFG going through the construction of a pair of straight-line programs. The size of the resulting CFG is linear in that of the UDPDA.
We propose a new polynomial time algorithm that converts a UDPDA with states and stack symbols into an equivalent CFG with variables. Our algorithm is based on the observation that the conversion algorithm from PDAs to CFGs need not consider all the triples in (1). We discard unnecessary triples using the saturation procedure [1, 6] that computes the set of reachable IDs.
For a given PDA with and , define the set of reachable IDs as follows:
[TABLE]
Lemma 3
If is a UDPDA then the set has at most one element for every state and stack symbol .
Proof
Let be a UDPDA with . Since is deterministic we have that for every and , and, for every and , if then for every .
The proof goes by contradiction. Assume that for some state and stack symbol , there are two IDs and in such that and .
Necessarily, there exists three IDs , and with such that the following holds:
[TABLE]
It is routine to check that if then is not deterministic, a contradiction. Next, we consider the case . When and are symbols, because is a unary DPDA, then they are the same, a contradiction. Else if either or is then is not deterministic, a contradiction. We conclude from the previous that when , then necessarily and therefore that the set has at most one element. ∎
Intuitively, Lemma 3 shows that, when fixing and , there is at most one such that the triple generates a string of terminals. We use this fact to prove the following theorem.
Theorem 4.2
For every UDPDA with states and stack symbols, there is a polynomial time algorithm that computes an equivalent CFG with at most variables.
Proof
The conversion algorithm translating a PDA to a CFG computes the set of grammar variables . By Lemma 3, for each and there is at most one variable in the previous set generating a string of terminals. The consequence of the lemma is twofold:
(i) For the triples it suffices to compute the subset of the aforementioned generating variables. Clearly, .
(ii) Each action of now yields a single rule in . This is because in (2) there is at most one choice for to , hence we avoid the exponential blowup of the runtime in the conversion algorithm.
To compute given , we use the polynomial time saturation procedure [1, 6] which given computes a FSA for the set . Then we compute from this set the unique state (if any) such that , hence . From the above we find that, given , we compute in polynomial time. ∎
Up to this point, we have assumed the empty stack as the acceptance condition. For general PDA, assuming final states or empty stack as acceptance condition induces no loss of generality. The situation is different for deterministic PDA where accepting by final states is more general than empty stack. For this reason, we contemplate the case where the UDPDA accepts by final states. Theorem 4.3 shows how our previous construction can be modified to accommodate the acceptance condition by final states.
Theorem 4.3
For every UDPDA with states and stack symbols that accepts by final states, there is a polynomial time algorithm that computes an equivalent CFG with variables.
Proof
Let be a UDPDA with states and stack symbols that accepts by final states. We first translate 888The set of final states is given by . into a (possibly nondeterministic) unary pushdown automaton with an empty stack acceptance condition. In particular, ; ; and is given by
[TABLE]
The new stack symbol is to prevent from incorrectly accepting when is in a nonfinal state with an empty stack. The state is to empty the stack upon entering a final state. Observe that need not be deterministic. Also, it is routine to check that and is computable in time linear in the size of . Now let us turn to . For a weaker version of Lemma 3 holds: the set has at most two elements for every state and stack symbols . This is because if contains two IDs then necessarily one of them has for state.
Based on this result, we construct as in Theorem 4.2, but this time we have that is .
Now we turn to the set of production rules as defined in (2) (see Section 2). We show that each action of yields at most production rules in where . For each state in (2) we have two choices, one of which is . We also know that once a move sequence enters it cannot leave it. Therefore, we have that if then . Given an action, it thus yields production rules one where , another where , …, etc. Hence, we avoid the exponential blowup of the runtime in the conversion algorithm.
The remainder of the proof follows that of Theorem 4.2. ∎
5 Parikh-Equivalent Finite State Automata
Parikh’s theorem [11] shows that every context-free language is Parikh-equivalent to a regular language. Using this result, we can compare PDAs against FSAs under Parikh equivalence. We start by deriving some lower bound using the family . Because its alphabet is unary and it accepts a single long word, the comparison becomes straightforward.
Theorem 5.1
For each and , there is a PDA with states and stack symbols for which every Parikh-equivalent FSA has at least states.
Proof
Consider the family of PDAs with and described in Definition 3. Fix and and refer to the corresponding member of the family as . By Lemma 2, with . Then, the smallest FSA that is Parikh-equivalent to needs states. As , we conclude that the smallest Parikh-equivalent FSA has at least states. ∎
Let us now turn to upper bounds. We give a 2-step procedure computing, given a PDA, a Parikh-equivalent FSA. The steps are:
(i) translate the PDA into a language-equivalent context-free grammar [9]; and
(ii) translate the context-free grammar into a Parikh-equivalent finite state automaton [4].
Let us introduce the following definition. A grammar is in 2-1 normal form (2-1-NF for sort) if each rule is such that consists of at most one terminal and at most two variables. It is worth pointing that, when the grammar is in 2-1-NF, the resulting Parikh-equivalent FSA from step (ii) has states where is the number of grammar variables [4]. For the sake of simplicity, we will assume that grammars are in 2-1-NF which holds when PDAs are in reduced form: every move is of the form with and .
Theorem 5.2
Given a PDA in reduced form with states and stack symbols, there is a Parikh-equivalent FSA with states.
Proof
The algorithm to convert a PDA with states and stack symbols into a CFG that generates the same language [9] uses at most variables if (or if ). Given a CFG of variables in 2-1-NF, one can construct a Parikh-equivalent FSA with states [4].
Given a PDA with states and stack symbols the conversion algorithm returns a language-equivalent CFG . Note that if is in reduced form, then the conversion algorithm returns a CFG in 2-1-NF. Then, apply to the known construction that builds a Parikh-equivalent FSA [4]. The resulting FSA has states. ∎
Remark 2
Theorem 5.1 shows that a every FSA that is Parikh-equivalent to needs states. On the other hand, Theorem 5.2 shows that the number of states of every Parikh-equivalent FSA is . Thus, our construction is close to optimal999As the blow up of our construction is for a lower bound of , we say that it is close to optimal in the sense that , which holds when is in linear relation with respect to (see Remark 1). when is in linear relation with respect to .
We conclude by discussing the reduced form assumption. Its role is to simplify the exposition and, indeed, it is not needed to prove correctness of the 2-step procedure. The assumption can be relaxed and bounds can be inferred. They will contain an additional parameter related to the length of the longest sequence of symbols pushed on the stack.
5.0.1 Acknowledgements
We thank Pedro Valero for pointing out the reference on smallest grammar problems [2]. We also thank the anonymous referees for their insightful comments and suggestions.
Appendix 0.A Appendix
0.A.1 Proof of Lemma 1
Proof
By induction on .
Base case. Since necessarily and . Hence .
Inductive case. Let with . We study two cases. Suppose there is a unique subtree of with such that . As , the induction hypothesis shows that , hence .
Next, let and suppose there are at least two subtrees and of with and such that . As , the induction hypothesis shows that . Applying the same reasoning to we conclude from that . ∎
0.A.2 Disassembly of Quasi-runs
A quasi-run with more than one move can be disassembled into its first move and subsequent quasi-runs. To this end, we need to introduce a few auxiliary definitions. Given a word and an integer , define . Intuitively, is shifted positions to the left if and to the right otherwise. So given , we will conveniently write for and for . Moreover, set . For example, , , , and for .
Given an ID and define which, intuitively, removes from the bottom stack symbols.
Lemma 4 (from [7])
Let , be a quasi-run. Then we can disassemble into its first move and quasi-runs each of which is such that
[TABLE]
where are defined to be the least positions such that and for all . Also for all , that is is a quasi-run obtained by removing from the move sequence the bottom stack symbols leaving the stack of empty and that of with one symbol only. Necessarily, and each quasi-run starts with as its initial content.
Example 3
Recall the PDA described in Example 1. Consider the quasi-run:
[TABLE]
We can dissasemble into its first move and quasi-runs such that
[TABLE]
Note that for each quasi-run , the stack of is empty and that of contains one symbol only. Also, and each starts with as its initial content.
0.A.3 Assembly of Quasi-runs
Now we show how to assemble a quasi-run from a given action and a list of quasi-runs. We need the following notation: given and , define .
Lemma 5
Let be an action and be quasi-runs with for all , such that
- •
the first action of pops for every ;
- •
the target state of last action of (* when ) is the source state of first action of for all .*
Then there exists a quasi-run given by
[TABLE]
0.A.4 Proof of Theorem 3.1
Proof
To prove the existence of a one-to-one correspondence we show that:
Each quasi-run must be paired with at least one actree, and viceversa. 2. 2.
No quasi-run may be paired with more than one actree, and viceversa.
- First, given a quasi-run of , define a tree inductively on the length of . We prove at the same time that (4) holds for which we show is an actree.
[TABLE]
For the base case, necessarily and we define as the leaf labeled by the action . Clearly, satisfies (4), hence is an actree ( trivially verifies Definition 1).
Now consider the case where , we define as follows. Lemma 4 shows disassembles into its first action and quasi-runs . The action labels the root of which has children to . The subtrees to are defined applying the induction hypothesis on the quasi-runs to , respectively. From the induction hypothesis, to are actrees and each sequence coincide with the sequence of actions of the quasi-run . Moreover, Lemma 4 shows that the state of the last ID of coincides with the state of the first ID of for all . Also, the state of the first ID of coincides with the state of . We conclude from above that satisfies (4) and that is an actree since it verifies Definition 1.
Second, given an actree of , we define a move sequence inductively on the height of . We prove at the same time that (4) holds for which we show is a quasi-run. For the base case, we assume . Then, the root of is a leaf labeled by an action and we define . Clearly, satisfies (4) and is a quasi-run.
Now, assume that has children to , we define as follows. By the induction hypothesis, each subtree for all defines a quasi-run verifying (4). The definition of actree shows that the root of pushes to which are popped by its children. By induction hypothesis each for all thus starts by popping . Next it follows from the induction hypothesis and the definition of actree that the target state of the action given by the last move of coincides with the source state of the action given by the first move of for all . Moreover, the target state of coincides with the source state of the action given by the first move of . Thus, applying Lemma 5 to the action given by the root of and yields the quasi-run that satisfies (4) following our previous remarks.
- First, we prove that no quasi-run may be paired with more than one actree. The proof goes by contradiction. Given a move sequence , define its sequence of actions such that the move is given by the action , for all . Note that two quasi-runs and are equal iff their sequences of actions coincide.
Suppose that given the actrees and with , there exist two quasi-runs and such that is paired with and is paired with , under the relation we described in part 1. of this proof, and . Let and . Let be the least position in both sequences such that . By (4), the sequences of actions of and also differ at position (at least). Thus, (contradiction).
Second, we prove that no actree may be paired with more than one quasi-run. Again, we give a proof by contradiction.
Suppose that given the quasi-runs and with , there exist two actrees and such that is paired with and is paired with , under the relation we described in part 1. of the proof, and . We rely on the standard definition of equality between labeled trees.
Suppose is the sequence of actions of and is the sequence of actions of . Let be the least position such that . By (4), and also differ at position (at least). Then, (contradiction).
∎
0.A.5 Example: Accepting Actree of
We give a graphical depiction of the accepting actree of . Recall that corresponds to the member of the family that has states and , and stack symbols and \$$. Figure [2](#Pt0.A1.F2) represents t$ which has been split for layout reasons.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] A. Bouajjani, J. Esparza, and O. Maler. Reachability analysis of pushdown automata: Application to model-checking. In CONCUR , pages 135–150. Springer, 1997.
- 2[2] M. Charikar, E. Lehman, D. Liu, R. Panigrahy, M. Prabhakaran, A. Sahai, and A. Shelat. The smallest grammar problem. IEEE Transactions on Information Theory , 51(7):2554–2576, 2005.
- 3[3] D. Chistikov and R. Majumdar. Unary pushdown automata and straight-line programs. In ICALP , pages 146–157. Springer, 2014.
- 4[4] J. Esparza, P. Ganty, S. Kiefer, and M. Luttenberger. Parikh’s theorem: A simple and direct automaton construction. IPL , pages 614–619, 2011.
- 5[5] J. Esparza, M. Luttenberger, and M. Schlund. A brief history of strahler numbers. In LATA , pages 1–13. Springer, 2014.
- 6[6] A. Finkel, B. Willems, and P. Wolper. A direct symbolic approach to model checking pushdown systems (extended abstract). Electronic Notes in Theoretical Computer Science , 9:27–37, 1997.
- 7[7] P. Ganty and D. Valput. Bounded-oscillation pushdown automata. EPTCS , pages 178–197, 2016. Gand ALF.
- 8[8] J. Goldstine, J. K. Price, and D. Wotschke. A pushdown automaton or a context-free grammar: Which is more economical? Theoretical Computer Science , pages 33–40, 1982.
