Pushing for weighted tree automata
Thomas Hanneforth, Andreas Maletti, Daniel Quernheim

TL;DR
This paper introduces a weight normalization procedure called pushing for weighted tree automata over commutative semifields, enabling efficient minimization and equivalence testing, especially for bottom-up deterministic automata.
Contribution
It presents a novel pushing normalization method for weighted tree automata that preserves recognized languages and improves equivalence testing efficiency.
Findings
Normalization preserves recognized weighted tree languages.
New equivalence test runs in near-linear time, improving over previous methods.
Reduction to unweighted automata simplifies minimization and testing.
Abstract
A weight normalization procedure, commonly called pushing, is introduced for weighted tree automata (wta) over commutative semifields. The normalization preserves the recognized weighted tree language even for nondeterministic wta, but it is most useful for bottom-up deterministic wta, where it can be used for minimization and equivalence testing. In both applications a careful selection of the weights to be redistributed followed by normalization allows a reduction of the general problem to the corresponding problem for bottom-up deterministic unweighted tree automata. This approach was already successfully used by Mohri and Eisner for the minimization of deterministic weighted string automata. Moreover, the new equivalence test for two wta and runs in time , where and are…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
Topicssemigroups and automata theory · Natural Language Processing Techniques · Machine Learning and Algorithms
\lmcsheading
1–LABEL:LastPageFeb. 02, 2017Jan. 16, 2018
\ACMCCS[Theory of computation]: Formal languages and automata theory — Tree languages; [Theory of computation]: Formal languages and automata — Automata extensions — Quantitative automata \amsclass68Q45, 68Q25
\titlecomment\lsuper
*This is a revised and extended version of [Maletti, Quernheim: Pushing for weighted tree automata. Proc. 36th Int. Conf. Mathematical Foundations of Computer Science, LNCS 6907, p. 460–471, 2011].
Pushing for weighted tree automata\rsuper*
— dedicated to the memory of Zoltán Ésik (1951–2016) —
Thomas Hanneforth
Universität Potsdam, Human Sciences Faculty, Department Linguistik Karl-Liebknecht-Str. 24–25, 14476 Potsdam, Germany
,
Andreas Maletti
Universität Leipzig, Faculty of Mathematics and Computer Science, Institute of Computer Science PO box 100 920, 04009 Leipzig, Germany
and
Daniel Quernheim
Universität Stuttgart, Institute for Natural Language Processing Pfaffenwaldring 5b, 70569 Stuttgart, Germany
Abstract.
A weight normalization procedure, commonly called pushing, is introduced for weighted tree automata (wta) over commutative semifields. The normalization preserves the recognized weighted tree language even for nondeterministic wta, but it is most useful for bottom-up deterministic wta, where it can be used for minimization and equivalence testing. In both applications a careful selection of the weights to be redistributed followed by normalization allows a reduction of the general problem to the corresponding problem for bottom-up deterministic unweighted tree automata. This approach was already successfully used by Mohri and Eisner for the minimization of deterministic weighted string automata. Moreover, the new equivalence test for two wta and runs in time \mathcal{O}\bigl{(}(\lvert M\rvert+\lvert M^{\prime}\rvert)\log{(\lvert Q\rvert+\lvert Q^{\prime}\rvert)}\bigr{)}, where and are the states of and , respectively, which improves the previously best run-time \mathcal{O}\bigl{(}\lvert M\rvert\cdot\lvert M^{\prime}\rvert\bigr{)}.
Key words and phrases:
pushing — weighted tree automaton — minimization — equivalence testing
Financially supported by the German Research Foundation (DFG) grant MA / 4959 / 1-1.
1. Introduction
Weighted tree automata [FV09] have recently found various applications in fields as diverse as natural language and XML processing [KM09], system verification [Jac11], and pattern recognition. Most applications require efficient algorithms for basic manipulations of tree automata such as determinization [BMV10], inference [MKV10], and minimization [HMM09, HMM07]. For example, in the system verification domain the properties to be verified are typically easily expressed as a formula in a logic. It is well-known [TW68] that tree automata are as expressive as monadic second-order logic with two successors. This celebrated result was recently generalized to the weighted setting for various weight structures [DV06, Man08, DGMM11, VDH16], so quantitative specifications are readily available. However, one of the main insights gained in the development of the mona toolkit [KM01] (or the spass system [WDF*+*09]) was that the transformation of a formula into an equivalent tree automaton heavily relies on the minimization of the constructed deterministic tree automata as the automata otherwise grow far too quickly. Similarly, a major inference setup, also used in the synthesis subfield in system verification, is Angluin’s minimally adequate teacher setup [Ang87]. In this setup, the learner is given access to an oracle that correctly supplies coefficients of trees in the weighted tree language to be learned, which are called coefficient queries, and certificates that the proposed weighted tree automaton indeed represents the weighted tree language to be learned, which are called equivalence queries. In implementations of the oracle the latter queries are typically answered by equivalence tests.
As already mentioned, quantitative models have recently enjoyed a lot of attention. For example, in natural language processing, weighted devices are often used to model probabilities, cost functions, or other features. In this contribution, we consider pushing [Moh97, Eis03] for weighted tree automata [BR82, FV09] over commutative semifields [HW98, Gol99]. Roughly speaking, pushing moves transition weights along a path. If the weights are properly selected, then pushing can be used to canonicalize a (bottom-up) deterministic weighted tree automaton [Bor05]. The obtained canonical representation has the benefit that it can be minimized using unweighted minimization, in which the weight is treated as a transition label. This strategy has successfully been employed in [Moh97, Eis03] for deterministic weighted (finite-state) string automata, and similar approaches have been used to minimize sequential transducers [Cho03] and bottom-up tree transducers [FSM11]. Here we adapt the strategy for tree automata. In particular, we improve the currently best minimization algorithm [Mal09] for a deterministic weighted tree automaton with states from \mathcal{O}\bigl{(}\lvert M\rvert\cdot\lvert Q\rvert\bigr{)} to \mathcal{O}\bigl{(}\lvert M\rvert\log{\lvert Q\rvert}\bigr{)}, which coincides with the complexity of minimization in the unweighted case [HMM09]. The improvement is achieved by a careful selection of the signs of life [Mal09]. Intuitively, a sign of life for a state is a context that takes into a final state. In [Mal09] the signs of life are computed by a straightforward exploration algorithm, which is very efficient, but does not guarantee that states that are later checked for equivalence receive the same sign of life. During the (pair-wise) equivalence checks in [Mal09] the evaluation of the weight of a state in the sign of life of another state thus becomes unavoidable, which causes the increased complexity. In this contribution, we precompute an equivalence relation, which, in general, is still coarser than the state equivalence to be determined, but equivalent states in this equivalence permit the same sign of life. Then we determine a sign of life for each equivalence class. Later we only refine this equivalence relation to obtain the state equivalence, so each state will only be evaluated in its sign of life and this evaluation can be precomputed. Moreover, the weights obtained in this evaluation, also called pushing weights, allow a proper canonicalization in the sense that equivalent states will have exactly the same weights on corresponding transitions after pushing. This property sets our algorithm apart from Algorithm 1 of [Mal09] and allows us to rely on unweighted minimization [HMM09]. Our pushing procedure, which is defined for general (potentially nondeterministic) weighted tree automata, always preserves the semantics, so it might also be useful in other setups.
Secondly, we apply pushing to the problem of testing equivalence. The currently fastest algorithm [DHM11] for checking equivalence of two deterministic weighted tree automata and runs in time \mathcal{O}\bigl{(}\lvert M\rvert\cdot\lvert M^{\prime}\rvert\bigr{)}. It is well known that two minimal deterministic weighted tree automata and are equivalent if and only if they can be obtained from each other by a pushing operation (with proper pushing weights). In other words, equivalent automata and have the same transition structure, but their transition weights can differ by consistent factors. We extend our approach to minimization also to equivalence testing, so we again carefully determine the pushing weight and the sign of life of each state of such that it shares the sign of life with all equivalent states of but also with all corresponding states in . This allows us to minimize both input automata and then treat the obtained automata as unweighted automata and test them for isomorphism. This approach reduces the run-time complexity to \mathcal{O}\bigl{(}(\lvert M\rvert+\lvert M^{\prime}\rvert)\log{(\lvert Q\rvert+\lvert Q^{\prime}\rvert)}\bigr{)}, where and are the states of and , respectively.
2. Preliminaries
We write for the set of all nonnegative integers and for its subset given . The -fold Cartesian product of a set is written as , and the empty tuple is often written as . Every finite and nonempty set is also called alphabet, of which the elements are called symbols. A ranked alphabet consists of an alphabet and a mapping , which assigns a rank to each symbol. If the ranking ‘’ is obvious from the context, then we simply write for the ranked alphabet. For each , we let be the set of -ary symbols of . Moreover, we let . The set of all -trees indexed by is inductively defined to be the smallest set such that and . Instead of we simply write . The size of a tree is inductively defined by for every and for every , , and . To increase readability, we often omit quantifications like “for all ” if they are obvious from the context.
We reserve the use of a special symbol that is not an element in any considered alphabet. Its function is to mark a designated position in certain trees called contexts. Formally, the set of all -contexts indexed by is defined as the smallest set such that and for every , , , and . As before, we simplify to . In simple words, a context is a tree, in which the special symbol occurs exactly once and at a leaf position. Note that , but , which allows us to treat contexts like trees. Given and , the tree is obtained from by replacing the unique occurrence of in by . In particular, given that .
A commutative semiring [HW98, Gol99] is a tuple such that and are commutative monoids and and for all (i.e., distributes over ). It is a commutative semifield if is a commutative group (i.e., in addition, for every there exists such that ). Typical commutative semifields include
- •
the Boolean semifield ,
- •
the field of rational numbers, and
- •
the Viterbi semifield , where .
Given a mapping , we write for the set of elements that are mapped via to a non-zero semiring element.
*For the rest of the paper, let be a commutative semifield.*111Clearly, weighted tree automata can also be defined for semirings or even more general weight structures, but already minimization for deterministic finite-state string automata becomes NP-hard for simple semirings that are not semifields (see [Eis03, Section 3]).
A weighted tree automaton [BLB83, Boz99, Kui98, BV03, Bor05, FV09] (for short: wta) is a tuple , in which
- •
is an alphabet of states,
- •
is a ranked alphabet of input symbols,
- •
assigns a weight to each transition, and
- •
is a set of final states.
We often write elements of as instead of . The size of the wta is
[TABLE]
We extend the transition weight assignment to a mapping by
[TABLE]
for all , , and . The wta recognizes the weighted tree language such that for every . Two wta and are equivalent if their recognized weighted tree languages coincide. The unweighted (finite-state) tree automaton [GS84, GS97, CDG*+*07] (for short: fta) corresponding to is .222An fta computes in the same manner as a wta over the Boolean semifield . We note that , where is the tree language recognized by the fta .
The wta is (bottom-up) deterministic (or a dwta) if for every there exists at most one such that . In other words, a wta is deterministic if and only if is bottom-up deterministic. In a dwta we can (without loss of information) treat and as partial mappings and . We use and as well as and for the corresponding projections to the first and second output component, respectively (e.g., and ). To avoid complicated distinctions, we treat undefinedness like a value (i.e., it is equal to itself, but different from every other value). We observe that for each dwta .333The statement holds because each commutative semifield is zero-divisor free [Bor03, Lemma 1]. Moreover, the restriction to final states instead of final weights in the definition of a wta does not restrict the expressive power [Bor05, Lemma 6.1.4], which applies to both wta and dwta. In addition, the transformation of a wta with final weights into an equivalent wta with final states does not add additional states, so all our results also apply to wta with final weights.
An equivalence relation on a set is a reflexive, symmetric, and transitive subset of . The equivalence class (or block) of the element is , and we let for every . Whenever is obvious from the context, we simply omit it. The equivalence respects a set if or for every (i.e., each equivalence class is either completely in or completely outside ).
Let be a dwta. An equivalence relation is a congruence (of ) if for every and all equivalent states . Note that this definition of congruence completely disregards the weights, which yields that is a congruence for if and only if is a congruence for . Two states are weakly equivalent, written as , if if and only if for all contexts . In other words, weak equivalence coincides with classical equivalence [GS84, Definition II.6.8] for . Consequently, the weak equivalence relation is actually a congruence of that respects [GS84, Theorem II.6.10]. The weak equivalence relation can be computed in time \mathcal{O}\bigl{(}\lvert M\rvert\log{\lvert Q\rvert}\bigr{)} [HMM09]. Finally, two states are (strongly) equivalent, written as if there exists a factor such that for all we have
[TABLE]
where is the characteristic function of ; i.e., if and only if for all . The equivalence relation is called the Myhill-Nerode equivalence relation [Mal09, Definition 3]. It is also a congruence that respects [Mal09, Lemma 4]. If is clear from the context, then we just write instead of .
3. Signs of life
First, we demonstrate how to efficiently compute signs of life (Definition 3), which are evidence that a final state can be reached. Together with these signs of life we also compute a pushing weight for each state (Section 4). Our Algorithm 1 is a straightforward extension of [Mal09, Algorithm 1] that computes on equivalence classes of states (with respect to a congruence that respects finality) instead of states.444Note that our algorithm is not simply the previous algorithm executed on the quotient dwta with respect to the congruence. The original dwta is used essentially in the computation of the pushing weights. This change guarantees that equivalent states receive the same sign of life, which is an essential requirement for the algorithms in Sections 5 and 6.
Before we start we need to recall the definition of a sign of life [Mal09]. In addition, we recall the relevant properties that we use in our algorithm. For the rest of this section, let be a dwta.
{defi}
[[Mal09, Section 2]]
A context is a sign of life for the state if . Any state that has a sign of life is live; otherwise it is dead.
The following lemma justifies that we can compute signs of life for equivalence classes of congruences that respect instead of individual states since all states of such an equivalence class share the same signs of life.
Lemma 1** (see [Mal09, Lemma 9]).**
We have for every congruence that respects . In particular, . Moreover, every sign of life for is also a sign of life for every .
Proof 3.1**.**
It is known that is the coarsest congruence that respects [GS84, Theorem II.6.10].555Mind that coincides with classical equivalence on and that our notion of congruence completely disregards the weights. Consequently, and since we already remarked that is also a congruence that respects . Based on the definition of it is trivial to see that all elements of an equivalence class of share the same signs of life [Mal09, Lemma 9]. Since we obtain the desired statement.
Algorithm 1 simply attempts to reach all states from the final states computing a context that takes the state to a final state (i.e., a sign of life) as well as its weight in the process. Due to Lemma 1 the signs of life are computed for equivalence classes (or blocks) instead of individual states. Now let us explain Algorithm 1 in detail. Every final state is trivially live as evidenced by the trivial sign of life . Since the congruence respects , the set contains equivalence classes that contain only final states. We set the sign of life for each class to [see Line 3], and for each involved state we set its pushing weight to [see Line 4]. Overall, this initialization takes time \mathcal{O}\bigl{(}\lvert F\rvert\bigr{)}. Next, we add all those blocks to the live states and to the blocks yet to be explored. As long as there are still unexplored blocks, we select a block from and remove it from . Then we consider all transitions that end in a state that belongs to the block and check whether it contains a source state that is not yet present in . For each such source state , we add its equivalence class to both and . Then we set the sign of life for this class to the sign of life for extended by the considered transition [see Line 12]. Finally, we select each state from and compute a pushing weight by multiplying the weight of the currently considered transition with replaced by to the already computed pushing weight for the target state reached by the modified transition [see Line 13].
Theorem 2**.**
Algorithm 1 is correct and runs in time \mathcal{O}\bigl{(}\lvert M\rvert+\lvert Q\rvert\bigr{)}.
Proof 3.2**.**
Since our algorithm is similar to the one of [Mal09], our proof closely resembles the proofs of [Mal09, Lemma 10 and Theorem 11] adjusted to equivalence classes. We already argued that the initialization runs in time \mathcal{O}\bigl{(}\lvert F\rvert\bigr{)}\subseteq\mathcal{O}\bigl{(}\lvert Q\rvert\bigr{)}. It is easy to see that at all times in the main loop [Line 6–13] of the algorithm. Consequently, each block can be added at most once to since it is added at the same time to and only blocks not in can be added to . This yields that the main loop executes at most times. The inner loop [Line 9–13] can execute at most times since each transition is considered at most once in the middle loop and at most once for each source state of the transition. The statements in the inner loop all execute in constant time except for Line 13, which can be executed once for each state . Overall, we thus obtain the running time \mathcal{O}\bigl{(}\lvert M\rvert+\lvert Q\rvert\bigr{)}.
Now let us prove the post-conditions. By Lemma 1 we know that signs of life are shared between elements in an equivalence class of . The remaining statements are proved by induction along the outer main loop. Initially, we set
[TABLE]
by Lines 3–4, which proves the post-condition because . In the main loop, we set in Line 13. The equivalence class of has already been explored in a previous iteration because , which by the congruence property yields and the latter was in the explored equivalence class , which in turn yields that the former is in . Consequently, we can employ the induction hypothesis and obtain . In addition,
[TABLE]
which proves the post-condition because by Line 12. Clearly, is a sign of life for , which proves that is live. Finally, suppose that there is a live state such that (i.e., we assume a live state that is not classified as such by Algorithm 1). Since it is live, it has a sign of life . By induction on we can prove that, when processing , there exists a transition that uses a source state such that , whereas the target state is such that .666Such a switch must exist because all the final states are represented in . However, since was explored, the considered transition was considered in the algorithm, which means that the equivalence class was added to . This contradicts the assumption, which shows that all states that are not represented in are indeed dead.
{exa}
Our example dwta is depicted left in Figure 1. For any transition (small circle, the annotation specifies the input symbol and the weight separated by a colon), the arrow leads to the target state and the source states have been arranged in a counter-clockwise fashion starting from the target arrow. For example, the bottom center transition labeled in the left dwta of Figure 1 corresponds to ; i.e., its target state is , its symbol is , its source states are (in this order), and its weight is . As usual, final states are doubly circled. The graphical representation of wta is explained in detail in [Bor05]. The coarsest congruence respecting is represented by the set of equivalence classes (i.e., partition). We use this congruence in Algorithm 1. First, the block of final states is marked as live and added to . It is assigned the trivial context as sign of life and each final state is assigned the trivial weight . Clearly, we can only select one equivalence class in the main loop. Let us consider the transition , whose target state is in . Since has not yet been marked as live, we add it to both and . In addition, we set its sign of life to . Finally, we set the pushing weights to and . Now all states are live, so the loops will terminate. Consequently, we have computed all signs of life and the pushing weights
[TABLE]
4. Pushing
The Myhill-Nerode congruence requires that there is a unique scaling factor for every pair of equivalent states. Thus, any fixed sign of life for both and [for which ] yields non-zero weights and , which can be used to determine this unique scaling factor between and . In fact, we already computed those weights and in Algorithm 1. By Lemma 1, states that are not weakly equivalent (and thus might not have the same sign of life after executing Algorithm 1 with ) also cannot be equivalent. For the remaining pairs of live states, we computed a sign of life for the equivalence class of in the previous section. In addition, we computed pushing weights and . Now, we will use these weights to normalize the wta by pushing [Moh97, Eis03, PG09]. Intuitively, pushing cancels the scaling factor for equivalent states, which we will prove in the next section. In general, it just redistributes weights along the transitions. In weighted (finite-state) string automata [Sak09], pushing is performed from the final states towards the initial states [Moh97]. Since we work with bottom-up wta [Bor05] (i.e., our notion of determinism is bottom-up), this works analogously here by moving weights from the root towards the leaves. However, we introduce our notion of pushing for arbitrary, not necessarily deterministic wta. To this end, we lift the corresponding definition [Moh97, page 296] from string to tree automata.
In this section, let be an arbitrary wta and be an arbitrary mapping such that for every .
{defi}
The pushed wta is such that for every and
[TABLE]
The mapping indicates the pushed weights. It is non-zero everywhere and has to be for final states because our model does not have final weights.777As already mentioned, the restriction to final states is a convenience and not an essential restriction. In the pushed wta , the weight of every transition leading to the state is obtained from the weight of the corresponding transition in by multiplying the weight . To compensate, the weight of every transition leaving the state will cancel the weight by multiplying with . Thus, we expect an equivalent wta after pushing, which we confirm by showing that and are indeed equivalent. The corresponding statement for string automata is [Moh97, Lemma 4].
Proposition 3**.**
The wta and are equivalent. Moreover, if is deterministic, then so is .
Proof 4.1**.**
Let . The preservation of determinism is obvious because .888In fact, because semifields are zero-divisor free [Bor03, Lemma 1]. We prove that for every and by induction on . Let for some and . By the induction hypothesis, we have for every and . Consequently,
[TABLE]
We complete the proof as follows.
[TABLE]
because for every .
Theorem 4**.**
The wta is equivalent to and can be obtained in time \mathcal{O}\bigl{(}\lvert M\rvert\bigr{)}.
{exa}
Let us return to our example dwta left in Figure 1 and perform pushing. The pushing weights are given in Example 3. We consider the transition , which has weight in . In this transition has the weight
[TABLE]
The dwta is presented right in Figure 1. With a little effort, we can confirm that and are equivalent in , whereas and are not.
5. Minimization
Our main application of weight pushing is efficient dwta minimization, which we present next. The overall structure of our minimization procedure is presented in Algorithm 2. As mentioned earlier, the coarsest congruence for a dwta that respects can be obtained by minimization [HMM09] of . We call this procedure ComputeCoarsestCongruence and supply it with a dwta and an equivalence relation. It returns the coarsest congruence (of ) that refines the given equivalence relation.
Let be a dwta (without useless states) and be the pushing weights computed by Algorithm 1 when run on and .999In a dwta without useless states we have . In addition, we let .
The dwta has the property that for all and states for every . We will prove this property (1) in Lemma 5. It is this property, which, in analogy to the string case [Moh97, Eis03], allows us to compute the equivalence on an unweighted fta , in which we treat the transition weight as part of the input symbol. For example, the algorithm of [HMM09] can then be used to compute . Finally, we merge the equivalent states using the information about the scaling factors contained in the pushing weights in the same way as in [Mal09]. Let us start with the formal definitions.101010We avoid a change of the weight structure from our semifield to the Boolean semifield since the multiplicative submonoid induced by is isomorphic to the multiplicative monoid of . Thus, our dwta with weights in compute in the same manner as a dwta over or equivalently a deterministic fta.
{defi}
Let be a dwta, and let be the finite set of non-zero weights that occur as transition weights in . The alphabetic dwta for is , where
- •
for every and ,
- •
for every , and
- •
for every , , and
[TABLE]
Clearly, the construction of can be performed in time \mathcal{O}\bigl{(}\lvert M\rvert\bigr{)}. Next, we show that the equivalence in coincides with the equivalence in , where . We achieve this proof by showing both inclusions.
Lemma 5**.**
The congruence of is a congruence of that respects .
Proof 5.1**.**
Let . Since and have the same final states , trivially respects because it is a congruence of that respects . Naturally, is an equivalence, so it remains to prove the congruence property for . Let and for every . Then
[TABLE]
because is a congruence of . For the moment, let us assume that111111Mind that we compare the weights in here.
[TABLE]
then
[TABLE]
For all the remaining combinations of we have that both and are undefined and thus equal. We have thus proved the congruence property given the assumption. Consequently, it remains to show that the assumption
[TABLE]
is true. By Definition 4, we have
[TABLE]
Now we prove that
[TABLE]
for every , where . Let and . Since , we also have that because is a congruence of . This yields that by Lemma 1. Let be a sign of life for both and . Moreover, we have a constant scaling factor between the equivalent states and , which yields
[TABLE]
where holds because is a sign of life for both and and holds essentially by definition. With these equations, let us inspect the main equality.
[TABLE]
Now we are ready to return to the proof obligation expressed in (1). We apply (4) in total times to obtain the desired statement.
[TABLE]
which completes the proof.
Theorem 6**.**
We have , where .
Proof 5.2**.**
Lemma 5 shows that is a congruence of that respects . Since is the coarsest congruence of that respects by [GS84, Theorem II.6.10], we obtain that . The converse is simple to prove as states that are weakly equivalent in share exactly the same signs of life with the scaling factor . Since the signs of life already indicate the transition weights, we immediately obtain that such weakly equivalent states in have corresponding transitions with equal transition weights in , which proves that those states are also equivalent in with the scaling factor . The latter statement can then be used to prove that they are also equivalent in (with a scaling factor that is potentially different from ).
The currently fastest dwta minimization algorithm [Mal09] runs in time \mathcal{O}\bigl{(}\lvert M\rvert\cdot\lvert Q\rvert\bigr{)}. Our approach, which relies on pushing and is presented in Algorithm 2, achieves the same run-time \mathcal{O}\bigl{(}\lvert M\rvert\log{\lvert Q\rvert}\bigr{)} as the fastest minimization algorithms for deterministic fta.
Corollary 7** (see Algorithm 2).**
For every dwta , we can compute an equivalent minimal dwta in time \mathcal{O}\bigl{(}\lvert M\rvert\log{\lvert Q\rvert}\bigr{)}.
6. Testing equivalence
In this final section, we want to decide whether two given dwta are equivalent. To this end, let and be dwta. The overall approach is presented in Alg. 3. First, we compute a correspondence between states. For every , we compute a tree , which is also called access tree for , such that . If no access tree exists, then is not reachable and can be deleted. A dwta, in which all states are reachable, is called accessible. To avoid these details, let us assume that and are accessible, which can always be achieved in time \mathcal{O}\bigl{(}\lvert M\rvert+\lvert M^{\prime}\rvert\bigr{)}. In this case, we can compute an access tree for every state in time \mathcal{O}\bigl{(}\lvert M\rvert\bigr{)} using standard breadth-first search, in which we unfold each state (i.e., explore all transitions leading to it) at most once. To keep the representation efficient, we store the access trees in the format , where each state refers to its access tree . To obtain the corresponding state , we compute the state of that is reached when processing the access tree . Formally, for every . This computation can also be achieved in time \mathcal{O}\bigl{(}\lvert M\rvert\bigr{)} since we can reuse the results for the subtrees. Consequently, we have that and for every . Clearly, the computation of the access trees and the correspondence can be performed in time \mathcal{O}\bigl{(}\lvert M\rvert\bigr{)}. Next, we compute the coarsest congruences and for and that respect and , respectively, and the signs of life for .
Lemma 8**.**
Let and be equivalent. The correspondence is compatible with the congruences and ; i.e., if and only if for all . Moreover, for every reachable there exists such that . Consequently, induces a bijection on the equivalence classes.
Proof 6.1**.**
Let , and let and be the corresponding access trees. Then
[TABLE]
where follows from [Mal08, Lemma 4] and follows from the easy fact that if and only if for all and such that .
For the second statement, let be a reachable state, and let be such that . Clearly, we have
[TABLE]
where . Consequently, using we obtain .
We just demonstrated that for equivalent dwta the correspondence always yields a bijection . We can test the compatibility in time \mathcal{O}\bigl{(}\lvert Q\rvert+\lvert Q^{\prime}\rvert\bigr{)}. Next we transfer the signs of life via to the equivalence classes of and calculate the corresponding pushing weights for all states . Since the signs of life can contain states of , we need to rename them using the correspondence , so we use the function , which is defined by , for all , and for all and trees . We note that for all .
Using this approach corresponding equivalence classes receive the same sign of life (modulo the renaming of the states). We then minimize and using the method of Section 5 (i.e., we perform pushing followed by unweighted minimization). Finally, we test the obtained deterministic fta for isomorphism.
Lemma 9**.**
We use the symbols of Algorithm 3. Given a compatible correspondence , the dwta and are equivalent if and only if the deterministic unweighted fta and are equivalent.
Proof 6.2**.**
Clearly, if the deterministic fta and are equivalent, then also and are equivalent since the weights are annotated on the symbols of the former devices. Moreover, since pushing preserves the semantics (see Proposition 3), also the dwta and are equivalent, which concludes one direction. For the other direction, let and be equivalent. Then also and are equivalent due to Proposition 3. An easy adaptation of the proof (of the equality (1) of the transition weights) of Lemma 5 can be used to show that the transition weights of corresponding transitions are equal and hence and are equivalent.
Lemma 9 proves the correctness of Algorithm 3 because the minimal deterministic fta for a given tree language is unique (up to isomorphism) [GS84, Theorem 2.11.12]. The run-time of our algorithm should be compared to the previously (asymptotically) fastest equivalence test for dwta of [DHM11], which runs in time \mathcal{O}\bigl{(}\lvert M\rvert\cdot\lvert M^{\prime}\rvert\bigr{)}.
Theorem 10**.**
We can test equivalence of and in time \mathcal{O}\bigl{(}(\lvert M\rvert+\lvert M^{\prime}\rvert)\log{(\lvert Q\rvert+\lvert Q^{\prime}\rvert)}\bigr{)}.
Acknowledgments
The authors gratefully acknowledge the insight and suggestions provided by the reviewers of the conference and the current version.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[Ang 87] Dana Angluin. Learning regular sets from queries and counterexamples. Information and Computation , 75(2):87–106, 1987.
- 2[BLB 83] Symeon Bozapalidis and Olympia Louscou-Bozapalidou. The rank of a formal tree power series. Theoretical Computer Science , 27(1–2):211–215, 1983.
- 3[BMV 10] Matthias Büchse, Jonathan May, and Heiko Vogler. Determinization of weighted tree automata using factorizations. Journal of Automata, Languages and Combinatorics , 15(3–4):229–254, 2010.
- 4[Bor 03] Björn Borchardt. The Myhill-Nerode theorem for recognizable tree series. In Proc. 7th Int. Conf. Developments in Language Theory , volume 2710 of Lecture Notes in Computer Science , pages 146–158. Springer, 2003.
- 5[Bor 05] Björn Borchardt. The Theory of Recognizable Tree Series . Ph D thesis, TU Dresden, 2005.
- 6[Boz 99] Symeon Bozapalidis. Equational elements in additive algebras. Theory of Computing Systems , 32(1):1–33, 1999.
- 7[BR 82] Jean Berstel and Christophe Reutenauer. Recognizable formal power series on trees. Theoretical Computer Science , 18(2):115–148, 1982.
- 8[BV 03] Björn Borchardt and Heiko Vogler. Determinization of finite state weighted tree automata. Journal of Automata, Languages and Combinatorics , 8(3):417–463, 2003.
