Simulations in Rank-Based B\"uchi Automata Complementation (Technical Report)
Yu-Fang Chen, Vojt\v{e}ch Havlena, Ond\v{r}ej Leng\'al

TL;DR
This paper introduces simulation-based techniques to optimize the size of automata produced by rank-based B"uchi automata complementation, making the process more efficient in practice.
Contribution
It proposes novel methods using simulation relations to reduce automaton size in rank-based complementation, improving practical efficiency.
Findings
Techniques significantly reduce automaton size in experiments.
Methods effectively ignore non-contributing macrostates.
Saturation with simulation-smaller states decreases macrostate count.
Abstract
Complementation of B\"uchi automata is an essential technique used in some approaches for termination analysis of programs. The long search for an optimal complementation construction climaxed with the work of Schewe, who proposed a worst-case optimal rank-based procedure that generates complements of a size matching the theoretical lower bound of , modulo a polynomial factor of . Although worst-case optimal, the procedure in many cases produces automata that are unnecessarily large. In this paper, we propose several ways of how to use the direct and delayed simulation relations to reduce the size of the automaton obtained in the rank-based complementation procedure. Our techniques are based on either (i) ignoring macrostates that cannot be used for accepting a word in the complement or (ii) saturating macrostates with simulation-smaller states, in order to decrease…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
Topicssemigroups and automata theory · Formal Methods in Verification · Logic, programming, and type systems
11institutetext: Academia Sinica, Taiwan 22institutetext: FIT, IT4I Centre of Excellence, Brno University of Technology, Czech Republic
Simulations in Rank-Based
Büchi Automata Complementation
(Technical Report)
Yu-Fang Chen and Vojtěch Havlena and Ondřej Lengál 112222
Abstract
Complementation of Büchi automata is an essential technique used in some approaches for termination analysis of programs. The long search for an optimal complementation construction climaxed with the work of Schewe, who proposed a worst-case optimal rank-based procedure that generates complements of a size matching the theoretical lower bound of , modulo a polynomial factor of . Although worst-case optimal, the procedure in many cases produces automata that are unnecessarily large. In this paper, we propose several ways of how to use the direct and delayed simulation relations to reduce the size of the automaton obtained in the rank-based complementation procedure. Our techniques are based on either (i) ignoring macrostates that cannot be used for accepting a word in the complement or (ii) saturating macrostates with simulation-smaller states, in order to decrease their total number. We experimentally showed that our techniques can indeed considerably decrease the size of the output of the complementation.
1 Introduction
Büchi automata (BA) complementation is a fundamental problem in program analysis and formal verification, from both theoretical and practical angles. It is, for instance, a critical step in some approaches for termination analysis, which is an essential part of establishing total correctness of programs [14, 19, 9]. Moreover, BA complementation is used as a component of decision procedures of some logics for reasoning about programs, such as S1S capturing a decidable fragment of second-order arithmetic [6] or the temporal logics ETL and QPTL [35].
The study of the BA complementation problem can be traced back to 1962, when Büchi introduced his automaton model in the seminal paper [6] in the context of a decision procedure for the S1S fragment of second-order arithmetic. In the paper, a doubly exponential complementation algorithm based on the infinite Ramsey theorem is proposed. In 1988, Safra [32] introduced a complementation procedure with an upper bound and, in the same year, Michel [28] established an lower bound. From the traditional theoretical point of view, the problem was already solved, since exponents in the two bounds matched under the notation (recall that is approximately ). From a more practical point of view, a linear factor in an exponent has a significant impact on real-world applications. It was established that the upper bound of Safra’s construction is , so the hunt for an optimal algorithm continued [38]. A series of research efforts participated in narrowing the gap [24, 15, 39, 23, 41]. The long journey climaxed with the result of Schewe [33], who proposed an optimal rank-based procedure that generates complements of a size matching the theoretical lower bound of found by Yan [41], modulo a polynomial factor of .
Although the algorithm of Schewe is worst-case optimal, it often generates unnecessarily large complements. The standard approach to alleviate this problem is to decrease the size of the input BA before the complementation starts. Since minimization of (nondeterministic) BAs is a PSpace-complete problem, more lightweight reduction methods are necessary. The most prevalent approaches are those based on various notions of simulation-based reduction, such as reductions based on direct simulation [7, 36], a richer delayed simulation [12], or their multi-pebble variants [13]. These approaches first compute a simulation relation over the input BA—which can be done with the time complexity [20, 22, 30, 31, 8] and [12] for direct and delayed simulation respectively, with the number of states and transitions —and then construct a quotient BA by merging simulation-equivalent states, while preserving the language of the input BA. The other approach is a reduction based on fair simulation [18]. The fair simulation cannot, however, be used for quotienting, but still it can be used for merging certain states and removing transitions. The reduced BA is used as the input of the complementation, which often significantly reduces the size of the result.
In this paper, we propose several ways of how to exploit the direct and delayed simulations in BA complementation even further to obtain smaller complements and shorter running times. We focus, in particular, on the optimal rank-based complementation procedure of Schewe [33]. Essentially, the rank-based construction is an extension of traditional subset construction for determinizing finite automata, with some additional information kept in each macrostate (a state in the complemented BA) to track the acceptance condition of all runs of the input automaton on a given word. In particular, it stores the rank of each state in a macrostate, which, informally, measures the distance to the last accepting state on the corresponding run in the input BA. The main contributions of this paper are the following optimisations of rank-based complementation for BAs, for an input BA and the output of the rank-based complementation algorithm .
Purging: We use simulation relations over to remove some useless macrostates during the construction of . In particular, if a state is simulated by in , this puts a restriction on the relation between the ranks of runs from and from . As a consequence, macrostates that assign ranks violating this restriction can be purged from . 2. 2.
Saturation: We saturate macrostates with states that are simulated by the macrostate; this can reduce the total number of states of because two or more macrostates can be mapped to a single saturated macrostate. This is inspired by the technique of Glabbeek and Ploeger that uses closures in finite automata determinization [17].
The proposed optimizations are orthogonal to simulation-based size reduction mentioned above. Since the quotienting methods are based on taking only the symmetric fragment of the simulation, i.e., they merge states that simulate each other, after the quotienting, there might still be many pairs where the simulation holds in only one way, and can therefore be exploited by our techniques. Since the considered notions of simulation-based quotienting preserve the respective simulations, our techniques can be used to optimize the complementation at no additional cost. Our experimental evaluation of the optimizations showed that in many cases, they indeed significantly reduce the size of the complemented BA.
2 Preliminaries
We fix a finite nonempty alphabet and the first infinite ordinal . For , by we denote the set . An (infinite) word is represented as a function where the -th symbol is denoted as . A finite word of length is represented as a function . The finite word of length [math] is denoted as . We abuse notation and sometimes also represent as an infinite sequence and as a finite sequence . The suffix of is denoted by . We use to denote the set of all infinite words over and to denote the set of all finite words. For we define and (note that ). Given , we use to denote the set .
A (nondeterministic) Büchi automaton (BA) over is a quadruple where is a finite set of states, is a transition function , and are the sets of initial and accepting states respectively. We sometimes treat as a set of transitions , for instance, we use to denote that . Moreover, we extend to sets of states as . A run of from on an input word is an infinite sequence that starts in and respects , i.e., and . We say that is accepting iff it contains infinitely many occurrences of some accepting state, i.e., . A word is accepted by from a state if there is an accepting run of from , i.e., . The set is called the language of (in ). Given a set of states , we define the language of as and the language of as . For a pair of states and in , we use to denote .
Without loss of generality, in this paper, we assume to be complete, i.e., for every state and symbol , it holds that . A trace over a word is an infinite sequence such that is a run of over from . We say is fair if it contains infinitely many accepting states. Moreover, we use for to denote that is reachable from over the word ; if a path from to over contains an accepting state, we can write . In this paper, we fix a complete BA .
2.1 Simulations
We introduce simulation relations between states of a BA using the game semantics in a similar manner as in the extensive study of Clemente and Mayr [26]. In particular, in a simulation game between two players (called Spoiler and Duplicator) in from a pair of states , for any (infinite) trace over a word that Spoiler takes starting from , Duplicator tries to mimic the trace starting from . On the other hand, Spoiler tries to find a trace that Duplicator cannot mimic. The game starts in the configuration and every -th round proceeds by, first, Spoiler choosing a transition and, second, Duplicator mimicking Spoiler by choosing a matching transition over the same symbol . The next game configuration is . Suppose that and are the two (infinite) traces constructed during the game. Duplicator wins the simulation game if holds, where is a condition that depends on the particular simulation. In the current paper, we consider the following simulation relations:
- •
direct [11]:
- •
delayed [12]: and
- •
fair [21]: if is fair, then is fair.
A maximal -simulation relation , for , is defined such that iff Duplicator has a winning strategy in the simulation game with the winning condition starting from . Formally, we define a strategy to be a (total) mapping such that , i.e., if Duplicator is in state and Spoiler selects a transition , the strategy picks a state such that (and because is complete, such a transition always exists). Note that Duplicator cannot look ahead at Spoiler’s future moves. We use to denote any winning strategy of Duplicator in the simulation game. Let and be a pair of winning strategies in the simulation game. We say that is dominated by if for all states and all transitions it holds that , and that is strictly dominated by if is dominated by and does not dominate . A strategy is dominating if it is not strictly dominated by any other strategy. Strategies are also lifted to traces as follows: let be as above, then where for all it holds that . The considered simulation relations form the following hierarchy: . Note that every maximal simulation relation is a preorder, i.e., reflexive and transitive.
2.2 Run DAGs
In this section, we recall the terminology from [33] (which is a minor modification of the terminology from [24]). We fix the definition of the run DAG of over a word to be a DAG (directed acyclic graph) of vertices and edges where
- •
s.t. iff there is a run of over with ,
- •
s.t. iff and .
Given as above, we will write to denote that . We call accepting if is an accepting state. is rejecting if it contains no path with infinitely many accepting vertices. A vertex is finite if the set of vertices reachable from is finite, infinite if it is not finite, and endangered if cannot reach an accepting vertex.
We assign ranks to vertices of run DAGs as follows: Let and . Repeat the following steps until the fixpoint or for at most steps, where is the number of states of .
- •
Set for all finite vertices of and let be minus the vertices with the rank .
- •
Set for all endangered vertices of and let be minus the vertices with the rank .
- •
Set .
For all vertices that have not been assigned a rank yet, we assign . (Note that since is complete, then .)
Lemma 1
If , then for all . Moreover, if , then there is a vertex s.t. .
Proof
Follows from Corollary 3.3 in [24]. ∎
3 Complementing Büchi Automata
We use as the starting point the complementation procedure of Schewe [33, Section 3.1], which we denote as Comp (the ‘S’ stands for ‘Schewe’). The procedure works with the notion of level rankings. Given , a (level) ranking is a function such that , i.e., assigns even ranks to accepting states of .111Note that our basic definitions slightly differs from the ones in Section 2.3 of [33]. This is because of a typo in [33]; indeed, if the procedure from [33] is implemented as is, the output does not accept the complement (there might be a macrostate where contains accepting states and is empty, and, therefore, the whole macrostate is accepting, which is wrong).
For a ranking , the rank of is defined as . For a set of states , we call to be -tight if (i) it has an odd rank , (ii) , and (iii) . A ranking is tight if it is -tight; we use to denote the set of all tight rankings. For a pair of rankings and , a set , and a symbol , we use iff for every and it holds that .
The Comp procedure constructs the BA whose components are defined as follows:
- •
where
- –
and
- –
Q_{2}=\begin{array}[t]{ll}\{(S,O,f,i)\in&2^{Q}\times 2^{Q}\times\mathcal{T}\times\{0,2,\ldots,2n-2\}\mid{}\\ &f\text{ is S-tight},O\subseteq S\cap f^{-1}(i)\},\end{array}
- •
,
- •
where
- –
such that ,
- –
such that , and
- –
such that iff , , is -tight, and
and if or
- *
and if , and
- •
.
Intuitively, Comp is an extension of the classical subset construction for determinization of finite automata. In particular, , and constitute the deterministic finite automaton obtained from using the subset construction. The automaton can, however, nondeterministically guess a point at which it will make a transition to a macrostate in the part; this guess corresponds to a level in the run DAG of the accepted word from which the ranks of all levels form an -tight ranking, where the component of the macrostate is again a subset from the subset construction. In the part, makes sure that in order for a word to be accepted by , all runs of over the word need to touch an accepting state only finitely many times. This is ensured by the component, which, roughly speaking, maps states to ranks of corresponding vertices in the run DAG over the given word. The component is used for a standard cut-point construction, and is used to make sure that all runs that have reached an accepting state in will eventually leave it (this can happen for different runs at a different point). The , and components were already present in [24]. The component was introduced by Schewe to improve the complexity of the construction; it is used to cycle over phases, where in each phase we focus on cut-points of a different rank. See [33] for a more elaborate exposition.
Proposition 1 (Corollary 3.3 in [33])
.
4 Purging Macrostates with Incompatible Rankings
Our first optimisation is based on removing from macrostates whose level ranking assigns some states of an unnecessarily high rank. Intuitively, when contains a state and a state such that is (directly) simulated by , i.e. , then needs to be at most . This is because in any word and its run DAG in , if and are at the same level of , then the ranks of their vertices and at the given level are either both (when ), or such that otherwise. This is because, intuitively, the DAG rooted in in is isomorphic to a subgraph of the DAG rooted in .
Formally, consider the following predicate on macrostates of :
[TABLE]
We modify Comp to purge macrostates that satisfy . That is, we create a new procedure Purgedi obtained from Comp by modifying the definition of such that all occurrences of are substituted by and
[TABLE]
We denote the BA obtained from Purgedi as . The following lemma, proved in Section 4.1 states the correctness of this construction.
Lemma 2 ()
**
The following natural question arises: Is it possible to extend the purging technique from direct simulation to other notions of simulation? For fair simulation, this cannot be done. The reason is that, for a pair of states and s.t. , it can happen that for a word , there can be a trace from over that finitely many times touches an accepting state (i.e., a vertex of in the corresponding run DAG can have any rank between [math] and ), while all traces from over can completely avoid touching any accepting state. From the point of view of fair simulation, these are both unfair traces, and, therefore, disregarded.
On the other hand, delayed simulation—which is often much richer than direct simulation—can be used, with a small change. Intuitively, the delayed simulation can be used because guarantees that on every level of trees in rooted in and in respectively, the rank of the vertex is at most by one larger than the rank of vertex (or by any number smaller). Formally, let be the following predicate on macrostates of :
[TABLE]
where for denotes the smallest even number greater or equal to and . Similarly as above, we create a new procedure, called Purgede, which is obtained from Comp by modifying the definition of such that all occurrences of are substituted by and
[TABLE]
We denote the BA obtained from Purgede as .
Lemma 3 ()
**
The use of in results in the fact that the two purging techniques are incomparable. For instance, consider a macrostate such that and . Then the macrostate will be purged in Purgedi, but not in Purgede.
The two techniques can, however, be easily combined into a third procedure Purgedi+de, when is substituted in Comp with defined as
[TABLE]
We denote the resulting BA as .
Lemma 4 ()
**
4.1 Proofs of Lemmas 2, 3, and 4
We first give a lemma that an -strategy preserves an -simulation .
Lemma 5
Let be an -simulation (for ). Then, the following holds: .
Proof
Let such that and , and let be a trace starting from with the first transition . From the definition of -simulation, there is a winning Duplicator strategy ; let and let be the first transition of . Let and be traces obtained from and by removing their first transitions. It is easy to see that if then also for any . It follows that is also a winning Duplicator strategy from . ∎
Next, we focus on delayed simulation and the proof of Lemma 3. In the next lemma, we show that if there is a pair of vertices on some level of the run DAG where one vertex delay-simulates the other one, there exists a relation between their rankings. This will be used to purge some useless rankings from the complemented BA.
Lemma 6
Let such that and be the run DAG of over . For all , it holds that .
Proof
Consider some and . First, suppose that . Since the rank can be at most , it will always hold that .
On the other hand, suppose that is finite, i.e., is not accepted by . Then, due to Lemma 1, . Because , it holds that is also not accepted by , and therefore also . We now need to show that .
Let be the sequence of run DAGs obtained from in the ranking procedure from Section 2.2. In the following text we use the abbreviation for . Since the rank of a node is given as the number s.t. , we will finish the proof of this lemma by proving the following claim:
Claim
Let and be s.t. and . Then .
Proof: We prove the claim by induction on .
- •
Base case: () Since we assume is complete, no vertex in is finite.
() We prove that if is endangered in , then is endangered in as well (so both would be removed in ). For the sake of contradiction, assume that is endangered in and is not. Therefore, since contains no finite vertices, there is an infinite path from s.t. contains at least one accepting state. In the following, we abuse notation and, given a strategy and a state , use to denote the path such that and , it holds that where for every . Since , there is a corresponding infinite path that also contains at least one accepting state. Therefore, is not endangered, a contradiction to the assumption, so we conclude that .
- •
Inductive step: We assume the claim holds for all and prove the inductive step for even and odd steps independently.
() We prove that if is finite in (and therefore would be removed in ), then either , or is also finite in . For the sake of contradiction, we assume that is finite in and that is in , but is not finite there (and, therefore, ). Since is not finite in , there is an infinite path from in . Because , it follows that there is an infinite path in ( is not in because is finite there). Using Lemma 5 (possibly multiple times) and the fact that is finite, we can find vertices in and in s.t. and is not in , therefore, for some . Because and it is not finite ( is infinite), it follows that for some , and since , we have that , implying , which is in contradiction to the induction hypothesis.
We prove that if is endangered in (and therefore would be removed in ), then either , or is removed at the latest in . For the sake of contradiction, assume that is endangered in while is removed later than in . Therefore, since contains no finite vertices (they were removed in the -th step), there is an infinite path from s.t. contains at least one accepting state. Because , there is a corresponding path from in that also contains at least one accepting state and moreover . Since has an infinite number of states (and at least one accepting), not all states from were removed in , i.e., there is at least one node with rank less or equal to . Using Lemma 5 (also possibly multiple times) we can hence find states in and in s.t. and is not in and has a rank less or equal to , therefore, for some . Because , it follows that for some , and, therefore, , which is in contradiction to the induction hypothesis.
This concludes the proof. ∎
Lemma 7
Let such that and be the run DAG of over . For all , it holds that .
Proof
Can be obtained as a simplified version of the proof of Lemma 6. ∎
We are now ready to prove Lemma 3.
See 3
Proof
Follows directly from the fact that is obtained by removing states from .
Let . As shown in the proof of Lemma 3.2 in [33], there are two cases. The first case is when all vertices of are finite, which we do not need to consider, since we assume complete automata.
The other case is when contains an infinite vertex. In this case, contains an accepting run
[TABLE]
with
- –
, and ,
- –
for all ,
and, for all ,
- –
if or
if , respectively,
- –
is the -tight level ranking that maps each to the rank of ,
- –
if or
if , respectively.
The ranks assigned by to states of match the ranks of the corresponding vertices in .
Using Lemma 6, we conclude that contains no macrostate where and for . Therefore, is also an accepting run in . (We use to refer to this paragraph later.) ∎
See 2
Proof
The same as for Lemma 3 with substituted by the following:
Using Lemma 7, we conclude that contains no macrostate where and for . So is also an accepting run in .∎
See 4
Proof
The same as for Lemma 3 with substituted by the following:
Using Lemmas 7 and 6, we conclude that contains no macrostate where either and , or and for . Therefore, is also an accepting run in . ∎
5 Saturation of Macrostates
Our second optimisation is inspired by an optimisation of determinisation of classical finite automata from [17, Section 5]. Their optimisation is based on saturating every constructed macrostate in the classical subset construction with all direct-simulation-smaller states. This can reduce the total number of states of the determinized automaton because two or more macrostates can be mapped to a single saturated macrostate. (In Section 5.2, we show why an analogue of their compression cannot be used.)
We show that a similar technique can be applied to BAs. We do not restrain ourselves to direct simulation, though, and generalize the technique to delayed simulation. In particular, in our optimisation, we saturate the components of macrostates obtained in Comp with all -smaller states. Formally, we modify Comp by substituting the definition of the constructed transition function with defined as follows:
- •
where
- –
with ,
- –
with , and
- –
with iff , , and
and if or
- *
and if ,
where . We denote the obtained procedure as Saturate and the obtained BA as .
Lemma 8 ()
**
Obviously, as direct simulation is stronger than delayed simulation, the previous technique can also use direct simulation only (e.g., when computing the full delayed simulation is computationally too demanding). Moreover, Saturate is also compatible with all Purgex algorithms for (because they just remove macrostates with incompatible rankings from )—we call the combined versions Purgex +Saturate and the complement BAs they output .
Lemma 9 ()
**
5.1 Proofs of Lemmas 8 and 9
We start with a lemma, used later, that talks about languages of states related by delayed simulation when there is a path between them.
Lemma 10
For such that , let and . Then .
Proof
First we prove the following claim:
Claim
For every word where , we can construct a trace over such that and for all .
Proof: We assign and construct the rest of by the following inductive construction.
- •
Base case: () From the assumption it holds that and . From Lemma 5 there is some s.t. and . We assign , so .
- •
Inductive step: Let be a prefix of a trace such that for every . From the transitivity of , it follows that . From Lemma 5 there is some s.t. and . We assign , so .
Consider a word such that for . We show that . According to the previous claim, we can construct a trace over s.t. and for all . Since , from Lemma 5 it follows that we can construct a trace s.t. for every . Because contains infinitely often a subword from , there is some such that and for . Note that it holds that . We can again use the claim above to construct a trace over such that and for all . Since , we can simulate from by a trace , and because , we know that will touch an accepting state in finitely many steps (this holds because is from , which are the words over which we can go from to and touch an accepting state). Consider such that is the first state after the accepting state that is one of the in . This reasoning could be repeated for all occurrences of a subword from in , therefore . ∎
Next, we give a lemma used for establishing correctness of saturating macrostates with -smaller states.
Lemma 11
Let such that and . Further, let where . Then .
Proof
Clear.
Consider some and an accepting trace in over . There are two cases:
( contains only finitely many transitions )
In this case, is of the form where is a finite prefix , for , and is an infinite trace from that does not contain any occurrence of the transition . We construct in a trace as follows. Let be a strategy for . We set , so . Since , it follows that there is such that . For , we set . By induction, it follows that , in particular . We set . Since starts in and contains infinitely many accepting states and starts in and , then also contains infinitely many accepting states. It follows that is accepting, so . 2. 2.
( contains infinitely many transitions )
In this case, is of the form , for and Since is accepting, for infinitely many , we have in and hence also in the original BA . Using Lemma 10 and the fact that , we have and hence . ∎
The following lemma guarantees that adding transitions in the way of Lemma 11 does not break the computed delayed simulation and can, therefore, be performed repeatedly, without the need to recompute the simulation.
Lemma 12
Let be the delayed simulation on . Further, let be such that and , and let where . Then is included in the delayed simulation on .
Proof
Let be a dominating strategy compatible with and be a strategy defined for all such that as when and . Note that is also dominating wrt . This can be shown by the following proof by contradiction: Suppose is not dominating; then there is a strategy such that must be simulated by . But then must also (transitivity of simulation) be simulated by , so is not dominating. Contradiction.
Further, let be such that . Let be a trace over in such that is an accepting state and does not contain any occurrence of . Further, let be a trace corresponding to a run over in , where , constructed as .
Claim
There is a trace over such that contains an accepting state and is -simulated by at every position.
Proof: We have the following two cases:
- •
( does not contain any occurrence of )
Let be a trace in over obtained from by starting with its prefix up to , taking , and continuing with . Since in , it holds that is at the same position as in , the first part of the claim holds. Further, clearly -simulates on , and because simulates by a transition to a state such that and is constructed using , then also the second part of the claim holds.
- •
( contains at least one occurrence of )
Suppose that starts with such that does not contain any . Then let us start building such that it starts with . On this prefix, is clearly -simulated by the corresponding prefix of . We continue from using the strategy . In particular, the next time we reach in while we are at some state such that , we simulate the transition by and so on. We can observe that when we arrive to in , we also arrive to in such that . Therefore, contains an accepting state. Moreover, since is dominating, the second part of the claim also holds.
From the claim above, it follows that the trace contains an accepting state, so . ∎
Finally, we are ready to prove Lemma 8.
See 8
Proof
Let and be an arbitrary accepting run over in such that . For the sake of contradiction, assume that , therefore, there is a run on in having infinitely many accepting states. From the fact that tight level rankings form a non-increasing sequence, we have that . This sequence eventually stabilizes and from the property of level rankings and the fact that is accepting, it stabilizes in some such that is even. This, however, means that the component of macrostates in cannot be emptied infinitely often, and, therefore, is not accepting, which is a contradiction. Hence , so (from Proposition 1) .
Consider some . Let be a BA obtained from by adding transitions according to Lemma 12. Then from Lemma 11, we have that . Therefore, where is the BA obtained from using Comp. It is easy to see that we can construct a run in that mimics the levels of run DAG of in (i.e., we are able to empty the component infinitely often). Hence . ∎
See 9
Proof
This part is the same as in the proof of Lemma 8.
Consider some . Let be a BA obtained from by adding transitions according to Lemma 12. Then from Lemma 11, we have that . Therefore, where is the BA obtained from using Comp. It is easy to see that we can construct a run in that mimics the levels of run DAG of in (i.e., we are able to empty the component infinitely often). Using Lemmas 7 and 6, we can conclude that the run contains no macrostate of the form , where and , or and for . Therefore, is also an accepting run in . Hence . ∎
5.2 Remarks on Compression of Macrostates
An analogy to saturation of macrostates is their compression [17, Section 6], based on removing simulation-smaller states from a macrostate. This is, however, not possible even for direct simulation, as we can see in the following example.
Example 1
Consider the BA over given below.
p$$q$$r$$a$$a$$a$$a$$a
For this BA we have and . If we compress the macrostates obtained in Comp, there is the following trace in the output automaton:
[TABLE]
This trace contains infinitely many final states (we flush the -set infinitely often), hence we are able to accept the word , which is, however, in the language of the input BA. ∎
6 Use after Simulation Quotienting
In this short section, we establish that our optimizations introduced in Sections 4 and 5 can be applied with no additional cost in the setting when BA complementation is preceded with simulation-based reduction of the input BA (which is usually helpful), i.e., when the simulation is already computed beforehand for another purpose. In particular, we show that simulation-based reduction preserves the simulation (when naturally extended to the quotient automaton). First, let us formally define the operation of quotienting.
Given an -simulation for , we use to denote the -similarity relation (i.e., the symmetric fragment) . Note that since is a preorder, it holds that is an equivalence. We use to denote the equivalence class of wrt . The quotient of a BA wrt is the automaton
[TABLE]
with the transition function and the set of initial and accepting states and respectively.
Proposition 2 ([7], [12])
If , then .
Remark 1 ([12])
Finally, the following lemma shows that quotienting preserves direct and delayed simulations, therefore, when complementing , it is possible to first quotient wrt a direct/delayed simulation and then use the same simulation (lifted to the states of the quotient automaton) to optimize the complementation.
Lemma 13 ()
Let be the -simulation on for . Then the relation defined as iff is the -simulation on .
Proof
First, we show that is well defined, i.e., if , then for all and , it holds that . Indeed, this holds because and , and therefore ; the transitivity of simulation yields .
Next, let be a strategy that gives . Consider a trace defined as over a word in . Then,
for there is a trace in s.t. and for . Therefore, if is accepting then so is ; 2. 2.
for there is a trace in s.t. , for and, moreover, if is accepting then there is for s.t. .
Further, let . Then there is a trace simulating in from . Further, consider its projection into . For all , we have that , and therefore also . Since , then also .
Finally, we show that is maximal. For the sake of contradiction, suppose that is -simulating for some s.t. . Consider a word and a trace over in . Then there is a trace over in . According to the assumption, there is also a trace such that is -simulating . But then there will also exist a trace such that for all and (see the previous part of the proof). Therefore, since is maximal, we have that , which is in contradiction with the assumption. ∎
7 Experimental Evaluation
We implemented our optimisations in a prototype tool222https://github.com/vhavlena/ba-complement written in Haskell and performed preliminary experimental evaluation on a set of 124 random BAs with a non-trivial language over a two-symbol alphabet generated using Tabakov and Vardi’s model [37]. The parameters of input automata were set to the following bounds: number of states: 6–7, transition density: 1.2–1.3, and acceptance density: 0.35–0.5. Before complementing, the BAs were quotiented wrt the direct simulation for experiments with Purgedi and the delayed simulation for experiments with Purgede and Purgedi+de. The timeout was set to 300 s.
We present the results for our strongest optimizations for outputs of the size up to 500 states in Fig. 1. As can be seen in Fig. 1(a), purging alone often significantly reduces the size of the output. The situation with saturation is, on the other hand, more complicated. In Fig. 1(b), we can see that in some cases, the saturation produces even smaller BAs than only purging, on the other hand, in some cases, larger BAs are produced. This is expected, because saturating the component of macrostates also means that more level rankings (the component) need to be considered.
For outputs of a larger size (we had 11 of them), the results follow a similar trend, but the probability that saturation will increase the size of the result decreases. For some concrete results, for one BA, the size of the output BA decreased from 4065 (Comp) to 985 (Purgedi+de) to 929 (Purgedi+de +Saturate), which yields a reduction to 24 %, resp. 22 %! Further, we observed that all Purgex methods usually give similar results, with the difference of only a few states (when Purgedi and Purgede differ, Purgedi usually wins over Purgede).
8 Related Work
BA complementation has a long research track. Known approaches can be roughly classified into Ramsey-based [34], determinization-based [32, 29], rank-based [33], slice-based [23, 39], learning-based [25], and the recently proposed subset-tuple construction [4]. Those approaches build on top of different concepts of capturing words accepted by a complement automaton. Some concepts can be translated into others, such as the slice-based approach, which can be translated to the rank-based approach [40]. Such a translation can help us get a deeper understanding of the BA complementation problem and the relationship between optimization techniques for different complementation algorithms.
Because of the high computational complexity of complementing a BA, and, consequently, also checking BA inclusion and universality (which use complementation as their component), there has been some effort to develop heuristics that help to reduce the number of explored states in practical cases. The most prominent ones are heuristics that leverage various notions of simulation relations, which often provide a good compromise between the overhead they impose and the achieved state space reduction. Direct [7, 36], delayed [12], fair [12], their variants for alternating Büchi automata [16], and multi-pebble simulations [13] are the best-studied relations of this kind. Some of the relations can be used quotienting, but also for pruning transitions entering simulation-smaller states (which may cause some parts of the BA to become inaccessible). A series of results in this direction was recently developed by Clemente and Mayr [10, 26, 27].
Not only can the relations be used for reducing the size of the input BA, they can also be used for under-approximating inclusion of languages of states. For instance, during a BA inclusion test , if every initial state of is simulated by an initial state of , the inclusion holds and no complementation needs to be performed. But simulations can also be used to reduce the explored state space within, e.g., the inclusion check itself, for instance in the context of Ramsey-based algorithms [1, 2]. Ramsey-based complementation algorithms [34] in the worst case produce states, which is a significant gap from the lower bound of Michel [28] and Yan [41]. The Ramsey-based construction was, however, later improved by Breuers et al. [5] to match the upper bound . The way simulations are applied in the Ramsey-based approach is fundamentally different from the current work, which is based on rank-based construction. Taking universality checking as an example, the algorithm checks if the language of the complement automaton is empty. They run the complementation algorithm and the emptiness check together, on the fly, and during the construction check if a macrostate with a larger language has been produced before; if yes, then they can stop the search from the language-smaller macrostate. Note that, in contrast to our approach, their algorithm does not produce the complement automaton.
9 Conclusion and Future Work
We developed two novel optimizations of the rank-based complementation algorithm for Büchi automata that are based on leveraging direct and delayed simulation relations to reduce the number of states of the complemented automaton. The optimizations are directly usable in rank-based BA inclusion and universality checking. We conjecture that the decision problem of checking BA language inclusion might also bring another opportunities for exploiting simulation, such as in a similar manner as in [3]. Another, orthogonal, directions of future work are (i) applying simulation in other than the rank-based approach (in addition to the particular use within [1, 2]), e.g., complementation based on Safra’s construction [32], which, according to our experience, often produces smaller complements than the rank-based procedure, (ii) applying our ideas within determinization constructions for BAs, and (iii) generalizing our techniques for richer simulations, such as the multi-pebble simulation [13] or various look-ahead simulations [26, 27]. Since the richer simulations are usually harder to compute, it would be interesting to find the sweet spot between the overhead of simulation computation and the achieved state space reduction.
Acknowledgement
We thank the anonymous reviewers for their helpful comments on how to improve the exposition in this paper. This work was supported by the Ministry of Science and Technology of Taiwan project 106-2221-E-001-009-MY3 the Czech Science Foundation project 19-24397S, the FIT BUT internal project FIT-S-17-4014, and The Ministry of Education, Youth and Sports from the National Programme of Sustainability (NPU II) project IT4Innovations excellence in science—LQ1602.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] Abdulla, P.A., Chen, Y., Clemente, L., Holík, L., Hong, C.D., Mayr, R., Vojnar, T.: Simulation Subsumption in Ramsey-based Büchi Automata Universality and Inclusion Testing. In: Proc. of CAV’10. pp. 132–147. Springer (2010)
- 2[2] Abdulla, P.A., Chen, Y., Clemente, L., Holík, L., Hong, C., Mayr, R., Vojnar, T.: Advanced Ramsey-based Büchi Automata Inclusion Testing. In: Proc. of CONCUR’11. pp. 187–202. Springer (2011)
- 3[3] Abdulla, P.A., Chen, Y., Holík, L., Mayr, R., Vojnar, T.: When Simulation Meets Antichains. In: Proc. of TACAS’10. pp. 158–174. Springer (2010)
- 4[4] Allred, J.D., Ultes-Nitsche, U.: A Simple and Optimal Complementation Algorithm for Büchi Automata. In: Proc. of the 33rd Annual ACM/IEEE Symposium on Logic in Computer Science. pp. 46–55. ACM (2018)
- 5[5] Breuers, S., Löding, C., Olschewski, J.: Improved Ramsey-Based Büchi Complementation. In: Proc. of FOSSACS’12. pp. 150–164. Springer (2012)
- 6[6] Büchi, J.R.: On a Decision Method in Restricted Second Order Arithmetic. In: Proc. of International Congress on Logic, Method, and Philosophy of Science 1960. Stanford Univ. Press, Stanford (1962)
- 7[7] Bustan, D., Grumberg, O.: Simulation-based Minimization. ACM Transactions on Computational Logic 4 (2), 181–206 (2003)
- 8[8] Cécé, G.: Foundation for a Series of Efficient Simulation Algorithms. In: Proc. of LICS’17. pp. 1–12 (2017)
