Deciding Fast Termination for Probabilistic VASS with Nondeterminism
Tom\'a\v{s} Br\'azdil, Krishnendu Chatterjee, Anton\'in Ku\v{c}era,, Petr Novotn\'y, Dominik Velan

TL;DR
This paper investigates the problem of determining whether probabilistic vector addition systems with states (pVASS) with nondeterminism have linear expected termination time, providing polynomial-time decidability results and a quadratic lower bound dichotomy.
Contribution
The paper introduces techniques for checking fast termination in pVASS with nondeterminism and establishes a polynomial-time decision procedure for linear expected termination time.
Findings
Decidability of linear expected termination time in certain pVASS classes.
A polynomial-time algorithm for checking fast termination.
A quadratic lower bound for non-linear expected termination times.
Abstract
A probabilistic vector addition system with states (pVASS) is a finite state Markov process augmented with non-negative integer counters that can be incremented or decremented during each state transition, blocking any behaviour that would cause a counter to decrease below zero. The pVASS can be used as abstractions of probabilistic programs with many decidable properties. The use of pVASS as abstractions requires the presence of nondeterminism in the model. In this paper, we develop techniques for checking fast termination of pVASS with nondeterminism. That is, for every initial configuration of size n, we consider the worst expected number of transitions needed to reach a configuration with some counter negative (the expected termination time). We show that the problem whether the asymptotic expected termination time is linear is decidable in polynomial time for a certain natural…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
\usetkzobj
all
11institutetext: Faculty of Informatics, Masaryk University
11email: {xbrazdil,tony,petr.novotny,xvelan1}@fi.muni.cz22institutetext: IST Austria
22email: [email protected]
Deciding Fast Termination for Probabilistic VASS with Nondeterminism††thanks: Tomáš Brázdil and Antonín Kučera are supported by the Czech Science Foundation Grant No. 18-11193S. Krishnendu Chatterjee is supported by the Austrian Science Fund (FWF) NFN Grants S11407-N23 (RiSE/SHiNE). Petr Novotný and Dominik Velan are supported by the Czech Science Foundation Grant No. GJ19-15134Y.
Tomáš Brázdil 11
Krishnendu Chatterjee 22
Antonín Kučera 11
Petr Novotný 11
Dominik Velan 11
Abstract
A probabilistic vector addition system with states (pVASS) is a finite state Markov process augmented with non-negative integer counters that can be incremented or decremented during each state transition, blocking any behaviour that would cause a counter to decrease below zero. The pVASS can be used as abstractions of probabilistic programs with many decidable properties. The use of pVASS as abstractions requires the presence of nondeterminism in the model. In this paper, we develop techniques for checking fast termination of pVASS with nondeterminism. That is, for every initial configuration of size n, we consider the worst expected number of transitions needed to reach a configuration with some counter negative (the expected termination time). We show that the problem whether the asymptotic expected termination time is linear is decidable in polynomial time for a certain natural class of pVASS with nondeterminism. Furthermore, we show the following dichotomy: if the asymptotic expected termination time is not linear, then it is at least quadratic, i.e., in .
Keywords:
angelic and demonic nondeterminism termination time probabilistic VASS
1 Introduction
Probabilistic Programs & VASS Probabilistic systems play an important role in various areas of computing such as machine learning [26], network protocol design [25], robotics [45], privacy and security [5], and many others. For this reason, verification of probabilistic systems receives a considerable attention of the verification community. As in the classical (non-probabilistic) setting, in probabilistic verification one typically constructs a suitable abstract model over-approximating the real behaviour of the system. In the past, the verification research was focused mostly on finite-state probabilistic models [4] as well as some special infinite-state classes, such as probabilistic one-counter [11] or pushdown automata [21, 24]. However, the recent proliferation of general, Turing-complete probabilistic programming languages (PPLs) necessitates the use of more complex models, that can encompass multiple potentially unbounded numerical variables.
In the classical setting, one of the standard formalisms used for program abstraction are vector addition systems with states (VASS) [32]. Intuitively, a VASS is a finite directed graph where every edge is assigned a vector of integer counter updates of a fixed dimension . A configuration is specified by a current state and a vector of current counter values . The computation proceeds by moving along the edges in the graph and performing the respective updates on the counters. Since VASS themselves are not Turing-complete, they have many decidable properties, and they have been successfully used as program abstractions in termination and complexity analysis [44] as well as for reasoning about parallel programs [23, 32] and parameterized systems [6, 2]. Applying such an abstraction to a probabilistic program yields a probabilistic VASS (pVASS), which allows for a probabilistic choice of a transition in some states. Moreover, during the abstraction, certain complex programming constructs such as if-then-else branching are replaced with nondeterministic choice. To ensure that the abstraction over-approximates the possible behaviour, we typically interpret the nondeterminism as demonic, i.e., the choice is resolved by adversarial environment. However, in certain settings it makes sense to consider angelic nondeterminism, to be resolved by a yet-to-be-designed controller (e.g., a scheduling mechanism in a queuing system).
Termination Complexity One of the fundamental problems in program analysis is to evaluate a given program’s runtime. In the classical setting, this problem emerges in various flavours, ranging from worst-case execution time-analysis [47, 13] in real-time systems to obtaining bounds on the number of execution steps [27], analysing asymptotic [16], or amortized complexity [28]. VASS-based abstractions were successfully used in the latter scenario [44].
Recently, several approaches to reason about the expected runtime of probabilistic programs were developed [31, 40]. The analysis is much more demanding than in the classical case. For instance, deciding whether the expected runtime is finite is harder (i.e. higher in the arithmetic hierarchy) than deciding whether a probabilistic program terminates with probability one [30]. Additional obstacle is the inherent non-compositionality of expected runtimes. The work [31] gives an example of two programs, , and , which both consist of a single loop (i.e. they have a strongly connected control flow graph) and whose expected runtime is linear in the magnitude of initial variable valuations; but running after yields the program whose expected runtime is infinite.
These intricacies spawn fundamental questions about probabilistic models, which we aim to address: Is there a sufficiently powerful probabilistic formalism where a fast (i.e., linear-time) termination from an arbitrary initial configuration is decidable? Can the decision procedure proceed by analysing individual strongly-connected components and composing the results? Can we provide a lower bound on the expected runtime in the case that it is not linear? These questions were previously considered in the non-probabilistic setting, namely in the domain of VASS [10]. In this paper, we investigate them in the probabilistic context.
Our Setting We show that the above questions can be answered affirmatively in the domain in pVASS with nondeterminism, which are Markov decision processes over VASS where the nondeterministic choice is resolved either demonically (i.e. the nondeterminism tries to prolong the computation) or angelically. We consider a basic variant of VASS termination: the zero termination, where the computation stops when some counter becomes negative. The termination complexity of a given pVASS is a function assigning to every the maximal/minimal (in the demonic/angelic case) expected length of a computation initiated in a configuration of size (the size of is defined as the maximal component of ), where the maximum/minimum is taken over all the strategies of the environment (we consider unrestricted, i.e., history-dependent and randomized, strategies).
Our Results For strongly connected pVASS which contain either a demonic or an angelic non-determinism (but not both) we show that
The problem whether is decidable in polynomial time. 2. 2.
If , then . 3. 3.
If , then for every , the probability of all computations of length at least converges to one as , (in the demonic case, this requires the environment to use appropriate strategies).
According to 2., implies that is “at least quadratic”. However, 3. does not follow from 2. (a more detailed discussion is postponed to Section 3).
We also show that the above results hold in general VASS with angelic nondeterminism, while in the demonic setting they extend to a restricted class pVASS whose maximal end-component (MEC) decomposition yields a directed acyclic graph (DAG), in which case 1. can be solved compositionally by analysing individual MECs. Finally, we show that in pVASS whose MEC-decomposition is not DAG-like, the demonic complexity cannot be decided by analysis of individual MECs, since such VASS can emulate the non-compositional example of [31].
The results build on analogous results for non-probabilistic VASS established in [10], combining them with a novel probabilistic analysis.
Paper Organization. After presenting preliminaries in Section 2, we focus on the demonic case which contains the main technical contributions. Subsection 3.1 provides an intuitive outline of our techniques. Subsection 3.2 develops the algorithm for proving linear termination complexity and shows its soundness (i.e. that a yes-answer indeed proves ). Subsection 3.3 deals with the quadratic lower bound, showing the completeness of our algorithm, and Subsection 3.4 discusses extension of the results to the angelic case. Finally, in Section 4 we extend the techniques to DAG-like VASS MDPs and discuss the difficulties arising in general VASS. Missing proofs are provided in the appendix.
Related Work. The termination problems (counter-termination, control-state termination) for classical VASS as well as the related problems of boundedness and coverability have been studied very intensively in the last decades, see, e.g., [38, 42, 20, 22, 7]. The complexity of the termination problem with fixed initial configuration is EXPSPACE complete [38, 48, 3]. The more general reachability problem is also decidable [39, 35, 33], but computationally hard [38, 19]. The best known upper bound is Ackermannian [37] (see [43] for an overview of hyper-Ackermannian complexity hierarchies).
The problem of existence of infinite computations in VASS has been also studied in the literature. Polynomial-time algorithms have been presented in [14, 46] using results of [34]. In the more general context of games played on VASS, even deciding the existence of infinite computation is coNP-complete [14, 46], and various algorithmic approaches based on hyperplane-separation technique have been studied in [15, 29, 18].
The study on asymptotic termination complexity of non-probabilistic VASS, initiated in [10] was continued in [36], where the existence of some such that was also shown decidable in polynomial time.
Concerning expected runtime analysis, we note the work [17] which presents a sound (but incomplete) technique for obtaining near-linear asymptotic bounds on recurrence relations arising from certain types of probabilistic programs.
2 Preliminaries
We use , , , and to denote the sets of non-negative integers, integers, rational numbers, and real numbers. Given a function , we use and to denote the sets of all such that and for all sufficiently large , where are some positive constants. If and , we write .
Let be a finite index set. The vectors of are denoted by bold letters such as . The component of of index is denoted by . If the index set is of the form for some positive integer , we write instead of . For every , we use to denote the constant vector where all components are equal to . The scalar product of is denoted by , i.e., . The other standard operations and relations on such as , , or are extended to in the component-wise way. In particular, if for every index .
2.1 Markov Decision Processes
Definition 1
Let be a set of labels. A Markov decision process (MDP) with -labeled transitions is a tuple , where is a finite set of states split into two disjoint subsets and of nondeterministic and probabilistic states, is a finite set of labeled transitions such that every has at least one outgoing transition, and is a function assigning to each where a positive rational probability so that, for every , .
A state is an immediate successor of a state if there is a transition for some . A finite path in of length is a finite sequence of the form where and for all . If and , then is a cycle. An MDP is strongly connected if for each pair of distinct states there is a finite path from to . An infinite path in is an infinite sequence such that is a finite path for every . For a finite sequence of the form and a finite or infinite sequence of the form , where and are not necessarily paths in , we use to denote the concatenated sequence (we do not require ). If are both paths in and , then is also a path in .
A strategy is a function assigning to every finite path ending in a nondeterministic state a probability distribution over the outgoing transitions of . A strategy is Markovian (M) if it depends only on the last state , and deterministic (D) if it always selects some successor state with probability one. The set of all strategies is denoted by (the underlying is always clearly determined by the context). Every initial state and every strategy determine the probability space over infinite paths initiated in in the standard way, and we use to denote the associated probability measure.
2.2 Probabilistic VASS with Nondeterminism
Definition 2
Let . A -dimensional probabilistic VASS with non-determinism (VASS MDP) is an MDP where the set of labels is .
Let be a -dimensional VASS MDP. The encoding size of is denoted by , where the integers representing counter updates are written in binary. A configuration of is a pair , where and . If some component of is negative, then is terminal. The set of all configurations of is denoted by . The size of is . Given , we say that is -bounded if .
Every (finite or infinite) path and every initial vector determine the corresponding computation of , i.e., the sequence of configurations such that and . For every infinite computation , let be the least such that is terminal. If there is no such , we put .
Recall that every strategy and every determine a probability space over infinite paths initiated in with probability measure . Similarly, determines the unique probability space over all computations initiated in a given , and we use to denote the associated probability measure, and denotes the expected value of .
The angelic/demonic termination complexity of are the functions defined as follows, where is the set of all such that :
[TABLE]
We say that the expected angelic/demonic termination time of is linear if and , respectively.
3 Linearity of Demonic Termination Time
In this paper, we prove the following theorem:
Theorem 3.1
The problem whether the expected termination time of a given strongly connected VASS MDP is linear is decidable in polynomial time. If the expected termination time of is not linear, then . Furthermore, for every we have that
[TABLE]
The last part of Theorem 3.1 deserves some comments. Recall if there is such that for all sufficiently large . We prove by showing the existence of such that for all sufficiently large , where is a suitable strategy depending on (then, we can put ). Hence, does not imply that converges to as (for some constant ). The last part of Theorem 3.1 shows that for an arbitrarily small , we have that does converge to as . The question whether the convergence holds for remains open.
3.1 Outline of Techniques
The proof of Theorem 3.1 is non-trivial, and it is based on combining the existing techniques with new analysis invented in this paper. We use VASS MDPs and of Fig. 1 as running examples to illustrate our techniques.
A polynomial-time algorithm deciding asymptotic linearity of termination time for purely non-deterministic VASS (where the set is empty) was given in [10]. Theorem 3.1 generalizes this result to VASS MDPs. We start by recalling the results of [10] and sketching the main ideas behind the proof of Theorem 3.1. These ideas are then elaborated in subsequent sections.
Consider a purely non-deterministic VASS of dimension . A cycle of is simple if all are pairwise different. The total effect of a simple cycle, i.e., the sum , is called an increment. Clearly, there are only finitely many increments . In [10], it was shown that the termination time of is linear iff all increments are contained in an open half-space whose normal is strictly positive in every component. The “if” direction is immediate, relying on a straightforward “ranking” argument. The “only if” part is more elaborate. In [10], it was shown that if the increments are not contained in an open half-space with positive normal, then for all sufficiently large , there is a non-terminating computation initiated in whose length is at least for some constant . This computation consists of simple cycles and auxiliary short paths used to “switch” from one control state to another.
Now let be a VASS MDP with counters. Here, instead of simple cycles and their increments, we use the vectors of expected counter changes per transition induced by MD strategies in their BSCCs. More precisely, for each of the finitely many MD strategies and every BSCC of the finite-state Markov chain obtained by “applying” to , we consider the unique vector of expected counter changes per transition (note that is the same for almost all infinite computations initiated in a state of ). Thus, we obtain a finite set of vectors together with the associated set of tuples where each is an MD strategy and is a BSCC of (note that we can have for since MD strategies might have multiple BSCCs). Similarly as in [10], we check whether all are contained in an open half-space whose normal is strictly positive in every component. This is achievable in polynomial time by using the results of [8]. If such a exists, we can conclude . This is because the “extremal” vectors of expected counter changes per transition are obtained by MD strategies111Here we rely on well-known results about finite-state MDPs [41]., and hence the expected shift in the direction opposite to per transition stays bounded away from zero even for general strategies. We than use a submartingale-based argument to show that the expected termination time is linear. This proves the first part of Theorem 3.1.
Example 1
For the VASS MDP of Fig. 1, there are three different increments , , and . Hence, we can choose as a positive normal satisfying , , and . For the VASS MDP of Fig. 1, there are three different increments , , and , hence no positive normal satisfying and exists.
Now suppose there is no such . Recall that for purely non-deterministic VASS, a sufficiently long non-terminating computation initiated in consisting of simple cycles and short “switching” paths was constructed in [10]. Since are no longer effects of simple cycles or any fixed finite executions, it is not immediately clear how to proceed and we need to use new techniques. The arguments of [10] used to construct a sufficiently long non-terminating computation are purely geometric, and they do not depend on the fact that increments are total effects of simple cycles. Hence, by using the same construction, we obtain a sufficiently long sequence of vectors consisting of and some auxiliary elements representing switches between control states. We call this sequence a scheme, because it does not correspond to any real computation of in general. When the constructed scheme is initiated in , the resulting trajectory never crosses any axis. Also note that for every fixed , we can create an extra space between the trajectory and the axes by shifting the initial point from to , which does not influence our asymptotic bounds. Now, we analyze what happens if the constructed scheme is followed from . Here, following a vector means to execute the transition selected by , and following a “switch” from to means to execute a strategy which eventually reaches with probability one (we use a strategy minimizing the expected number steps needed to reach ). Using concentration bounds of martingale theory, we show that the probability of all executions deviating from the scheme by more than is bounded by for some fixed (assuming is sufficiently large), which yields the lower bound of Theorem 3.1. The last part of Theorem 3.1 is proven by a more detailed analysis of the established bounds.
Let us note that the underlying martingale analysis is not immediate, since the previous work which provides the basis for this analysis (such as [11]) typically assume that the analysed strategies are memoryless in the underlying finite state space. In contrast, strategies arising of schemes are composed of multiple memoryless strategies, with the switching rules depending on the size of the initial configuration. Hence, we take a compositional approach, analysing each constituent strategy separately using known techniques and composing the results via a new approach.
3.2 The Algorithm
In this section we prove the first part of Theorem 3.1. Our analysis uses results on multi-mean-payoff MDPs. Recall that if is an MDP with transitions labelled by elements of (for some dimension ), then a mean-payoff of an infinite path of is . We say that a given vector is achievable for if there exist a strategy and such that . Now we recall some results on mean-payoff MDPs [8] used as tools in this section.
(a)
There is a finite set of vectors such that the set of all achievable vectors is precisely the set of all such that for some , where is the convex hull of .
(b)
The problem whether a given rational is achievable is decidable in polynomial time.
Furthermore, we need the following result about finite-state MDPs.
Lemma 1
Let be a strongly connected MDP with labels from such that
[TABLE]
Let be a function assigning to every infinite path of the least such that . If there is no such , then . Then there exists a constant depending only on such that for every and we have that .
Now we show how to prove the first part of Theorem 3.1 using the results above. Let be a strongly connected VASS MDP. For each of the finitely many MD strategies we can consider a finite-state Markov chain obtained from by fixing in every the probability of transitioning to the unique successor specified by to 1. For each such and each its BSCC we consider the unique vector defined by
[TABLE]
where is the invariant (stationary) distribution over the states of (note that for every ). Thus, we obtain a finite set of increments together with the associated MD strategies and the BSCCs .
Lemma 2
If there exists a vector such that for every , then there exists such that for every and .
Proof
Let . Consider a -labelled MDP obtained from by replacing each counter update vector with the number . Note that every strategy for can be seen as a strategy for , and vice versa. For a given , we write and to denote the expected value of in and , respectively. Note that for every we have that .
For every , there is an optimal MD strategy maximizing the expected mean payoff in . Since is a convex combination of increments, we obtain . Now let be an arbitrary strategy. Since , we obtain . ∎
A direct corollary to Lemma 1 and Lemma 2 is the following:
Lemma 3
If there exists a vector such that for every , then holds for .
The next lemma leads to a sound algorithm for proving of linear termination complexity.
Lemma 4
The vector is achievable for iff there is no such that for every .
Proof
If is achievable, there exist and such that . Suppose there is such that for every . By Lemma 2, , which is a contradiction.
Now suppose is not achievable. Consider the (convex and compact) set of claim (a). Since is not achievable, the set has the empty intersection with the (convex) set of all vectors with non-negative components. By the hyperplane separation theorem, there exists a hyperplane with normal such that for all . Since every increment is achievable, there is such that . Hence, . ∎
Hence, to check linear termination complexity, our algorithm simply checks whether is achievable for . The previous lemma shows that this approach is sound. In the next subsection, we show that if there is no such that for every , then the expected termination time of is at least quadratic. This shows that our algorithm is also complete, i.e. a decision procedure for linear termination of strongly connected demonic VASS MDPs.
3.3 Quadratic Lower Bound
For the rest of this section, we fix a strongly connected VASS MDP . Let be the increments, and and the associated MD strategies and BSCCs introduced in Section 3.2.
Suppose that there does not exist a normal vector such that for every . By [10, Lemma 3.2]222Technically, Lemma 3.2 in [10] assumes for every . Here, . We can multiply all increments of by the least common multiple of all denominators and apply Lemma 3.2 afterwards., there exist a subset of increments and positive integer coefficients such that . We use this subset to construct a so-called scheme.
Scheme The definition of a scheme is parameterized by a certain function . This function is defined later, for now it suffices to know that . For every , we define the scheme for , which is a concatenation of identical -cycles, where each -cycle is defined as follows:
[TABLE]
The subsequence of the -th cycle is called the -th segment of the -th -cycle. Since the length of each -cycle is , the length of the scheme for is .
Example 2
Recall the VASS MDP of Fig. 1. Here, we put , , and . So, the cycle for is
[TABLE]
Note that the scheme does not necessarily correspond to any finite path in , even if the switches are disregarded. However, the scheme for determines a unique strategy for defined below.
From Schemes to Strategies. For every , we fix an MD strategy such that for every , the probability of visiting from is equal to one. Furthermore, we fix some state for every .
For all finite paths that are not initiated in , the strategy is defined arbitrarily. Otherwise, starts by simulating the strategy for precisely steps. Then, remembers the state in which the simulation of ended, and changes to simulating until the state of is reached. After reaching , the strategy simulates for precisely steps. Then, it again remembers the final state and starts to simulate until is reached, and so on, until the simulation of corresponding to the -th segment of the first -cycle is completed. Then, starts to simulate the switch of the first -cycle, i.e., the strategy . This completes the simulation of the first -cycle. In general, the -th -cycle (for ) is simulated in the same way, the only difference is that every switch is simulated by where is the state entered when terminating the simulation of in the -th -cycle. This goes on until all -cycles of the scheme are simulated. After that, behaves arbitrarily.
Lower Bound. We now show that the family of strategies witnesses the quadratic complexity. First we define . From standard results on MDPs [41] we know that for every , the expected number of steps we keep playing before hitting is finite and dependent only on . Hence, there exists a constant depending only on such that also the expected change of every counter incurred while simulating is bounded by . Now let , i.e., is the minimal counter update over all transitions, and let
[TABLE]
The function has been chosen so that, for all sufficiently large , if the scheme for is “executed” from the point , i.e., if we follow the vectors of the scheme, where each switch is replaced with the vector , then the resulting trajectory never crosses any axis (recall that ).
Example 3
A trajectory for the scheme of Example 2 is shown in Fig. 3. Here, , because performing every switch takes just one transition with expected change of the counters equal to .
Definition 3
Let be a finite alternating sequence of states and vectors of (not necessarily a finite path in ), and . We say that is -safe if, for every , we have that . Furthermore, we say that an infinite sequence is safe-until if its prefix is -safe.
Now consider an infinite path in initiated in . Then almost all such ’s (w.r.t. the probability measure ) can be split into a concatenation of sub-paths
[TABLE]
where is a path with precisely transitions (resulting from simulation of ), is a switching path performing the switch of the -th cycle, and is the remaining infinite suffix of . Note that for every , the paths can be concatenated and form a single path in of length . This follows from the way of scheduling the switching strategies in . Writing (where is the suffix of defined above), we denote by the length of . Note that for almost all .
We now focus on proving the following lemma:
Lemma 5
For every there exist such that for all , the probability of all infinite paths initiated in that are -safe until is at least . Moreover, the is independent of .
The lemma guarantees that if the strategy is executed in a configuration , where , then . This implies . Hence, it remains to prove the lemma.
Proof of Lemma 5. We separately bound the probabilities of “large counter deviations” while simulating the ’s and the switching strategies. To this end, for every let be the finite path of length obtained by concatenating all . Furthermore, let the sequence obtained from by replacing every with . Intuitively, is where the transition effects are “compensated” by subtracting the expected change in the counter values per transition. We prove the following:
Lemma 6
For every , there exist such that for all it holds \mathcal{P}_{p_{1}}^{\eta_{n}}(\{\pi\mid\mathit{Ipath}^{i}(\pi)\text{ is c\cdot n-safe}\})\geq 1-\delta. Moreover, the does not depend on .
In the proof of Lemma 6, we use the martingale defined for stochastic one-counter automata in [11]. Intuitively, if is safe, then it must be safe in every counter. Hence, we can consider each counter one by one, abstract the other counters, and estimate the probability of being safe in each of these one-counter automata.
Similarly, we need to estimate the probability of deviating from the trajectory by performing the switches. Let be the concatenation of all where and preserving their order. We prove the following:
Lemma 7
For every , there exist such that for all it holds \mathcal{P}_{p_{1}}^{\eta_{n}}(\{\pi\mid\mathit{Spath}^{i}(\pi)\text{ is c\cdot n-safe}\})\geq 1-\delta. Moreover, the does not depend on .
Clearly, if is -safe for all and is -safe, then is -safe until . Hence, Lemma 5 is a simple consequence of Lemma 6 and Lemma 7.
Probability of Quadratic Behaviour. Now we indicate how to prove the last part of Theorem 3.1. Directly from Lemma 5, we have that . However, observe that if is not a fixed constant, we cannot say that the size of the initial configuration is linear in . Taking for a suitable , we may rewrite the limit in the following way: . It can be shown that , for every sufficiently large , thus obtaining the last part of the Theorem 3.1.
3.4 Linearity of Angelic Termination Time
For angelic nondeterminism, we have a similar result as in the demonic one.
Theorem 3.2
The problem whether the expected angelic termination time of a given strongly connected VASS MDP is linear is decidable in polynomial time. If the expected angelic termination time of is not linear, then . Furthermore, for every we have that
[TABLE]
Proof (Sketch)
We analyse each counter, i.e., we consider one-dimensional VASS MDPs obtained by projecting the labelling function of .
If it is possible to terminate in one of these one-dimensional VASS MDPs in expected linear time, then the corresponding strategy achieves linear termination also in . On the other hand, if this is not possible, then every one-counter has infinite angelic termination complexity. This does not mean that the has infinite angelic termination complexity. However, we show that there exists a constant such that for sufficiently large initial configuration, the probability of runs terminating before transitions is sufficiently small for every one-counter. By union bound, the probability of runs terminating before in is for some . Thus, . The last part of the theorem is proved similarly to the demonic case. ∎
4 General VASS MDPs & Conclusion
We now drop the assumption that the VASS is strongly connected. Recall that an end-component in an MDP is a set of states that is closed (i.e., for at least one outgoing transition goes to , while for all the outgoing transitions must end in ) and strongly connected. A maximal end component (MEC) is an EC which is not contained in any larger EC. A decomposition of an MDP into MECs can be computed in polynomial time by standard algorithms [1], and each MEC of a VASS MDP induces a strongly connected VASS sub-MDP which can be analyzed as shown in previous sections. We can construct a graph whose vertices correspond to MECs of an MDP and there is an edge from to some other if and only if is reachable from . If the only cycles in this graph are self-loops, we say that the original MDP is DAG-like. MECs corresponding to “leafs” of the graph (i.e. MECs that cannot be exited) are called bottom MECs.
Theorem 4.1
Theorem 3.1 holds also for DAG-like VASS MDPs, while Theorem 3.2 holds for all VASS MDPs. In particular, a DAG-like VASS MDP has if and only if each MEC of induces a (strongly connected) VASS MDP in which ; and has iff each bottom MEC of has . Otherwise, the termination complexity of is in .
Proof (Sketch)
We sketch the proof for the demonic case where there are no self-loops in the MEC graph. Then no MEC can be re-entered once left. Moreover, there is a constant s.t. whenever we enter a MEC with a counter valuation , the expected time to either terminate or exit the MEC, as well as the expected size of the counter valuation at the time of termination/exiting are bounded by . Hence, a straightforward induction on the number of MECs shows that the expected maximal counter value as well as the expected termination time are bounded by from any initial configuration of size . Since does not depend on , we get the result.
For non-DAG-like VASS MDPs, the situation gets much more complicated. Consider the MDP in Figure 3. There are three MECs, each a singleton (, , ). Clearly all these three MECs have a linear termination complexity. Now consider the following demonic strategy starting in configuration : select the loop until we get the configuration ; then transition to and play its loop until we get into ; then transition to and if the randomness takes us back to , play the loop again until we get , etc. ad infinitum. Clearly, the strategy eventually ends up in where it terminates. However, the expected termination time is at least
Hence, proving the linear termination complexity in general VASS does not reduce to analysing individual MECs. Moreover, it crucially depends on the concrete probabilities in transient (non-MEC) states: in Figure 3, the termination time would be finite (and linear) if the transition from to had probability . The transient behaviour of MDPs can be of course rather complex and it is not even clear whether the linear demonic termination complexity is even decidable for VASS MDPs with general structure. We see this as a very intriguing, yet complex, direction for future work.
Appendix 0.A Proofs
We start by recalling basic notions of martingale theory. A stochastic process is a martingale if the following holds for all :
- •
,
- •
.
By weakening the second condition into , we obtain a submartingale.
If is a (sub)martingale such that almost surely for all , then the Azuma-Hoeffding inequality says that, for every ,
- •
if is a martingale,
- •
if is a submartingale.
0.A.1 A proof of Lemma 1
We start by recalling the results of [41]. Let be a strongly connected -labeled MDP. Consider the following linear program:
[TABLE]
This linear program is feasible, and the minimal value of is equal to
[TABLE]
Let be the components of an optimal solution, and let be some fixed initial state. For every , let and be functions assigning to every infinite path initiated in the state and the sum , respectively (we put ). Let be an arbitrary strategy. Almost identical computation as in [9, 12] gives that the stochastic process , where , is a supermartingale (over the probability space determined by ).
Our aim is to show that there exists and depending only on such that for an arbitrary strategy , every , and all we have that . From this we immediately obtain
[TABLE]
for some constant depending only on .
Now consider the supermartingale of the first paragraph applied to infinite paths in (under the strategy ). Let be an infinite path such that . Then for some fixed depending only on , because the rewards are bounded. Furthermore, the maximal difference between and for is also bounded by some constant depending only on . Hence, for some constant depending only on (recall that is negative). For every where , we obtain where . Thus, we obtain
[TABLE]
Since the supermartingale can change in one step at most by a constant (depending only on ), applying Azuma’s inequality yields
[TABLE]
where , for all .
0.A.2 A proof of Lemma 6
We assume a fixed . Recall the strategy and the BSCC associated to the increment (see Section 3.2). For every path initiated in , let , and for every , consider the sequence obtained by projecting every to its -th component, i.e., .
Our aim is to show that, for every and every , there exist such that the probability of all initiated in such that is safe is at least . Observe that Lemma 6 is a direct consequence of this claim.
Observe that , where the nondeterministic choice is resolved by , and the counter update vectors are projected to their th component, can be seen as a one-counter automaton. The long-run average change of the counter per transition in this automaton is . Recall that was obtained from the concatenated paths by subtracting the increment from each vector occurring in this sequence. Hence, we need to subtract from every counter update on every transition of . Thus, we obtain a one-counter automaton . A trivial but crucial observation is that the long-run average change of the counter per transition in is zero.
The probability of all initiated in such that is not safe is equal to the probability that a run of initiated in , where is the starting state of , decreases the counter to or below during the first transitions. An upper bound on the latter probability can be established using the martingale for probabilistic one-counter automata introduced in [11] (here we use a slightly modified version of this martingale which better suits our purposes). Due to [11], for every state of and every , there exists a vector such that the stochastic process defined by
[TABLE]
is a martingale, where , is a random variable returning the accumulated counter change after steps, and is a random variable returning the control state entered after steps. Moreover, the vector satisfies , where is the minimum probability used in the transitions of .
Note that if the accumulated counter change drops to or below for the first time after exactly transitions, the martingale does not change its value from this point on, and remains equal to . Let . If then the value is at least for every . Hence, the probability that a run of initiated in decreases the counter to or below during the first transitions is less or equal to . Furthermore, for all we have that
[TABLE]
by applying Azuma’s inequality, where is a suitable constant dependent only on .
The above holds for every . Hence, the probability of all probability of all initiated in such that is not safe is less or equal to for every sufficiently large . To achieve , we can put .
0.A.3 A proof of Lemma 7
First, we bound the expected number of transitions used in executing one switch. Let be the minimum probability appearing in the VASS MDP, and let . There is a path of length at most which we may follow with probability at least . If successful, we are done, otherwise we end up in some state and again there is some path from to of length at most and the probability of traversing this path is still at least .
If we use the number as the (lower bound on the) probability of success, the random variable counting the number of attempts until the first success has a geometric distribution. Its expected value is then . Since every attempt uses at most transitions, the expected number of used transitions is bounded from above by .
Let be the random variable equal to the length of . Since the number of switches is , the expected value of is bounded from above by . Let . Surely . By Markov inequality, we obtain that . Therefore, with probability at most , we use more than transitions.
The minimal update over all transitions is . If , then since no transition can decrease the counters and the set of paths such that is 0 safe has probability one. For , the set of paths such that is safe has probability at least . Since , taking completes the proof.
0.A.4 A proof of the last part of Theorem 3.1
We show that for every we can choose such that
[TABLE]
From Lemma 5, we have that . Rewriting the limit, we obtain
[TABLE]
Let . Then by the definition of (for sufficiently large). We need to show that for some . Surely
[TABLE]
For sufficiently large, we have . Therefore,
[TABLE]
Let be such that , therefore . Multiplying by we get . Therefore, . This completes the proof.
0.A.5 A proof of Theorem 3.2
The proof is very similar to the one of Lemma 1. Again, we recall the results of [12] for one-counter machines. We consider the following linear program:
[TABLE]
This linear program is feasible, and the maximal value of is equal to
[TABLE]
Moreover, we can assume that for all we have . Direct corollary of [12, Proposition 5,(B)] is the following Lemma:
Lemma 8
Let be a solution of the linear program above. If then .
Moreover, we use (similarly to the proof of Lemma 1) that the stochastic process where
[TABLE]
is a submartingale.
Now we use these results on one-dimensional VASS-MDPs to obtain the proof for -dimensional VASS-MDP by simply considering projections on one counter.
A trivial observation gives the following result.
Lemma 9
Let be a -dimensional VASS-MDP, corresponding one-dimensional VASS-MDPs obtained by projecting the labels onto respective coordinate. If at least one of the one-dimensional VASS-MDPs has linear angelic termination time, then has also linear angelic termination time (using the same strategy).
In order to obtain the result for at least quadratic termination, we use the Azuma inequality for all the submartingales obtained from the one-dimensional VASS-MDPs.
Let be the submartingale for , obtained from the corresponding linear program (and assuming all values are non-negative).
Given an initial configuration , the probability that the -th counter decreases below zero in steps can be bounded from above:
[TABLE]
where and are constants independent of .
Let and , then:
[TABLE]
Observe that there exists a suitable constant such that
[TABLE]
since and are constants (depending only on ). Therefore, taking , we obtain that the probability of some counter decreasing below zero is for some , and thus .
Let and , then
[TABLE]
This completes the proof of Theorem 3.2.
0.A.6 A proof of Theorem 4.1
We can divide the set of states of into the states belonging to some MEC and transient states. We rely on the two following facts:
For every strategy, the expected number of transitions from a transient state to some MEC state can be bounded by a constant (a number dependent only on and not the size of the initial configuration). 2. 2.
The asymptotic complexity of a MEC does not depend on the initial state.
First, we consider the demonic case. In DAG-like VASS-MDP, the only loops in the MEC decomposition are self-loops, i.e., once we leave MEC and visit a different MEC , we may never return to . Moreover, there is a probability such that for every MEC and every strategy, we revisit after leaving it (i.e., execute the self loop on ) with probability at most .
For the “if” direction, we assume that the initial state is in a MEC . We compute the upper bound on the expected number of transitions before terminating or arriving into another MEC. Let be the states of every MEC different from . We know that if MEC is linear then there exists such that all increments in satisfy . Let us consider a -labeled MDP obtained from by replacing each label by .
Now we construct a supermartingale similar to the one in the proof of Lemma 1. Again, for every , let and be functions assigning to every infinite path in initiated in the state , and the sum (where is the initial counter vector in ). Furthermore, let and be functions counting for every infinite path the number of transitions in the MEC and in the transient states before entering a MEC different from (a transition is in if both are in ).
Let be any initial state and arbitrary strategy. Then the following sequence of random variables is a supermartingale:
[TABLE]
where is sufficiently large constant. We want to compute the expected value of . For every we have:
[TABLE]
We know that and is bounded by a constant. Moreover, where is a constant depending only on . Using the property of a supermartingale, we obtain
[TABLE]
Since , we have for every that
[TABLE]
therefore for a suitable constant .
The time spent in one MEC can be used to increase some counters that can be used by MECs visited later. However, once we visit MEC , we never return to . We define the height of MEC to be the length of a longest path in the MEC decomposition from into a bottom MEC (if MEC can be visited from , then has higher height).
Let be the size of maximum counter change per transition.
We assume that for every MEC, the demonic termination complexity is bounded by for all . Let be the height of the MEC containing the initial state (or is such that is the height of a reachable MEC with the largest height). By induction on , we prove that the expected termination time is bounded by for sufficiently high.
If , then .
Assume that the height of MEC containing the initial configuration is and and the expected termination time for is bounded by (if we start in some transient state, the expected number of steps into a MEC with height at most is constant and the induction step holds). We divide every path where is the part of prior the arrival into the lower MEC. The expected length of is bounded by . Therefore, we have the following upper bound on the expected size of counters when arriving into the lower MEC: .
Using induction hypothesis, the expected length of is then which completes the proof.
For the “only if” direction, consider an initial state to be in a MEC with at least quadratic termination complexity. Using the corresponding strategy, we obtain the result.
Now we turn to the angelic case. If all bottom MECs are linear, there exists a strategy reaching one of the MECs in expected constant time. Therefore, the complexity is the same as in the bottom MEC, i.e., linear. If some of the bottom MECs is at least quadratic, then starting in that MEC, we obtain at least quadratic termination complexity.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] de Alfaro, L.: Formal verification of probabilistic systems. Phd. thesis, Stanford University, Stanford, CA, USA (1998)
- 2[2] Aminof, B., Rubin, S., Zuleger, F., Spegni, F.: Liveness of parameterized timed networks. In: Proceedings of ICALP 2015. pp. 375–387 (2015)
- 3[3] Atig, M.F., Habermehl, P.: On yen’s path logic for petri nets. International Journal of Foundations of Computer Science 22 (04), 783–799 (2011)
- 4[4] Baier, C., Katoen, J.P.: Principles of Model Checking (2008)
- 5[5] Barthe, G., Gaboardi, M., Grégoire, B., Hsu, J., Strub, P.Y.: Proving differential privacy via probabilistic couplings. In: Proceedings of LICS’16. pp. 749–758. ACM, New York, NY, USA (2016)
- 6[6] Bloem, R., Jacobs, S., Khalimov, A., Konnov, I., Rubin, S., Veith, H., Widder, J.: Decidability in parameterized verification. SIGACT News 47 (2), 53–64 (2016)
- 7[7] Bozzelli, L., Ganty, P.: Complexity analysis of the backward coverability algorithm for vass. In: Proceedings of RP 2011. pp. 96–109 (2011)
- 8[8] Brázdil, T., Brožek, V., Chatterjee, K., Forejt, V., Kučera, A.: Markov decision processes with multiple long-run average objectives 10 (1), 1–29 (2014)
