Markov Automata with Multiple Objectives
Tim Quatmann, Sebastian Junges, Joost-Pieter Katoen

TL;DR
This paper develops algorithms for analyzing multiple, possibly dependent objectives in Markov automata, enabling the approximation of Pareto curves and trade-offs in complex stochastic systems.
Contribution
It introduces methods to analyze several objectives simultaneously in Markov automata, including trade-offs and combined criteria, extending beyond single-objective verification.
Findings
Algorithms successfully approximate Pareto curves.
Approach handles multiple, dependent objectives.
Experimental results demonstrate scalability.
Abstract
Markov automata combine non-determinism, probabilistic branching, and exponentially distributed delays. This compositional variant of continuous-time Markov decision processes is used in reliability engineering, performance evaluation and stochastic scheduling. Their verification so far focused on single objectives such as (timed) reachability, and expected costs. In practice, often the objectives are mutually dependent and the aim is to reveal trade-offs. We present algorithms to analyze several objectives simultaneously and approximate Pareto curves. This includes, e.g., several (timed) reachability objectives, or various expected cost objectives. We also consider combinations thereof, such as on-time-within-budget objectives - which policies guarantee reaching a goal state within a deadline with at least probability while keeping the allowed average costs below a threshold? We…
| benchmark | ||||||||||
| N(-K) | #states | pts | time | pts | time | pts | time | pts | time | |
| job scheduling | ||||||||||
| 10-2 | 12 554 | 9 | 1.8 | 9 | 41 | 15 | 435 | 16 | 2 322 | |
| 44 | 128 | 21 | 834 | TO | TO | |||||
| 12-3 | 116 814 | 11 | 42 | 9 | 798 | 21 | 2 026 | TO | ||
| 53 | 323 | TO | TO | TO | ||||||
| 17-2 | 14 | 1 040 | TO | 22 | 4 936 | TO | ||||
| 58 | 2 692 | TO | TO | TO | ||||||
| polling | ||||||||||
| 3-2 | 1 020 | 4 | 0.3 | 5 | 0.6 | 3 | 130 | 12 | 669 | |
| 4 | 0.3 | 5 | 0.8 | 7 | 3 030 | TO | ||||
| 3-3 | 9 858 | 5 | 1.3 | 8 | 23 | 6 | 2 530 | TO | ||
| 6 | 2.0 | 19 | 3 199 | TO | TO | |||||
| 4-4 | 827 735 | 10 | 963 | 20 | 4 349 | TO | TO | |||
| 11 | 1 509 | TO | TO | TO | ||||||
| stream | ||||||||||
| 30 | 1 426 | 20 | 0.9 | 16 | 90 | 16 | 55 | 26 | 268 | |
| 51 | 8.8 | 46 | 2 686 | 38 | 1 341 | TO | ||||
| 250 | 94 376 | 31 | 50 | 15 | 5 830 | 16 | 4 050 | TO | ||
| 90 | 184 | TO | TO | TO | ||||||
| 1000 | 41 | 3 765 | TO | TO | TO | |||||
| TO | TO | TO | TO | |||||||
| mutex | ||||||||||
| 2 | 13 476 | 16 | 351 | 13 | 1 166 | |||||
| 13 | 2 739 | TO | ||||||||
| 3 | 38 453 | 15 | 2 333 | TO | ||||||
| N(-K) | #states | #choices | #transitions | #MS | ||
|---|---|---|---|---|---|---|
| jobs | 10-2 | 12 554 | 23 061 | 34 581 | 11 531 | 5.7 |
| 12-3 | 116 814 | 225 437 | 450 783 | 112 719 | 8.5 | |
| 17-2 | 4 587 537 | 8 912 931 | 13 369 379 | 4 456 466 | 5.9 | |
| polling | 3-2 | 1 020 | 1 852 | 2 477 | 508 | 14 |
| 3-3 | 9 858 | 18 295 | 24 536 | 4 801 | 14 | |
| 4-4 | 827 735 | 1 682 325 | 2 146 086 | 465 125 | 16 | |
| stream | 30 | 1 426 | 1 861 | 2 731 | 931 | 8 |
| 250 | 94 376 | 125 501 | 187 751 | 62 751 | 8 | |
| 1000 | 1 502 501 | 2 002 001 | 3 001 001 | 1 001 001 | 8 | |
| mutex | 2 | 13 476 | 31 752 | 36 120 | 216 | 2 |
| 3 | 38 453 | 99 132 | 111 687 | 8 487 | 3 |
| benchmark | PRISM | Storm | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| instance | #states | iter | verif | total | pts | iter | verif | total | ||
| consensus | 691 | 0.019 | 0.183 | 0.285 | 3 | 0.007 | 0.010 | 0.474 | ||
| 1 517 | 0.038 | 0.329 | 0.501 | 2 | 0.012 | 0.017 | 0.497 | |||
| 3 169 | 0.053 | 0.528 | 0.740 | 2 | 0.018 | 0.028 | 0.518 | |||
| 17 455 | 0.232 | 1.416 | 1.771 | 2 | 0.135 | 0.193 | 1.169 | |||
| 61 017 | 0.854 | 4.267 | 4.998 | 2 | 0.499 | 0.806 | 3.421 | |||
| 181 129 | 2.835 | 9.735 | 10.813 | 2 | 1.734 | 3.639 | 10.675 | |||
| zeroconf(-tb) | 5 449 | 0.130 | 6.157 | 6.423 | 2 | 0.077 | 0.146 | 0.830 | ||
| 10 543 | 0.235 | 12.093 | 12.428 | 2 | 0.213 | 0.368 | 1.178 | |||
| 17 221 | 0.408 | 22.143 | 22.596 | 2 | 0.467 | 0.819 | 1.454 | |||
| 29 572 | 0.285 | 45.715 | 46.311 | 2 | 0.615 | 1.926 | 2.924 | |||
| 19 670 | 0.262 | 40.259 | 40.780 | 2 | 0.568 | 1.256 | 2.052 | |||
| 42 968 | 0.363 | 96.813 | 97.631 | 1 | 2.706 | 6.216 | 7.469 | |||
| team-form. | 12 475 | incorrect | 5 | 0.160 | 0.257 | 0.877 | ||||
| 96 665 | incorrect | 3 | 1.360 | 6.637 | 9.325 | |||||
| 907 993 | incorrect | 3 | 22.197 | 866.151 | 889.889 | |||||
| 12 475 | not supported | 10 | 4.060 | 1.432 | 2.020 | |||||
| 96 665 | not supported | 13 | 1.327 | 9.447 | 12.256 | |||||
| 907 993 | not supported | 8 | 48.873 | 894.525 | 918.858 | |||||
| sched. | 31 965 | error | — | 1.214 | ||||||
| 633 735 | incorrect | — | 13.907 | |||||||
| 2 457 510 | incorrect | — | 53.119 | |||||||
| dpm | 636 | 0.187 | 0.228 | 0.298 | 6 | 0.143 | 0.145 | 0.355 | ||
| 636 | 0.213 | 0.247 | 0.312 | 4 | 0.210 | 0.213 | 0.433 | |||
| 636 | 0.239 | 0.285 | 0.360 | 3 | 0.205 | 0.207 | 0.433 | |||
| benchmark | IMCA | Storm (multi) | Storm (single) | |||
| instance | #states | verif. time | verif. time | verif. time | ||
| jobs | 10_2 | 12 554 | 0.009 | 0.047 | 0.021 | |
| 10_2 | 12 554 | 1.054 | 2.977 | 1.702 | ||
| 12_3 | 116 814 | 0.136 | 0.556 | 0.279 | ||
| 12_3 | 116 814 | 19.938 | 56.242 | 31.682 | ||
| polling | 3_3 | 9 858 | 6.254 | 0.102 | 0.095 | |
| 3_3 | 9 858 | 21.948 | 54.350 | 14.163 | ||
| 4_4 | 827 735 | 3 630.283 | 52.162 | 47.746 | ||
| 4_4 | 827 735 | 3 424.730 | 8 615.390 | 1 597.095 | ||
| stream | 30 | 1 426 | 0.005 | 0.009 | 0.004 | |
| 30 | 1 426 | 0.481 | 1.578 | 0.509 | ||
| 250 | 94 376 | 2.972 | 1.462 | 1.261 | ||
| 250 | 94 376 | 36.663 | 111.450 | 33.527 | ||
| mutex | 2 | 13 476 | 1.785 | 1.217 | 0.4 | |
| 2 | 13 476 | 6.922 | 4.118 | 1.008 | ||
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsFormal Methods in Verification · Software Reliability and Analysis Research · Advanced Software Engineering Methodologies
11institutetext: RWTH Aachen University, Aachen, Germany
Markov Automata with Multiple Objectives
Tim Quatmann
Sebastian Junges
Joost-Pieter Katoen
Abstract
Markov automata combine non-determinism, probabilistic branching, and exponentially distributed delays. This compositional variant of continuous-time Markov decision processes is used in reliability engineering, performance evaluation and stochastic scheduling. Their verification so far focused on single objectives such as (timed) reachability, and expected costs. In practice, often the objectives are mutually dependent and the aim is to reveal trade-offs. We present algorithms to analyze several objectives simultaneously and approximate Pareto curves. This includes, e.g., several (timed) reachability objectives, or various expected cost objectives. We also consider combinations thereof, such as on-time-within-budget objectives—which policies guarantee reaching a goal state within a deadline with at least probability while keeping the allowed average costs below a threshold? We adopt existing approaches for classical Markov decision processes. The main challenge is to treat policies exploiting state residence times, even for untimed objectives. Experimental results show the feasibility and scalability of our approach.
1 Introduction
Markov automata [1, 2] extend labeled transition systems with probabilistic branching and exponentially distributed delays. They are a compositional variant of continuous-time Markov decision processes (CTMDPs), in a similar vein as Segala’s probabilistic automata extend classical MDPs. Transitions of a Markov automaton (MA) lead from states to probability distributions over states, and are either labeled with actions (allowing for interaction) or real numbers (rates of exponential distributions). MAs are used in reliability engineering [3], hardware design [4], data-flow computation [5], dependability [6] and performance evaluation [7], as MAs are a natural semantic framework for modeling formalisms such as AADL, dynamic fault trees, stochastic Petri nets, stochastic activity networks, SADF etc. The verification of MAs so far focused on single objectives such as reachability, timed reachability, expected costs, and long-run averages [8, 9, 10, 11, 12]. These analyses cannot treat objectives that are mutually influencing each other, like quickly reaching a target is more costly. The aim of this paper is to analyze multiple objectives on MAs at once and to facilitate trade–off analysis by approximating Pareto curves.
Consider the stochastic job scheduling problem of [13]: perform jobs with exponential service times on identical processors under a pre-emptive scheduling policy. Once a job finishes, all processors can be assigned any of the remaining jobs. When jobs are finished, this yields non-deterministic choices.
The largest-expected-service-time-first-policy is optimal to minimize the expected time to complete all jobs [13]. It is unclear how to schedule when imposing extra constraints, e.g., requiring a high probability to finish a batch of jobs within a tight deadline (to accelerate their post-processing), or having a low average waiting time. These multiple objectives involve non-trivial trade–offs. Our algorithms analyze such trade–offs. Fig. 1, e.g., shows the obtained result for 12 jobs and 3 processors. It approximates the set of points for schedules achieving that (1) the expected time to complete all jobs is at most and (2) the probability to finish half of the jobs within an hour is at least .
This paper presents techniques to verify MAs with multiple objectives. We consider multiple (un)timed reachability and expected reward objectives as well as their combinations. Put shortly, we reduce all these problems to instances of multi-objective verification problems on classical MDPs. For multi-objective queries involving (combinations of) untimed reachability and expected reward objectives, corresponding algorithms on the underlying MDP can be used. In this case, the MDP is simply obtained by ignoring the timing information, see Fig. 2(b). The crux is in relating MA schedulers—that can exploit state sojourn times to optimize their decisions—to MDP schedulers. For multiple timed reachability objectives, digitization [8, 9] is employed to obtain an MDP, see Fig. 2(c). The key is to mimic sojourn times by self-loops with appropriate probabilities. This provides a sound arbitrary close approximation of the timed behavior and also allows to combine timed reachability objectives with other types of objectives. The main contribution is to show that digitization is sound for all possible MA schedulers. This requires a new proof strategy as the existing ones are tailored to optimizing a single objective. All proofs can be found in the appendix. Experiments on instances of four MA benchmarks show encouraging results. Multiple untimed reachability and expected reward objectives can be efficiently treated for models with millions of states. As for single objectives [9], timed reachability is more expensive. Our implementation is competitive to PRISM for multi-objective MDPs [14, 15] and to IMCA [9] for single-objective MAs.
Related work.
Multi-objective decision making for MDPs with discounting and long-run objectives has been well investigated; for a recent survey, see [16]. Etessami et al. [17] consider verifying finite MDPs with multiple -regular objectives. Other multiple objectives include expected rewards under worst-case reachability [18, 19], quantiles and conditional probabilities [20], mean pay-offs and stability [21], long-run objectives [22, 23], total average discounted rewards under PCTL [24], and stochastic shortest path objectives [25]. This has been extended to MDPs with unknown cost function [26], infinite-state MDPs [27] arising from two-player timed games in a stochastic environment, and stochastic two-player games [28]. To the best of our knowledge, this is the first work on multi-objective MDPs extended with random timing.
2 Preliminaries
Notations.
The set of real numbers is denoted by , and we write and . For a finite set , denotes the set of probability distributions over . is Dirac if for some .
2.1 Models
Markov automata generalize both Markov decision processes (MDPs) and continuous time Markov chains (CTMCs). They are extended with rewards (or, equivalently, costs) to allow modelling, e.g., energy consumption.
Definition 1 (Markov automaton)
A Markov automaton (MA) is a tuple where is a finite set of states with initial state , is a finite set of actions with and ,
- •
{\rightarrow}\subseteq S\times(\mathit{Act}\mathbin{\mathchoice{\ooalign{\displaystyle\cup\cr\raise 0.55556pt\hbox{\set@color\displaystyle\cdot}}}{\ooalign{\textstyle\cup\cr\raise 0.55556pt\hbox{\set@color\textstyle\cdot}}}{\ooalign{\scriptstyle\cup\cr\raise 0.38889pt\hbox{\set@color\scriptstyle\cdot}}}{\ooalign{\scriptscriptstyle\cup\cr\raise 0.27779pt\hbox{\set@color\scriptscriptstyle\cdot}}}}\mathbb{R}_{>0})\times\mathit{Dist}(S) is a set of transitions such that for all there is at most one transition with , and
- •
with are reward functions \rho_{i}\colon S\mathbin{\mathchoice{\ooalign{\displaystyle\cup\cr\raise 0.55556pt\hbox{\set@color\displaystyle\cdot}}}{\ooalign{\textstyle\cup\cr\raise 0.55556pt\hbox{\set@color\textstyle\cdot}}}{\ooalign{\scriptstyle\cup\cr\raise 0.38889pt\hbox{\set@color\scriptstyle\cdot}}}{\ooalign{\scriptscriptstyle\cup\cr\raise 0.27779pt\hbox{\set@color\scriptscriptstyle\cdot}}}}(S\times\mathit{Act})\to\mathbb{R}_{\geq 0}.
In the remainder of the paper, let denote an MA. A transition , denoted by , is called probabilistic if and Markovian if . In the latter case, is the rate of an exponential distribution, modeling a time-delayed transition. Probabilistic transitions fire instantaneously. The successor state is determined by , i.e., we move to with probability . Probabilistic (Markovian) states PS (MS) have an outgoing probabilistic (Markovian) transition, respectively: and . The exit rate of is uniquely given by . The transition probabilities of are given by the function satisfying if either or ( and ) and in all other cases. The value corresponds to the probability to move from with action to . The enabled actions at state are given by .
Example 1
Fig. 2(a) shows an MA . We do not depict Dirac probability distributions. Markovian transitions are illustrated by dashed arrows.
We assume action-deterministic MAs: holds for all and . Terminal states are excluded by adding a Markovian self-loop. As standard for MAs [1, 2], we impose the maximal progress assumption, i.e., probabilistic transitions take precedence over Markovian ones. Thus, we remove transitions for and which yields S=\mathrm{PS}\mathbin{\mathchoice{\ooalign{\displaystyle\cup\cr\raise 0.55556pt\hbox{\set@color\displaystyle\cdot}}}{\ooalign{\textstyle\cup\cr\raise 0.55556pt\hbox{\set@color\textstyle\cdot}}}{\ooalign{\scriptstyle\cup\cr\raise 0.38889pt\hbox{\set@color\scriptstyle\cdot}}}{\ooalign{\scriptscriptstyle\cup\cr\raise 0.27779pt\hbox{\set@color\scriptscriptstyle\cdot}}}}\mathrm{MS}. MAs with Zeno behavior, where infinitely many actions can be taken within finite time with non-zero probability, are unrealistic and considered a modeling error.
A reward function defines state rewards and action rewards. When sojourning in a state for time units, the state reward is obtained. Upon taking a transition , we collect action reward (if ) or (if ). For presentation purposes, in the remainder of this section, rewards are omitted. Full definitions with rewards can be found in App. 0.A.1.
Definition 2 (Markov decision process [29])
A Markov decision process (MDP) is a tuple with as in Def. 1 and are the transition probabilities satisfying for all and .
MDPs are MAs without Markovian states and thus without timing aspects, i.e., MDPs exhibit probabilistic branching and non-determinism. Zeno behavior is not a concern, as we do not consider timing aspects. The underlying MDP of an MA abstracts away from its timing:
Definition 3 (Underlying MDP)
The MDP is the underlying MDP of MA with transition probabilities .
The digitization of w.r.t. some digitization constant is an MDP which digitizes the time [8, 9]. The main difference between and is that the latter also introduces self-loops which describe the probability to stay in a Markovian state for time units. More precisely, the outgoing transitions of states in represent that either (1) a Markovian transition in was taken within time units, or (2) no transition is taken within time units – which is captured by taking the self-loop in . Counting the taken self-loops at allows to approximate the sojourn time in .
Definition 4 (Digitization of an MA)
For MA with transition probabilities and digitization constant , the digitization of w.r.t. is the MDP where
[TABLE]
Example 2
Fig. 2 shows an MA with its underlying MDP and a digitization for unspecified .
Paths and schedulers.
Paths represent runs of starting in the initial state. Let and , if , and and , if .
Definition 5 (Infinite path)
An infinite path of MA with transition probabilities is an infinite sequence of states and stamps \kappa_{0},\kappa_{1},\dots\in\mathit{Act}\mathbin{\mathchoice{\ooalign{\displaystyle\cup\cr\raise 0.55556pt\hbox{\set@color\displaystyle\cdot}}}{\ooalign{\textstyle\cup\cr\raise 0.55556pt\hbox{\set@color\textstyle\cdot}}}{\ooalign{\scriptstyle\cup\cr\raise 0.38889pt\hbox{\set@color\scriptstyle\cdot}}}{\ooalign{\scriptscriptstyle\cup\cr\raise 0.27779pt\hbox{\set@color\scriptscriptstyle\cdot}}}}\mathbb{R}_{\geq 0} such that (1) , and for any it holds that (2) , (3) implies , and (4) implies .
An infix of a path represents that we stay at for time units and then perform action and move to state . Condition (1) excludes Zeno paths, condition (2) ensures positive transition probabilities, and conditions (3) and (4) assert that stamps match the transition type at .
A finite path is a finite prefix of an infinite path. The length of is , its last state is , and the time duration is . We denote the sets of finite and infinite paths of by and , respectively. The superscript is omitted if the model is clear from the context. For a finite or infinite path the prefix of of length is denoted by . The th state visited by is given by . The time-abstraction of removes all sojourn times and is a path of the underlying MDP : . Paths of are also referred to as the time-abstract paths of .
Definition 6 (Generic scheduler)
A generic scheduler for is a measurable function such that for each .
A scheduler for resolves the non-determinism of : is the probability to take transition after observing the run . The set of such schedulers is denoted by ( if is clear from the context). is deterministic if the distribution is Dirac for any . Time-abstract schedulers behave independently of the time-stamps of the given path, i.e., for all actions and paths with . We write to denote the set of time-abstract schedulers of . GM is the most general scheduler class for MAs. For MDPs, the most general scheduler class is .
2.2 Objectives
An objective is a representation of a quantitative property like the probability to reach an error state, or the expected energy consumption. To express Boolean properties (e.g., the probability to reach an error state is below ), is combined with a threshold where is a threshold relation and is a threshold value. Let denote that the MA under scheduler satisfies the property .
Reachability objectives.
is a time interval if it is of the form or , where . The set of paths reaching a set of goal states in time is defined as
[TABLE]
We write instead of . A probability measure on sets of infinite paths is defined, which generalizes both the standard probability measure on MDPs and on CTMCs. A formal definition is given in App. 0.A.2.1.
Definition 7 (Reachability objective)
A reachability objective has the form for time interval and goal states . The objective is timed if and untimed otherwise. For MA and scheduler , let iff .
Expected reward objectives.
Expected rewards define the expected amount of reward collected (w.r.t. ) until a goal state in is reached. This is a straightforward generalization of the notion on CTMCs and MDPs. A formal definition is found in App. 0.A.2.2.
Definition 8 (Expected reward objective)
An expected reward objective has the form where is the index of reward function and . For MA and scheduler , let iff .
Expected time objectives are expected reward objectives that consider the reward function with if and all other rewards are zero.
3 Multi-objective Model Checking
Standard model checking considers objectives individually. This approach is not feasible when we are interested in multiple objectives that should be fulfilled by the same scheduler, e.g., a scheduler that maximizes the expected profit might violate certain safety constraints. Multi-objective model checking aims to analyze multiple objectives at once and reveals possible trade-offs.
Definition 9 (Satisfaction of multiple objectives)
Let be an MA and . For objectives with threshold relations and threshold values let
[TABLE]
Furthermore, let such that .
If , the point is achievable in with scheduler . The set of achievable points of w.r.t. and is This definition is compatible with the notions on MDPs as given in [15, 17].
Example 3
Fig. 3(b) and Fig. 3(c) depict the set of achievable points of the MA from Fig. 3(a) w.r.t. relations and objectives and , respectively. Using the set of achievable points, we can answer Pareto, numerical, and achievability queries as considered in [15], e.g., the Pareto front lies on the border of the set.
Schedulers.
For single-objective model checking on MAs, it suffices to consider deterministic schedulers [30]. For untimed reachability and expected rewards even time-abstract deterministic schedulers suffice [30]. Multi-objective model checking on MDPs requires history-dependent, randomized schedulers [17]. On MAs, schedulers may also employ timing information to make optimal choices, even if only untimed objectives are considered.
Example 4
Consider the MA in Fig. 3(a) with untimed objectives and . A simple graph argument yields that both properties are only satisfied if action is taken with probability exactly a half. Thus, on the underlying MDP, no deterministic scheduler satisfies both objectives. On the MA however, paths can be distinguished by their sojourn time in . As the probability mass to stay in for at most is exactly , a timed scheduler with if and [math] otherwise does satisfy both objectives.
Theorem 3.1 ()
For some MA with , no deterministic time-abstract scheduler satisfies .
The geometric shape of the achievable points.
Like for MDPs [17], the set of achievable points of any combination of aforementioned objectives is convex.
Proposition 1 ()
The set is convex.
For MDPs, the set of achievable points is a convex polytope where the vertices can be realized by deterministic schedulers that use memory bounded by the number of objectives. As there are finitely many such schedulers, the polytope is finite [17], i.e., it can be represented by a finite number of vertices. This result does not carry over to MAs. For example, the achievable points of the MA from Fig. 3(a) together with the objectives form the infinite polytope shown in Fig. 3(c). The insight here is that for any sojourn time in , the timing information is relevant for optimal schedulers: The shorter the sojourn time in , the higher the probability to reach within the time bound.
Theorem 3.2 ()
For some MA and objectives , the polytope is not finite.
As infinite convex polytopes cannot be represented by a finite number of vertices, any method extending the approach of [15] – which computes these vertices – can only approximate the set of achievable points.
Problem statement.
For an MA and objectives with threshold relations, construct arbitrarily tight over- and under-approximations of the achievable points.
4 Analysis of Markov Automata with Multiple Objectives
The state-of-the-art in single-objective model checking of MA is to reduce the MA to an MDP, cf. [8, 9, 10], for which efficient algorithms exist. We aim to lift this approach to multi-objective model checking. Assume MA and objectives with threshold relations . We discuss how the set of achievable points of relates to the set of achievable points of an MDP. The key challenge is to deal with timing information—even for untimed objectives—and to consider schedulers beyond those optimizing single objectives. We obtain:
- •
For untimed reachability and expected reward objectives, the achievable points of equal those of its underlying MDP, cf. Theorems 4.1 and 4.2.
- •
For timed reachability objectives, the set of achievable points of a digitized MDP provides a sound approximation of the achievable points of , cf. Theorem 4.3. Corollary 1 gives the precision of the approximation.
4.1 Untimed Reachability Objectives
Although timing information is essential for deterministic schedulers, cf. Theorem 3.1, timing information does not strengthen randomized schedulers:
Theorem 4.1 ()
For MA and untimed reachability objectives it holds that
The main idea for proving Theorem 4.1 is to construct for scheduler a time-abstract scheduler such that they both induce the same untimed reachability probabilities. To this end, we discuss the connection between probabilities of paths of MA and paths of MDP .
Definition 10 (Induced paths of a time-abstract path)
The set of induced paths on MA of a path of is given by
[TABLE]
The set contains all paths of where replacing sojourn times by yields .
For , the probability distribution might depend on the sojourn times of the path . The time-abstract scheduler weights the distribution with the probability masses of the paths .
Definition 11 (Time-abstraction of a scheduler)
The time-abstraction of is defined as such that for any
[TABLE]
The term represents the probability for a path in to have sojourn times as given by . The value coincides with the probability that picks action , given that the time-abstract path was observed.
Example 5
Consider the MA in Fig. 2(a) and the scheduler choosing at state iff the sojourn time at is at most one. Then , the probability that is left within one time unit. For we have
[TABLE]
In the example, the considered scheduler and its time-abstraction induce the same untimed reachability probabilities. We generalize this observation.
Lemma 1 ()
For any we have
The result is lifted to untimed reachability probabilities.
Proposition 2 ()
For any it holds that
As the definition of is independent of the considered set of goal states , Proposition 2 can be lifted to multiple untimed reachability objectives.
Proof of Theorem 4.1 (sketch).
By applying Proposition 2, we can show that for any scheduler and untimed reachability objectives with thresholds . Theorem 4.1 is a direct consequence of this.
4.2 Expected Reward Objectives
The results for expected reward objectives are similar to untimed reachability objectives: An analysis of the underlying MDP suffices. We show the following extension of Theorem 4.1 to expected reward objectives.
Theorem 4.2 ()
For MA and untimed reachability and expected reward objectives :
To prove this, we show that a scheduler and its time-abstraction induce the same expected rewards on and , respectively. Theorem 4.2 follows then analogously to Theorem 4.1.
Proposition 3 ()
Let be some reward function of and let be its counterpart for . For we have
Notice that encodes the expected reward of obtained in a state by assuming the sojourn time to be the expected sojourn time . Although the claim is similar to Proposition 2, its proof cannot be adapted straightforwardly. In particular, the analogon to Lemma 1 does not hold: The expected reward collected along a time-abstract path does in general not coincide for and .
Example 6
We consider standard notations for rewards as detailed in App. 0.A.2.2. Let be the MA with underlying MDP as shown in Fig. 2. Let and zero otherwise. Reconsider the scheduler from Example 5. Let . The probability is zero since chooses on such paths. For the remaining paths in , action is chosen with probability one. The expected reward in along is:
[TABLE]
The expected reward in along differs as
[TABLE]
The intuition is as follows: If path of under occurs, we have since chose . Hence, the reward collected from paths in is at most . There is thus a dependency between the choice of the scheduler at and the collected reward at . This dependency is absent in as the reward at a state is independent of the subsequent performed actions.
Let . The expected reward along is for and for . As the rewards for and sum up to one in both and , the expected reward along all paths of length two coincides for and .
This observation can be generalized to arbitrary MA and paths of arbitrary length.
Proof of Proposition 3 (sketch).
For every , the expected reward collected along paths of length at most coincides for under and under . The proposition follows by letting approach infinity.
Thus, queries on MA with mixtures of untimed reachability and expected reward objectives can be analyzed on the underlying MDP .
4.3 Timed Reachability Objectives
Timed reachability objectives cannot be analyzed on as it abstracts away from sojourn times. We lift the digitization approach for single-objective timed reachability [8, 9] to multiple objectives. Instead of abstracting timing information, it is digitized. Let denote the digitization of for arbitrary digitization constant , see Def. 4. A time interval of the form or with and is called well-formed. For the remainder, we only consider well-formed intervals, ensured by an appropriate digitization constant. An interval for time-bounds is transformed to digitization step bounds . Let , we set .
We first relate paths in to paths in its digitization.
Definition 12 (Digitization of a path)
The digitization of path in is the path in given by
[TABLE]
where for each .
Example 7
For the path of the MA in Fig. 2(a) and , we get .
The in the definition above represent a digitization of the sojourn times such that . These digitized times are incorporated into the digitization of a path by taking the self-loop at state times. We also refer to the paths of as digital paths (of ). The number of digitization steps of a digital path is the number of transitions emerging from Markovian states, i.e., . One digitization step represents the elapse of at most time units—either by staying at some for time or by leaving within time. The number multiplied with yields an estimate for the duration . A digital path can be interpreted as representation of the set of paths of whose digitization is .
Definition 13 (Induced paths of a digital path)
The set of induced paths of a (finite or infinite) digital path of is
[TABLE]
For sets of digital paths we define the induced paths . To relate timed reachability probabilities for under scheduler with -bounded reachability probabilities for , relating to a scheduler for is necessary.
Definition 14 (Digitization of a scheduler)
The * digitization* of is given by such that for any with
[TABLE]
The digitization is similar to the time-abstraction as both schedulers get a path with restricted timing information as input and mimic the choice of . However, while receives no information regarding sojourn times, receives the digital estimate. Intuitively, considers for each , weighted with the probability that the sojourn times of a path in are as given by . The restriction asserts that does not end with a self-loop on a Markovian state, implying .
Example 8
Let MA in Fig. 2(a) and . Again, chooses at state iff the sojourn time at is at most one. Consider the digital paths . For we have . It follows . For it is unclear whether chooses or . Hence, randomly guesses:
[TABLE]
On we consider -bounded reachability instead of timed reachability.
Definition 15 (-bounded reachability)
The set of infinite digital paths that reach within the interval of consecutive natural numbers is
[TABLE]
The timed reachability probabilities for are estimated by -bounded reachability probabilities for . The induced -bounded reachability probability for (under ) coincides with -bounded reachability probability on (under ).
Proposition 4 ()
Let be an MA with , , and digitization . Further, let be a set of consecutive natural numbers. It holds that
[TABLE]
Thus, induced -bounded reachability on MAs can be computed on their digitization. Next, we relate -bounded and timed reachability on MAs, i.e., we quantify the maximum difference between time-bounded and -bounded reachability probabilities.
Example 9
Let be the MA given in Fig. 4(a). We consider the well-formed time interval , yielding digitization step bounds . The digitization constant remains unspecified in this example. Fig. 4(b) illustrates paths , , and of . We depict sojourn times by arrow length. A black dot indicates that the path stays at the current state for a multiple of time units. All depicted paths reach within time units. However, the digitizations of , , and reach within , , and digitization steps, respectively. This yields
[TABLE]
Let be the maximum exit rate of . For define
[TABLE]
and approach [math] for small digitization constants .
Proposition 5 ()
For MA , scheduler , goal states , digitization constant and time interval
[TABLE]
Proof (sketch).
The sets and are illustrated in Fig. 5. We have
[TABLE]
One then shows
[TABLE]
To this end, show for any that is an upper bound for the probability of paths that induce more then digitization steps within the the first time units. Then, this probability can be related to the probability of paths in and , respectively.
From Prop. 4 and Prop. 5, we immediately have Cor. 1, which ensures that the value can be approximated with arbitrary precision by computing for a sufficiently small .
Corollary 1 ()
For MA , scheduler , goal states , digitization constant and time interval
[TABLE]
This generalizes existing results [8, 9] that only consider schedulers which maximize (or minimize) the corresponding probabilities. More details are given in App. 0.F.
Next, we lift Cor. 1 to multiple objectives . We define the satisfaction of a timed reachability objective for the digitization as . This allows us to consider notations like , where contains one or more timed reachability objectives. For a point we consider the hyperrectangle
[TABLE]
and is defined similarly. The next example shows how the set of achievable points of can be approximated using achievable points of .
Example 10
Let be two timed reachability objectives for an MA with digitization such that , , , and . The blue rectangle in Fig. 6(a) illustrates the set for the point . Assume holds for threshold relations , i.e., is achievable for the digitization . From Cor. 1, we infer that contains at least one point that is achievable for . Hence, the bottom left corner point of the rectangle is achievable for . This holds for any rectangle with , where is the set of achievable points of denoted by the gray area111In the figure, partly overlaps , i.e., the green area also belongs to . in Fig. 6(b). It follows that any point in (depicted by the green area) is achievable for . On the other hand, an achievable point of has to be contained in a set for at least one . The red area depicts the points for which this is not the case, i.e., points that are not achievable for . The digitization constant controls the accuracy of the resulting approximation. Fig. 6(c) depicts a possible result when a smaller digitization constant is considered.
The observations from the example above are formalized in the following theorem. The theorem also covers unbounded reachability objectives by considering the time interval . For expected reward objectives of the form it can be shown that . This claim is similar to Proposition 3 and can be shown analogously. This enables multi-objective model checking of MAs with timed reachability objectives.
Theorem 4.3 ()
Let be an MA with digitization . Furthermore, let be (un)timed reachability or expected reward objectives with threshold relations and . It holds that with:
[TABLE]
5 Experimental Evaluation
Implementation.
We implemented multi-objective model checking of MAs into Storm [31]. The input model is given in the PRISM language222We slightly extend the PRISM language in order to describe MAs. and translated into a sparse representation. For MA , the implementation performs a multi-objective analysis on the underlying MDP or a digitization and infers (an approximation of) the achievable points of by exploiting the results from Sect. 4. For computing the achievable points of and , we apply the approach of [15]. It repeatedly checks weighted combinations of the objectives (by means of value iteration [29] – a standard technique in single-objective MDP model checking) to refine an approximation of the set of achievable points. This procedure is extended as follows. Full details can be found in [32].
- •
We support -bounded reachability objectives by combining the approach of [15] (which supports step-bounded reachability on MDPs) with techniques from single-objective MA analysis [8]. Roughly, we reduce -bounded reachability to untimed reachability by storing the digitized time-epoch (i.e., the current number of digitization steps) into the state space. A blow-up of the resulting model is avoided by considering each time-epoch separately.
- •
In contrast to [15], we allow a simultaneous analysis of minimizing and maximizing expected reward objectives. This is achieved by performing additional preprocessing steps that comprise an analysis of end components.
The source code including all material to reproduce the experiments is available at http://www.stormchecker.org/benchmarks.html.
Setup.
Our implementation uses a single core (2GHz) of a 48-core HP BL685C G7 limited to 20GB RAM. The timeout (TO) is two hours. For a model, a set of objectives, and a precision , we measure the time to compute an -approximation333An -approximation of is given by with and for all exists a such that the distance between and is at most . of the set of achievable points. This set-up coincides with Pareto queries as discussed in [15]. The digitization constant is chosen heuristically such that recalculations with smaller constants are avoided. We set the precision for value-iteration to . We use classical value iteration; the use of improved algorithms [33] is left for future work.
Results for MAs.
We consider four case studies: (i) a job scheduler [13], see Sect. 1; (ii) a polling system [34, 35] containing a server processing jobs that arrive at two stations; (iii) a video streaming client buffering received packages and deciding when to start playback; and (iv) a randomized mutual exclusion algorithm [35], a variant of [36] with a process-dependent random delay in the critical section. Details on the benchmarks and the objectives are given in App. 0.G.1.
Tab. 1 lists results. For each instance we give the defining constants, the number of states of the MA and the used -approximation. A multi-objective query is given by the triple indicating untimed, expected reward, and timed objectives. For each MA and query we depict the total run-time of our implementation (time) and the number of vertices of the obtained under-approximation (pts).
Queries analyzed on the underlying MDP are solved efficiently on large models with up to millions of states. For timed objectives the run-times increase drastically due to the costly analysis of digitized reachability objectives on the digitization, cf. [9]. Queries with up to four objectives can be dealt with within the time limit. Furthermore, for an approximation one order of magnitude better, the number of vertices of the result increases approximately by a factor three. In addition, a lower digitization constant has then to be considered which often leads to timeouts in experiments with timed objectives.
Comparison with PRISM [14] and IMCA [9].
We compared the performance of our implementation with both PRISM and IMCA. Verification times are summarized in Fig. 7: On points above the diagonal, our implementation is faster. For the comparison with PRISM (no MAs), we considered the multi-objective MDP benchmarks from [15, 18]. Both implementations are based on [15]. For the comparison with IMCA (no multi-objective queries) we used the benchmarks from Tab. 1, with just a single objective. We observe that our implementation is competitive. Details are given in App. 0.G.2 and App. 0.G.3.
6 Conclusion
We considered multi-objective verification of Markov automata, including in particular timed reachability objectives. The next step is to apply our algorithms to the manifold applications of MA, such as generalized stochastic Petri nets to enrich the analysis possibilities of such nets.
6.0.1 Acknowledgement.
This work was supported by the CDZ project CAP (GZ 1023).
Appendix 0.A Additional Preliminaries
0.A.1 Models with Rewards
We extend the models with rewards.
Definition 16 (Markov decision process [29])
A Markov decision process (MDP) is a tuple , where are as in Definition 1, are action reward functions , and is a transition probability function satisfying for all and .
The reward is collected when choosing action at state . Note that we do not consider state rewards for MDPs.
Definition 17 (Underlying MDP)
For MA with transition probabilities the underlying MDP of is given by , where for each
[TABLE]
The reward functions incorporate the action and state rewards of where the state rewards are multiplied with the expected sojourn times of states .
Definition 18 (Digitization of an MA)
For an MA with transition probabilities and a digitization constant , the digitization of w.r.t. is given by the MDP , where is as in Definition 4 and for each
[TABLE]
0.A.2 Measures
0.A.2.1 Probability measure.
Given a scheduler , the probability measure is defined for measurable sets of infinite paths of MA . This is achieved by considering the probability measure for transition steps. For a history with and a measurable set of transition steps we have
[TABLE]
is obtained by lifting to sequences of transition steps (i.e., paths). More information can be found in [37, 8]. To simplify the notations, we write instead of . For a set of finite paths we set = , where is the Cylinder of given by
[TABLE]
0.A.2.2 Expected reward.
We fix a reward function of the MA . The reward of a finite path is given by
[TABLE]
Intuitively, is the sum over the rewards obtained in every step depicted in the path . The reward obtained in step is composed of the state reward of multiplied with the sojourn time as well as the action reward given by and . State rewards assigned to probabilistic states do not affect the reward of a path as the sojourn time in such states is zero.
For an infinite path , the reward of up to a set of goal states is given by
[TABLE]
Intuitively, we stop collecting reward as soon as reaches a state in . If no state in is reached, reward is accumulated along the infinite path, which potentially yields an infinite reward. The expected reward is the expected value of the function , i.e.,
[TABLE]
Appendix 0.B Proofs About Sets of Achievable Points
0.B.1 Proof of Theorem 3.1
See 3.1
Proof
Consider the MA in Fig. 3(a) with objectives , relations , and point . We have (A scheduler achieving both objectives is given in Example 4). However, there are only two deterministic time abstract schedulers for :
[TABLE]
and it holds that and . ∎
0.B.2 Proof of Proposition 1
See 1
Proof
Let be an MA and let be objectives with relations and points such that and holds. For , let be a scheduler satisfying . Consider some . The point is achievable with the scheduler that makes an initial one-off random choice:
- •
with probability mimic and
- •
with probability mimic .
Hence, , implying that the set of achievable points is convex. ∎
0.B.3 Proof of Theorem 3.2
See 3.2
Proof
We show that the claim holds for the MA in Fig. 3(a) with objectives and relations .
For the sake of contradiction assume that the polytope is finite. Then, there must be two distinct vertices of such that is a face of . In particular, this means that is achievable but is not achievable for all . We show that there is in fact an for which is achievable, contradicting our assumption that is finite.
For , let be a scheduler satisfying . has to hold as the schedulers achieve different vertices of . The point is achievable with the randomized scheduler that mimics with probability 0.5 and mimics otherwise. Consider and the deterministic scheduler given by
[TABLE]
satisfies . Moreover, we have
[TABLE]
where the last inequality is due to . While the probability to reach is equal under both schedulers, is reached earlier when is considered. This increases the probability to reach in time, i.e., . It follows that for some . ∎
Appendix 0.C Proofs for Untimed Reachability
0.C.1 Proof of Lemma 1
See 1
Proof
The proof is by induction over the length of the considered path . Let and . If , then . Hence, . In the induction step, we assume that the lemma holds for a fixed path with length and . Consider the path .
Case :
It follows that
[TABLE]
Case :
As we have and it follows
[TABLE]
∎
0.C.2 Proof of Proposition 2
See 2
Proof
Let be the set of finite time-abstract paths of that end at the first visit of a state in , i.e.,
[TABLE]
Every path has a unique prefix with . We have
[TABLE]
The claim follows with Lemma 1 since
[TABLE]
∎
0.C.3 Proof of Theorem 4.1
See 4.1
Proof
Let be the considered list of objectives with threshold relations . The following equivalences hold for any and .
[TABLE]
Assume that holds, i.e., there is a such that . It follows that which means that holds as well. For the other direction assume , i.e., for some time-abstract scheduler . We have . It follows that . Applying the equivalences above yields and thus . ∎
Appendix 0.D Proofs for Expected Reward
0.D.1 Proof of Proposition 3
Let and . The set of time-abstract paths that end after steps or at the first visit of a state in is denoted by
[TABLE]
For under and under , we define the expected reward collected along the paths of as
[TABLE]
respectively. Intuitively, corresponds to assuming that no more reward is collected after the -th transition. It follows that the value approaches for large . Similarly, approaches for large . This observation is formalized by the following lemma.
Lemma 2 ()
For MA with , , and reward function it holds that
[TABLE]
Furthermore, any reward function for satisfies
[TABLE]
Proof
We show the first claim. The second claim follows analogously. For each , consider the function given by
[TABLE]
for every path . Intuitively, is the reward collected on within the first steps and only up to the first visit of . This allows us to express the expected reward collected along the paths of as
[TABLE]
It holds that which is a direct consequence from the definition of the reward of up to (cf. App. 0.A.2.2). Furthermore, note that the sequence of functions is non-decreasing, i.e., we have for all and . By applying the monotone convergence theorem [38] we obtain
[TABLE]
∎
The next step is to show that the expected reward collected along the paths of coincides for under and under .
Lemma 3 ()
Let be some reward function of and let be its counterpart for . Let be an MA with and . For all and it holds that
[TABLE]
Proof
The proof is by induction over the path length . To simplify the notation, we often omit the reward functions and and write, e.g., instead of or instead of .
If , then . The claim holds since .
In the induction step, we assume that the lemma is true for some fixed . We split the term into the reward that is obtained by paths which reach within steps and the reward obtained by paths of length . In a second step, we consider the sum of the reward collected within the first steps and the reward obtained in the -th step:
[TABLE]
where we define for paths with such that . The two terms (1) and (2) are treated separately.
Term (1):
Let be the paths in of length at most . We have and every path in visits a state in . Correspondingly, is the set of time-abstract paths of length that do not visit a state in . Hence, the paths in with length have a prefix in . The set is partitioned such that
[TABLE]
The reward obtained within the first steps is independent of the -th transition. To show this formally, we fix a path with and derive
[TABLE]
With the above-mentioned partition of the set , it follows that the expected reward obtained within the first steps is given by
[TABLE]
Term (2):
For the expected reward obtained in step , consider a path such that and .
- •
If , we have . It follows that
[TABLE]
- •
If , then follows similarly.
Combining the two results yields
[TABLE]
∎
We now show Proposition 3.
See 3
Proof
The proposition is a direct consequence of Lemma 2 and Lemma 3 as
[TABLE]
∎
0.D.2 Proof of Theorem 4.2
See 4.2
Proof
Let be the considered list of untimed reachability and expected reward objectives with threshold relations . The following equivalences hold for any and .
[TABLE]
where for the equivalence marked with we consider two cases: If is of the form , Proposition 2 yields
[TABLE]
Otherwise, is of the form and with Proposition 3 it follows that
[TABLE]
The remaining steps of the proof are completely analogous to the proof of Theorem 4.1 conducted on page 0.C.3. ∎
Appendix 0.E Proofs for Timed Reachability
0.E.1 Proof of Proposition 4
Let be an MA and let be the digitization of with respect to some . We consider the infinite paths of that are represented by a finite digital path.
Definition 19 (Induced cylinder of a digital path)
Given a digital path of MA , the induced cylinder of is given by
[TABLE]
Recall the definition of the cylinder of a set of finite paths (cf. App. 0.A.2.1). If does not end with a self-loop at a Markovian state, then holds.
Example 11
Let and be as in Fig. 2. We consider the path and digitization constant . The set contains each infinite path whose digitization has the prefix , i.e.,
[TABLE]
We observe that these are exactly the paths that have a prefix in . Put differently, we have .
Next, consider the digital path . Note that there is no path with , implying . Intuitively, depicts a sojourn time at but finite paths of MAs do not depict sojourn times at their last state. On the other hand, the induced cylinder of contains all paths that sojourn at least time units at , i.e.,
[TABLE]
The schedulers and induce the same probabilities for a given digital path. This is formalized by the following lemma. Note that a similar statement for and time-abstract paths was shown in Lemma 1.
Lemma 4
Let be an MA with scheduler , digitization , and digital path . It holds that
[TABLE]
Proof
The proof is by induction over the length of . Let and . If , then and . Hence, . In the induction step it is assumed that the lemma holds for a fixed path with and . Consider a path . We distinguish the following cases.
Case :
It follows that since ends with a probabilistic transition. Hence,
[TABLE]
Case :
As we have and it follows
[TABLE]
Assume that a path has been observed, i.e., holds for some . The term coincides with the probability that also holds. We have either
- •
which means that the transition from to has to be taken during a period of time units or
- •
where we additionally have to consider the case that no transition is taken at for time units.
It follows that
[TABLE]
We conclude that
[TABLE]
∎
We apply Lemma 4 to show Proposition 4. The idea of the proof is similar to the proof of Proposition 2 conducted on page 0.C.2. See 4
Proof
Consider the set of paths that (i) visit within digitization steps and (ii) do not have a proper prefix that satisfies (i). Every path in has a unique prefix in , yielding
[TABLE]
For the corresponding paths of we obtain
[TABLE]
The proposition follows with Lemma 4 since
[TABLE]
∎
0.E.2 Proof of Proposition 5
The notation for paths of is also applied to paths of , where for any . Intuitively, one digitization step represents the elapse of at most time units. Consequently, the duration of a path with digitization steps is at most .
Lemma 5
For a path and digitization constant it holds that
[TABLE]
Proof
Let and let for each (as in Definition 12). The number is given by . With it follows that
[TABLE]
∎
For a path and , the prefix of up to time point is given by . For the proof of Proposition 5, we focus on the probability that (under a given scheduler ) the digitization approach yields an inaccurate estimate of the actual time. This is the probability that more than digitization steps have been performed within time units. We denote this value by .
Definition 20 (Digitization step bounded paths)
Assume an MA and a digitization constant . For some , , and the set of paths whose prefix up to time point has digitization steps is defined as
[TABLE]
Example 12
Let be the MA given in Fig. 8(a). We consider the set . The digitization constant remains unspecified in this example. Fig. 8(b) illustrates paths , , and of . We depict sojourn times by arrow length. For instance, the path corresponds to . Digitization steps that are “earned” by sojourning at some state for a multiple of time units are indicated by black dots. Transitions of (where ) that do not belong to are depicted in gray. We obtain
[TABLE]
Note that only the digitization steps of the prefix up to time point are considered. For example, the step of at time point is not considered since the corresponding transition is not part of . However, we have , implying .
All considered paths reach within time units but requires more than digitization steps.
The following lemma gives an upper bound for the probability .
Lemma 6
Let be an MA with and maximum rate . Further, let and . It holds that
[TABLE]
For the proof of Lemma 6 we employ the following auxiliary lemma.
Lemma 7
Let be an MA with and maximum rate . For each , , and it holds that
[TABLE]
Proof
First, we show that the set corresponds to the paths of with the additional requirement that no transition is taken between the time points and , i.e.,
[TABLE]
- “”:
If , then follows immediately. Furthermore, assume towards a contradiction that there is a prefix of with . Then, (cf. Lemma 5). As , this means that which contradicts .
- “”:
For with no prefix such that , it holds that . Hence, and it follows that .
The probability for no transition to be taken between and only depends on the current state at time point . More precisely, for some state assume the set of paths . The probability that a path in this set takes no transition between time points and is given by . With for all it follows that
[TABLE]
∎
Proof (of Lemma 6)
Let . By induction over we show that
[TABLE]
The claim follows as .
For , we have iff takes no Markovian transition at time point zero. As this happens with probability one, it follows that
[TABLE]
We assume in the induction step that the proposition holds for some fixed . We distinguish between two cases for the initial state of .
Case :
We partition the set \#[{k\delta+\delta}]^{{\leq}{k+1}}=\Lambda^{\geq\delta}\mathbin{\mathchoice{\ooalign{\displaystyle\cup\cr\raise 0.55556pt\hbox{\set@color\displaystyle\cdot}}}{\ooalign{\textstyle\cup\cr\raise 0.55556pt\hbox{\set@color\textstyle\cdot}}}{\ooalign{\scriptstyle\cup\cr\raise 0.38889pt\hbox{\set@color\scriptstyle\cdot}}}{\ooalign{\scriptscriptstyle\cup\cr\raise 0.27779pt\hbox{\set@color\scriptscriptstyle\cdot}}}}\Lambda^{<\delta} with
[TABLE]
Hence, contains the paths where we wait at least time units at and contains the paths where the first transition is taken within time units. It follows that . We consider the probabilities for and separately.
- •
: For a path , after the first time units there are at most digitization steps within the next time units, i.e.,
[TABLE]
The probability for can therefore be derived from the probability to wait at for at least time units and the probability for . In order to apply this, we need to modify the considered scheduler as it might depend on the sojourn time in . Let be the scheduler for that mimics on paths where the first transition is delayed by , i.e., satisfies
[TABLE]
for all and . It holds that
[TABLE]
- •
: For a path , the first digitization step happens at less than time units, i.e., . It follows that there are at most digitization steps in the remaining time units, i.e.,
[TABLE]
where refers to the paths of , the MA obtained from by changing the initial state to . Hence, the probability for can be derived from the probability to take a transition from to some state within time units and the probability for . Again, we need to adapt the considered scheduler. Let with . The scheduler for mimics the scheduler for , where is prepended to the given path, i.e., we set
[TABLE]
for all and . With Lemma 7 it follows that
[TABLE]
Combining the results for and (i.e., Equations 8 and 9), we obtain
[TABLE]
where the inequality marked with is due to
[TABLE]
Case :
Since is non-zeno, a state is reached from within zero time almost surely (i.e., with probability one). From the previous case, it already follows that the Proposition holds for with and the set . With we obtain
[TABLE]
∎ We now present the proof of Proposition 5. See 5
Proof
In Section 4.3 we already discussed that
[TABLE]
The main part of the proof is to show that
[TABLE]
Then, the proposition follows directly. We show Equation 10 for the different forms of the time interval .
Case :
In this case we have . It follows that
[TABLE]
Hence,
[TABLE]
Case for :
We have .
- •
We show that which implies
[TABLE]
Let and let be the smallest prefix of with . It follows that is also the smallest prefix of with . Since , it follows that . From Lemma 5 we obtain
[TABLE]
Hence, the prefix reaches within time units, implying .
- •
Next, we show . With Lemma 6 we obtain
[TABLE]
Consider a path . Note that reaches within time units but with more than digitization steps. Hence, the prefix of up to time point certainly has more than digitization steps, i.e., satisfies which means .
Case for :
We have .
- •
We show that . With Lemma 6 we obtain
[TABLE]
Consider a path . As , it follows that has to reach (and leave) within less than time units. Let be the largest prefix of that satisfies . Our observations yield that leaves before time point . Hence, is a prefix of . Moreover, as . It follows that which implies .
- •
Now consider a path . visits at least once since . Moreover, does not visit after digitization steps due to . This means visits only finitely often. Let be the largest prefix of such that . Notice that holds. Let be the prefix of of length . We show by contradiction that holds:
- –
If , then is left before time point which contradicts .
- –
Further, assume that . With Lemma 5 we obtain
[TABLE]
Hence, stays at for at least time units which means that \mathrm{di}(\pi^{\prime})\big{(}{\xrightarrow{\bot}}\mathit{last}(\pi^{\prime})\big{)}^{j+1-{\lvert\pi^{\prime}\rvert_{\mathrm{ds}}}}=\bar{\pi} is a prefix of . Since , this contradicts .
We infer that takes at least one transition in the time interval . The probability for this can be upper bounded by , i.e.,
[TABLE]
Case for and :
We have .
- •
As in the case “”, we show that . With Lemma 6 we obtain
[TABLE]
Let and let be the largest prefix of with and . Such a prefix exists due to . reaches with at most digitization steps and therefore within at most time units (cf. Lemma 5). As , we conclude that has to reach (and leave) within less than time units. It follows that which implies .
- •
Next, let and let be the largest prefix of such that and . Such a prefix exists due to . We distinguish two cases.
- –
If , then since .
- –
If , then holds due to . Similar to the case “ we can show that takes at least one transition in time interval .
It follows that
[TABLE]
Hence,
[TABLE]
∎
0.E.3 Proof of Theorem 4.3
See 4.3
Proof
For simplicity, we assume that only the threshold relation is considered, i.e., . Furthermore, we restrict ourself to (un)timed reachability objectives. The remaining cases are treated analogously.
First assume a point . Consider the point satisfying for each index . It follows that and thus for some scheduler . Consider the scheduler given by for each path and action . Notice that . For an index let be the objective . It follows that
[TABLE]
With Corollary 1 it follows that
[TABLE]
As this observation holds for all objectives in , it follows that , implying .
The proof of the second inclusion is similar. Assume that holds for a point and a scheduler . For some index , consider . It follows that . With Corollary 1 we obtain
[TABLE]
Applying this for all objectives in yields , where the point satisfies or, equivalently, for each index . Note that which implies . ∎
Appendix 0.F Comparison to Single-objective Analysis
Corollary 1 generalizes existing results from single-objective timed reachability analysis: For MA , goal states , time bound , and digitization constant with , [9, Theorem 5.3] states that
[TABLE]
Corollary 1 generalizes this result by explicitly referring to the schedulers and under which the claim holds. This extension is necessary as a multi-objective analysis can not be restricted to schedulers that only optimize a single objective.
We remark that the proof in [9, Theorem 5.3] can not be adapted to show our result. The main reason is that the proof relies on an auxiliary lemma which claims that444We adapt [9, Lemma G.2] to our notations from Appendix 0.E.2.
[TABLE]
holds for all schedulers . We show that this claim does not hold. The intuition is as follows. Assume we observe that at most one Markovian transition is taken in within the first time units (i.e., we observe a path in ). The lemma claims that under this observation the probability to reach within time units does not increase. We give a counterexample to illustrate that there are schedulers for which this is not true. Consider the MA from Figure 9 and let be the scheduler for satisfying
[TABLE]
Hence, chooses iff there are less than two digitization steps within the first time units. It follows that the probability to reach on a path in is zero. We conclude that
[TABLE]
which contradicts Equation 11.
Appendix 0.G Further Details for the Experiments
0.G.1 Benchmark Details
We depict additional information regarding our experiments on multi-objective MAs.
Job scheduling.
The job scheduling case study originates from [13] and was already discussed in Section 1. We consider jobs that are executed on identical processors. Each of the jobs gets a different rate between 1 and 3. We consider the following objectives.
- :
Minimize the expected time until all jobs are completed.
- :
Minimize the expected time until jobs are completed.
- :
Minimize the expected waiting time of the jobs.
- :
Minimize the probability that the job with the lowest rate is completed before the job with the highest rate.
- :
Maximize the probability that all jobs are completed within time units.
- :
Maximize the probability that jobs are completed within time units.
The objectives have been combined as follows: ( refers to the objectives considered in Column of Table 1):
[TABLE]
Polling.
The polling system is based on [34, 35]. It considers two stations, each having a separate queue storing up to jobs of different types. The jobs arrive at Station (for ) with some rate as long as the queue of the station is not full. A server polls the two stations and processes the jobs by (nondeterministically) taking a job from a non-empty queue. The time for processing a job is given by a rate which depends on the type of the job. Erasing a job from a queue is unreliable, i.e., there is a chance that an already processed job stays in the queue. For we assume the following objectives:
- :
Maximize the expected number of processed jobs of Station until its queue is full.
- :
Minimize the expected sum of all waiting times of the jobs arriving at Station until the queue of Station is full.
- :
Minimize the probability that the queue of Station is full within two time units.
The objectives have been combined as follows: ( refers to the objectives considered in Column of Table 1):
[TABLE]
Stream.
This case study considers a client of a video streaming platform. The client consecutively receives data packages and stores them into a buffer. The buffered packages are processed during the playback of the video. The time it takes to receive (or to process) a single package is modeled by an exponentially distributed delay. Whenever a package is received and the video is not playing, the client nondeterministically chooses whether it starts the playback or whether it keeps on buffering. The latter choice is not reliable, i.e., there is a chance that the playback is started anyway. In case of a buffer underrun555A buffer underrun occurs when the next package needs to be processed while the buffer is empty., the playback is paused and the client waits for new packages to arrive. We analyzed the following objectives:
- :
Minimize the expected buffering time until the playback is finished.
- :
Minimize the expected number of buffer underruns during the playback.
- :
Minimize the expected time to start the playback.
- :
Minimize the probability for a buffer underrun within 2 time units.
- :
Maximize the probability that the playback starts within 0.5 time units.
The objectives have been combined as follows: ( refers to the objectives considered in Column of Table 1):
[TABLE]
Mutex.
This case study regards a randomized mutual exclusion protocol based on [36, 35]. Three processes nondeterministically choose a job for which they need to enter the critical section. The amount of time a process spends in its critical section is given by a rate which depends on the chosen job. There are different types of jobs. For each the following objective are considered:
- :
Maximize the probability that Process enters its critical section within 0.5 time units.
- :
Maximize the probability that Process enters its critical section within 1 time unit.
The objectives have been combined as follows: ( refers to the objectives considered in Column of Table 1):
[TABLE]
0.G.2 Comparison with PRISM
We considered PRISM 4.3.1 obtained from its website www.prismmodelchecker.org. We conducted our experiments on PRISM with both variants of the value iteration-based implementation (standard and Gauss-Seidel) and chose the faster variant for each benchmark instance. For all experiments the approximation precision was considered.
The detailed results are given in Table 3. We depict the different benchmark instances with the number of states of the MDP (Column #states) and the considered combination of objectives ( represents an (untimed) probabilistic objective, an expected reward objective, and a step-bounded reward objective). Column iter lists the time required for the iterative exploration of the set of achievable points as described in [15]. In Column verif we depict the verification time – including the time for the iterations as well as the conducted preprocessing steps. Column total indicates the total runtime of the tool which includes model building time and verification time. For our implementation, we also list the number of vertices of the obtained under-approximation (Column pts).
During our experiments we observed some issues considering the implementation in PRISM. For example PRISM does not detect that both objectives considered for the sched.-instances yield infinite rewards under every possible resolution of non-determinism. Instead of that, PRISM gives an incorrect answer.
0.G.3 Comparison with IMCA
We consider IMCA 1.6 obtained from https://github.com/buschko/imca. The experiments on IMCA have been conducted with and without enabling value-iteration and we chose the faster variant for each benchmark instance. For timed reachability objectives, the precision was considered in all experiments.
The resulting verification times are given in Table 4. We depict the different benchmark instances with the number of states of the MA (Column #states) and the considered objective (as discussed in App. 0.G.1). Besides the run-times of IMCA, we depict the run-times of our implementation (effectively performing multi-objective model checking with only one objective) in Column Storm (multi). Column Storm (single) shows the run-times obtained when Storm is invoked with standard (single-objective) model checking methods.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] Eisentraut, C., Hermanns, H., Zhang, L.: On probabilistic automata in continuous time. In: Proc. of LICS, IEEE CS (2010) 342–351
- 2[2] Deng, Y., Hennessy, M.: On the semantics of Markov automata. Inf. Comput. 222 (2013) 139–168
- 3[3] Boudali, H., Crouzen, P., Stoelinga, M.: A rigorous, compositional, and extensible framework for dynamic fault tree analysis. IEEE Trans. Dependable Sec. Comput. 7 (2) (2010) 128–143
- 4[4] Coste, N., Hermanns, H., Lantreibecq, E., Serwe, W.: Towards performance prediction of compositional models in industrial GALS designs. In: Proc. of CAV. Vol. 5643 LNCS, Springer (2009) 204–218
- 5[5] Katoen, J.P., Wu, H.: Probabilistic model checking for uncertain scenario-aware data flow. ACM Trans. Embedded Comput. Sys. 22 (1) (2016) 15:1–15:27
- 6[6] Bozzano, M., Cimatti, A., Katoen, J.P., Nguyen, V.Y., Noll, T., Roveri, M.: Safety, dependability and performance analysis of extended AADL models. Comput. J. 54 (5) (2011) 754–775
- 7[7] Eisentraut, C., Hermanns, H., Katoen, J.P., Zhang, L.: A semantics for every GSPN. In: Petri Nets. Vol. 7927 LNCS, Springer (2013) 90–109
- 8[8] Hatefi, H., Hermanns, H.: Model checking algorithms for Markov automata. ECEASST 53 (2012)
