Multilevel Monte Carlo Method for Statistical Model Checking of Hybrid Systems

Sadegh Esmaeil Zadeh Soudjani; Rupak Majumdar; Tigran Nagapetyan

arXiv:1706.08270·cs.SY·June 27, 2017

TL;DR

This paper introduces a multilevel Monte Carlo approach for statistical model checking of stochastic hybrid systems, effectively estimating properties like reachability despite simulation challenges.

Contribution

It develops a novel MLMC-based method with smoothing and adaptive error balancing for verifying continuous-time stochastic hybrid systems.

Findings

1. Effective estimation of reachability probabilities in hybrid systems

2. Quantified error bounds for the MLMC approach

3. Successful application to thermostatically controlled loads model

Tables

Table 1: Parameters of a residential air conditioner as a TCL [16] modeled in (4)-(5).

Param.	Interpretation	Value
$\theta_{s}$	set-point	$20\ [^{\circ}\mathrm{C}]$
$\delta_{d}$	dead-band width	$0.5\ [^{\circ}\mathrm{C}]$
$\theta_{a}$	ambient temperature	$32\ [^{\circ}\mathrm{C}]$
$P_{rate}$	power	$14\ [\mathrm{kW}]$

Equations

$$X=\{(q,z)\mid q\in Q,\ z\in\mathcal{X}(q)\}.$$

$$dz(t)=b(q,z(t))\,dt+\sigma(q,z(t))\,dW_{t},$$

$$t^{\ast}(q):=\inf\{t\in\mathbb{R}_{>0}\cup\{\infty\},\ \text{such that}\ z^{q}(t)\in\partial\mathcal{X}(q)\}.$$

$$dz(t)=b(q(T_{k}),z(t))\,dt+\sigma(q(T_{k}),z(t))\,dW_{t},$$

$$d\theta(t)=\frac{1}{CR}\left(\theta_{a}-q(t)RP_{rate}-\theta(t)\right)dt+\sigma(q(t))\,dW_{t},$$

$$f(q,\theta)=\begin{cases}0,&\theta\leq\theta_{s}-\delta_{d}/2=:\theta_{-}\\ 1,&\theta\geq\theta_{s}+\delta_{d}/2=:\theta_{+}\\ q,&\text{else,}\end{cases}$$

$$\mathbb{P}(\mathcal{H}\text{ is safe over }[0,s])=\mathbb{P}(Y>s)=1-F_{Y}(s)$$

$$F_{Y}(s)=\mathbb{E}\left(\mathbf{1}_{(-\infty,s]}(Y)\right).$$

$$\mathbb{E}P=\mathbb{E}g(Y),$$

$$z_{\textsf{aux}}=z_{k}+b(q_{k},z_{k})\,\Delta+\sigma(q_{k},z_{k})\,\sqrt{\Delta}\,W_{k}$$

$$\hat{P}=\frac{1}{N}\sum_{i=1}^{N}g(\theta_{i}^{\ell}),$$

$$MSE(\mathcal{A}_{\ell})\equiv\mathbb{E}\left[(\hat{P}-\mathbb{E}P)^{2}\right]=\mathbb{E}\left[(\hat{P}-\mathbb{E}\hat{P})^{2}\right]+\left[\mathbb{E}\hat{P}-\mathbb{E}P\right]^{2}.$$

$$\operatorname{Var}\hat{P}=\operatorname{Var}\left(\frac{1}{N}\sum_{i=1}^{N}g(\theta_{i}^{\ell})\right)=\frac{1}{N^{2}}\operatorname{Var}\left(\sum_{i=1}^{N}g(\theta_{i}^{\ell})\right)=\frac{1}{N}\operatorname{Var}\left(g(\theta^{\ell})\right).$$

$$C_{\ell}(\mathcal{A}_{\ell}):=\mathbb{E}\left[\#\,\text{operations and random number generations to calculate }g(\theta^{\ell})\right],$$

$$C_{\ell}(\mathcal{A}_{\ell})\leq c\cdot\left(MSE(\mathcal{A}_{\ell})\right)^{-\gamma}\cdot\left(-\log MSE(\mathcal{A}_{\ell})\right)^{\eta}.$$

$$\left|\mathbb{E}\left[g(\theta^{\ell})-g(Y)\right]\right|\leq c_{1}2^{-\alpha\cdot\ell},\quad\mathbb{E}[C_{\ell}]\leq c_{2}2^{\zeta\cdot\ell},\quad\text{and}\quad\operatorname{Var}g(\theta^{\ell})<\infty.$$

$$\theta_{k+1}^{\ell}=\theta_{k}^{\ell}+\frac{1}{CR}\left(\theta_{a}-q_{k}^{\ell}RP_{rate}-\theta_{k}^{\ell}\right)\Delta+\sigma(q_{k}^{\ell})\cdot\sqrt{\Delta}\cdot W_{k}^{\ell},$$

$$\mathbb{E}g(\theta^{L})=\mathbb{E}g(\theta^{0})+\sum_{\ell=1}^{L}\mathbb{E}\left[g(\theta^{\ell})-g(\theta^{\ell-1})\right].$$

$$\hat{P}=\sum_{\ell=0}^{L}P^{\ell},$$

$$P^{0}=\frac{1}{N_{0}}\sum_{i=1}^{N_{0}}g(\theta_{i}^{0}),\qquad P^{\ell}=\frac{1}{N_{\ell}}\sum_{i=1}^{N_{\ell}}\left[g(\theta_{i}^{\ell})-g(\theta_{i}^{\ell-1})\right],\quad\ell=1,\dots,L.$$

$$\operatorname{Var}\hat{P}=\operatorname{Var}\left[\sum_{\ell=0}^{L}P^{\ell}\right]=\sum_{\ell=0}^{L}\operatorname{Var}P^{\ell},\qquad\mathbb{E}P-\mathbb{E}\hat{P}=\mathbb{E}P-\mathbb{E}\left[\sum_{\ell=0}^{L}P^{\ell}\right]=\mathbb{E}P-\mathbb{E}g(\theta^{L}).$$

$$\left|\mathbb{E}\left[g(\theta^{\ell})-g(Y)\right]\right|\leq c_{1}2^{-\alpha\ell}\quad\text{and}\quad\mathbb{E}[C_{\ell}]\leq c_{2}2^{\zeta\ell}$$

$$\mathbb{E}[P^{\ell}]=\begin{cases}\mathbb{E}[g(\theta^{0})],&\ell=0\\ \mathbb{E}[g(\theta^{\ell})-g(\theta^{\ell-1})],&\ell>0\end{cases}\quad\text{and}\quad\operatorname{Var}[P^{\ell}]\leq c_{3}\,N_{\ell}^{-1}\,2^{-\beta\,\ell}$$

$$\theta_{k+1}^{\ell,f}=\theta_{k}^{\ell,f}+\frac{1}{CR}\left(\theta_{a}-q_{k}^{\ell,f}RP_{rate}-\theta_{k}^{\ell,f}\right)\Delta_{f}+\sigma(q_{k}^{\ell,f})\cdot\sqrt{\Delta_{f}}\cdot W_{k}^{\ell},$$

$$q_{k+1}^{\ell,f}:=f(q_{k}^{\ell,f},\theta_{k}^{\ell,f}),\quad\text{for all }k=0,1,\dots,n_{f},$$

$$\theta_{k+1}^{\ell,c}=\theta_{k}^{\ell,c}+\frac{1}{CR}\left(\theta_{a}-q_{k}^{\ell,c}RP_{rate}-\theta_{k}^{\ell,c}\right)\Delta_{c}+\sigma(q_{k}^{\ell,c})\cdot\sqrt{\Delta_{c}}\cdot\frac{1}{\sqrt{2}}\cdot\left(W_{2k-1}^{\ell}+W_{2k}^{\ell}\right),$$

$$q_{k+1}^{\ell,c}:=f(q_{k}^{\ell,c},\theta_{k}^{\ell,c}),\quad\text{for all }k=0,1,\dots,n_{c},$$

$$g^{0}(x)=\begin{cases}0,&x>1\\ \frac{1}{2}+\frac{1}{8}(5x^{3}-9x),&-1\leq x\leq 1\\ 1,&x<-1,\end{cases}\quad\text{and}\quad g^{\delta}(x)=g^{0}\left((x-s)/\delta\right),\ x\in\mathbb{R}.$$

$$\mathcal{M}^{\delta,L}_{\mathfrak{S}}=\frac{1}{N_{0}}\cdot\sum_{i=1}^{N_{0}}g^{\delta}(\theta_{i}^{0})+\sum_{\ell=1}^{L}\frac{1}{N_{\ell}}\cdot\sum_{i=1}^{N_{\ell}}\left(g^{\delta}(\theta_{i}^{\ell,f})-g^{\delta}(\theta_{i}^{\ell,c})\right),$$

$$MSE(\mathcal{M}^{\delta,L}_{\mathfrak{S}})\leq\delta^{4}+\left|\mathbb{E}(g^{\delta}(Y))-\mathbb{E}(g^{\delta}(\theta^{L}))\right|^{2}+\operatorname{Var}(\mathcal{M}^{\delta,L}_{\mathfrak{S}})=:e_{1}^{2}+e_{2}^{2}+e_{3}.$$

$$e_{1}\leq a_{1}\varepsilon_{*},\quad e_{2}\leq a_{2}\cdot\varepsilon_{*},\quad e_{3}\leq a_{3}^{2}\cdot\varepsilon_{*}^{2},\quad\text{where}\quad\varepsilon_{*}:=\frac{\varepsilon}{a_{1}+a_{2}+a_{3}}.$$

Taxonomy

Topics: Formal Methods in Verification · Simulation Techniques and Applications · Advanced Control Systems Optimization

Full text

Multilevel Monte Carlo Method for Statistical Model Checking of Hybrid Systems

Sadegh Esmaeil Zadeh Soudjani (1), Rupak Majumdar (1), Tigran Nagapetyan (2)

(1) Max Planck Institute for Software Systems, Kaiserslautern, Germany. Email: {Sadegh,Rupak}@mpi-sws.org

(2) Department of Statistics, University of Oxford, United Kingdom. Email: [email protected]

Abstract

We study statistical model checking of continuous-time stochastic hybrid systems. The challenge in applying statistical model checking to these systems is that one cannot simulate such systems exactly. We employ the multilevel Monte Carlo method (MLMC) and work on a sequence of discrete-time stochastic processes whose executions approximate and converge weakly to that of the original continuous-time stochastic hybrid system with respect to satisfaction of the property of interest. With focus on bounded-horizon reachability, we recast the model checking problem as the computation of the distribution of the exit time, which is in turn formulated as the expectation of an indicator function. This latter computation involves estimating discontinuous functionals, which reduces the bound on the convergence rate of the Monte Carlo algorithm. We propose a smoothing step with tunable precision and formally quantify the error of the MLMC approach in the mean-square sense, which is composed of smoothing error, bias, and variance. We formulate a general adaptive algorithm which balances these error terms. Finally, we describe an application of our technique to verify a model of thermostatically controlled loads.

Keywords:

statistical model checking, formal verification, hybrid systems, continuous-time stochastic processes, multilevel Monte Carlo, reachability analysis

1 Introduction

Continuous-time stochastic hybrid systems (ct-SHS) are a natural model for cyber-physical systems operating under uncertainty [6, 8]. A ct-SHS has a hybrid state space consisting of discrete modes and, for each mode, a set of continuous states (called the invariant). In each mode, the continuous state evolves according to a stochastic differential equation (SDE) in continuous time. Transition from one discrete mode to another may be activated in two ways. The continuous state may hit the boundary of the invariant and make a forced transition according to a discrete stochastic transition kernel. Alternatively, the process may spontaneously change its discrete mode according to a continuous-time Markov chain whose rates depend on the hybrid state.

We consider quantitative analysis of temporal properties of ct-SHS [4, 3]. The fundamental analysis problem, called probabilistic reachability, consists in computing the probability that the state of a ct-SHS exits a given safe set within a given bounded time horizon. Since analytic solutions are not available, there are two common approaches. The first approach is numerical model checking that relies on the exact or approximate computation of the measure of the executions satisfying the temporal property. The second approach, called statistical model checking, relies on finitely many sample executions of the system, and employs hypothesis testing to provide confidence intervals for the estimate of the probability.

Statistical model checking has proven to be computationally more efficient than numerical model checking as it only requires the system to be executable. Thus, it can be applied to larger classes of systems and of specifications [25]. The main underlying assumption in all statistical model checking techniques is the ability to sample from the space of executions of the system. Unfortunately, we cannot compute exact simulations for the general class of ct-SHS due to the process evolution being continuous in both time and space. In this paper, we describe a statistical model checking approach to ct-SHS using the multilevel Monte Carlo (MLMC) method [18, 21], which does not require exact executions of the system.

Our procedure works as follows. First, we formulate the quantitative analysis problem as computing the distribution of the first exit time of the system from the given safe set. Then, we build a sequence of approximate models whose executions converge weakly (or in expectation) to the execution of the concrete system. Although these approximate models can be used separately in the classical setting of statistical model checking in order to compute estimates of the exit time, the MLMC method can take advantage of coupling between approximate executions with different time resolutions to provide better convergence rates.

An important challenge in applying the MLMC technique to the quantitative analysis of ct-SHS is that a discontinuous function is applied to the first exit time. While MLMC can be applied to discontinuous functions, the convergence rates we can guarantee are poor. We propose a smoothing step that replaces the discontinuous function with a continuous approximation and show that the replacement decreases the overall computation cost. Finally, we analyze the asymptotic computational cost of the MLMC approach for a given error bound. We propose an adaptive algorithm which balances errors due to bias, variance, and smoothing, and which tunes the hyperparameters of the algorithm on the fly.

We illustrate our technique on an example model of thermostatically controlled loads.

Related work. Formal definitions of various classes of continuous-time probabilistic hybrid models are presented in [28], together with a comparison. Over such models, [5] has formalized the notion of probabilistic reachability, [29] has proposed a computational technique based on convex optimization, [14] has provided discretization techniques with formal error bounds, and [15] has developed an approach based on satisfiability modulo theory. An alternative approach towards formal, finite approximations of continuous-time stochastic models is discussed in [35] and extended in [34] to switching diffusions. These approaches generally suffer from the curse of dimensionality and are not applicable to high-dimensional models.

For discrete-time stochastic hybrid models probabilistic reachability (and safety) has been fully characterized in [2] and computed via software tools [12, 11] that use finite abstractions. The methods can be extended to more general probabilistic temporal logics [30]. These techniques assume discrete-time dynamics and cannot be extended to ct-SHS.

An overview of statistical model checking techniques can be found in [25, 24, 23]. The paper [10] employs statistical model checking for verifying unbounded temporal properties. The paper [9] has discussed the use of importance sampling to address the issue of rare events in statistical verification of cyber-physical systems. A distributed implementation of statistical model checking is proposed in [7] and a set-oriented method for statistical verification of dynamical systems is presented in [31].

Employing multigrid ideas to reduce the computational complexity (in terms of the expected number of arithmetic operations) of estimating an expected value using Monte Carlo path simulations was initially proposed in [18] in the context of stochastic differential equations. MLMC has a better asymptotic complexity and, by its nature, allows one to build consecutive approximations that balance the bias and variance. The general paradigm, with adequate modifications, has shown significant gains in modeling jump-diffusion SDEs [33] and in fault tolerance applications [27]. A more detailed overview of applications of MLMC can be found in [19]. MLMC for estimating distribution functions is described in the recent paper [17], which is adapted to our setting.

The article is structured as follows. In Section 2, we define the ct-SHS model and the probabilistic reachability problem. In Sections 3 and 4, we discuss the standard Monte Carlo technique and the MLMC method, respectively, and compare their convergence rates. We then discuss two technical modifications: applying a smoothing operator to the discontinuous function of exit time (Section 5) and an adaptive MLMC algorithm for estimating the hyperparameters (Section 6). In Section 7, we provide simulation results for an example.

2 Model Definition

We study statistical model checking for the rich class of continuous-time stochastic hybrid systems (ct-SHS).

2.1 Continuous-Time Stochastic Hybrid Systems

Definition 1

A continuous-time stochastic hybrid system is a tuple $\mathcal{H}=\left(Q,\mathcal{X},b,\sigma,x_{0},r\right)$ where the components are defined as follows.

States

$Q$ is a countable set of discrete states (modes) and $\mathcal{X}:Q\rightarrow\mathcal{P}(\mathbb{R}^{n})$ maps each mode $q\in Q$ to an open set $\mathcal{X}(q)\subseteq\mathbb{R}^{n}$ , called the invariant for the mode $q$ . A state $(q,z)$ with $q\in Q$ and $z\in\mathcal{X}(q)$ is called a hybrid state. The hybrid state space $X$ is defined as

$$X=\{(q,z)\mid q\in Q,\ z\in\mathcal{X}(q)\}.\tag{1}$$

We write $\partial Z$ for the boundary of a set $Z$ and define $\partial X:=\{{(q,z)\mid q\in Q,z\in\partial\mathcal{X}(q)}\}$ .

Evolution

$b:X\rightarrow\mathbb{R}^{n}$ is a vector field and $\sigma:X\rightarrow\mathbb{R}^{n\times m}$ is a matrix-valued function, with $n,m\in\mathbb{N}_{0}$ , where $X$ is the hybrid space defined in (1). For each $q\in Q$ , define the following SDE:

$$dz(t)=b(q,z(t))\,dt+\sigma(q,z(t))\,dW_{t},\tag{2}$$

where $(W_{t},\,\,t\geq 0)$ is an $m$ -dimensional standard Wiener process in a complete probability space. We assume functions $b(q,\cdot):\mathcal{X}(q)\rightarrow\mathbb{R}^{n}$ and $\sigma(q,\cdot):\mathcal{X}(q)\rightarrow\mathbb{R}^{n\times m}$ are bounded and Lipschitz continuous for all $q\in Q$ . The assumption ensures the existence and uniqueness of the solution of the SDEs in (2).

Initial State

$x_{0}\in X$ is the initial state of the system;

Transition Kernel

$r:\partial X\times Q\rightarrow[0,1]$ is a discrete stochastic kernel which governs the switching between the SDEs defined in (2). That is, for all $q\in Q$ , we assume $r(\cdot,q)$ is measurable and, for all $x\in\partial X$ , the function $r(x,\cdot)$ is a discrete probability measure.

Intuitively, an execution of a ct-SHS starts in the initial state $x_{0}$ , and evolves according to the solution of the diffusion process (2) for the current mode until it hits the boundary of the invariant of the current mode for the first time. At this point, a new mode $q^{\prime}$ is chosen according to the transition kernel $r$ and the execution proceeds according to the solution of the diffusion process for $q^{\prime}$ , and so on.

We need the following definitions. Let $z^{q}(t),\,\,q\in Q$ be the solution of diffusion process (2) starting from $z^{q}(0)\in\mathcal{X}(q)$ . Define $t^{\ast}(q)$ as the first exit time of $z^{q}(t)$ from the set $\mathcal{X}(q)$ ,

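$$t^{\ast}(q):=\inf\{t\in\mathbb{R}_{>0}\cup\{\infty\},\ \text{such that}\ z^{q}(t)\in\partial\mathcal{X}(q)\}.\tag{3}$$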

A stochastic hybrid process, describing the evolution of a ct-SHS, is obtained by the concatenation of diffusion processes $\{z^{q}(t),\,\,q\in Q\}$ together with a jumping mechanism given by a family of first exit times $t^{\ast}(q)$ ; we make this formal in Definition 2.

Definition 2

A stochastic process $x(t)=(q(t),z(t))$ is called an execution of ct-SHS $\mathcal{H}$ if there exists a sequence of stopping times $T_{0}=0<T_{1}<T_{2}<\ldots$ such that for all $k\in\mathbb{N}_{0}$ :

•

$x(0)=(q_{0},z_{0})\in X$ is the initial state of $\mathcal{H}$ ;

•

for $t\in[T_{k},T_{k+1})$ , $q(t)=q(T_{k})$ is constant and $z(t)$ is the solution of SDE

$$dz(t)=b(q(T_{k}),z(t))\,dt+\sigma(q(T_{k}),z(t))\,dW_{t},$$

where $W_{t}$ is the $m$ -dimensional standard Wiener process;

•

$T_{k+1}=T_{k}+t^{\ast}(q(T_{k}))$ where $t^{\ast}(q(T_{k}))$ is the first exit time from the mode $q(T_{k})$ as defined in (3);

•

The probability distribution of $q(T_{k+1})$ is governed by the discrete kernel $r((q(T_{k}),z(T_{k+1}^{-})),\cdot)$ and $z(T_{k+1})=z(T_{k+1}^{-})$ , where $z(T_{k+1}^{-}):=\lim_{t\uparrow T_{k+1}}z(t)$ .

Remark 1

For simplicity of exposition, we have put the following restrictions on the ct-SHS model $\mathcal{H}$ in Definition 1. First, the model includes only forced jumps activated by reaching the boundaries of the invariant sets $\partial\mathcal{X}(q),\,q\in Q$ and does not capture spontaneous jumps activated by Poisson processes. Second, the continuous state $z(t)$ remains continuous at the switching times as declared in Definition 2. The approach of this paper is still applicable for ct-SHS models without these restrictions by modifying the time discretization scheme presented in Section 3.

2.2 Example: Thermostatically Controlled Loads

Household appliances such as water boilers/heaters, air conditioners, and electric heaters, all referred to as thermostatically controlled loads (TCLs), can store energy due to their thermal mass. TCLs have been extensively studied [13, 22, 26] for their role in energy management systems. TCLs generally operate within a dead-band around a temperature set-point and are naturally modeled using ct-SHS. The temperature evolution in a cooling TCL can be characterized by the following SDE:

$$d\theta(t)=\frac{1}{CR}\left(\theta_{a}-q(t)RP_{rate}-\theta(t)\right)dt+\sigma(q(t))\,dW_{t},\tag{4}$$

where $\theta_{a}$ is the ambient temperature, $P_{rate}$ is the energy transfer rate of the TCL, and $R$ and $C$ are the thermal resistance and capacitance, respectively. The noise term $W_{t}$ in (4) is a standard Wiener process. The model of the TCL has two discrete modes. When $q(t)=0$ , we say the TCL is in the OFF mode at time $t$ , and when $q(t)=1$ , we say it is in the ON mode.

The temperature of the cooling TCL is regulated by a control signal $q(t^{+})=f(q(t),\theta(t))$ based on discrete switching as

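$$f(q,\theta)=\begin{cases}0,&\theta\leq\theta_{s}-\delta_{d}/2=:\theta_{-}\\ 1,&\theta\geq\theta_{s}+\delta_{d}/2=:\theta_{+}\\ q,&\text{else,}\end{cases}\tag{5}$$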

where $\theta_{s}$ denotes a temperature set-point and $\delta_{d}$ a dead-band. Together, $\theta_{s}$ and $\delta_{d}$ characterize an operating temperature range. The model can be described by the ct-SHS $\mathcal{H}_{TCL}=\left(Q,\mathcal{X},b,\sigma,x_{0},r\right)$ , where

•

$Q=\{0,1\}$ with the invariants $\mathcal{X}(0)=(-\infty,\theta_{+})$ and $\mathcal{X}(1)=(\theta_{-},+\infty)$

•

state space of the model $X=\{0\}\times(-\infty,\theta_{+})\cup\{1\}\times(\theta_{-},+\infty)$

•

$b(q,\theta)=\frac{1}{CR}(\theta_{a}-qRP_{rate}-\theta)$ for all $(q,\theta)\in X$

•

$\sigma(0,\theta)=\sigma(0),\sigma(1,\theta)=\sigma(1)$ for all $(q,\theta)\in X$

•

$r(q^{+}\mid q,\theta)$ is the Kronecker delta with $q^{+}=f(q,\theta)$ .
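The switching rule (5) and the two invariants can be written down directly. The following is a minimal sketch, assuming the set-point and dead-band values of Table 1; the function names `f` and `in_invariant` are illustrative and are not taken from the paper.

```python
# Set-point and dead-band width from Table 1 (degrees Celsius).
THETA_S = 20.0
DELTA_D = 0.5
THETA_MINUS = THETA_S - DELTA_D / 2  # lower switching threshold theta_-
THETA_PLUS = THETA_S + DELTA_D / 2   # upper switching threshold theta_+


def f(q: int, theta: float) -> int:
    """Hysteresis rule (5) for a cooling TCL: 0 = OFF, 1 = ON."""
    if theta <= THETA_MINUS:
        return 0  # cold enough: switch OFF
    if theta >= THETA_PLUS:
        return 1  # too warm: switch ON
    return q      # inside the dead-band: keep the current mode


def in_invariant(q: int, theta: float) -> bool:
    """Membership in X(0) = (-inf, theta_+) or X(1) = (theta_-, +inf)."""
    return theta < THETA_PLUS if q == 0 else theta > THETA_MINUS
```

Inside the dead-band the mode is retained, so the next mode depends on the current mode as well as on the temperature. This is why the discrete mode has to be part of the hybrid state.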

2.3 Problem Definition

For a given random variable defined on the executions of a ct-SHS, we study the problem of estimating its distribution function.

Problem 1

Let $Y$ be a real-valued random variable defined on the executions of ct-SHS $\mathcal{H}$ . Estimate $F_{Y}(s):=\mathbb{P}(Y\leq s)$ , the distribution of $Y$ for a given $s\in\mathbb{R}$ .

Consider a ct-SHS $\mathcal{H}$ with state space $X$ , a safe set $A\subset X$ , assumed to be measurable, and a time interval $[0,s]\subset\mathbb{R}_{\geq 0}$ . The safety problem asks to compute the probability that the executions of $\mathcal{H}$ will stay in $A$ during time interval $[0,s]$ . The safety problem is dual to the reachability problem and has a fundamental role in model checking for ct-SHS. By taking $Y$ in Problem 1 to be the first exit time of the system from $A$ , we reduce the safety problem to Problem 1.

Problem 2 (Probabilistic Safety)

Compute the probability that an execution of the ct-SHS $\mathcal{H}$ , with initial condition $x_{0}\in X$ , remains within a measurable set $A$ during the bounded time horizon $[0,s]$ :

$$\mathbb{P}(\mathcal{H}\text{ is safe over }[0,s])=\mathbb{P}(Y>s)=1-F_{Y}(s)$$

where $Y:=\min\{t\in\mathbb{R}_{\geq 0}\cup\{\infty\}\,|\,x(t)\notin A,x(0)=x_{0}\}$ and $F_{Y}(s)=\mathbb{P}(Y\leq s)$ .

Remark 2

The random variable $Y$ defined in Problem 2 is in fact the first exit time of the system $\mathcal{H}$ from the safe set $A$ and its distribution can be represented as the expectation of an indicator functional:

$$F_{Y}(s)=\mathbb{E}\left(\mathbf{1}_{(-\infty,s]}(Y)\right).$$

Problem 3 (Specification of interest for TCL)

Although the switching mechanism (5) is designed to keep the temperature inside the interval $[\theta_{-},\theta_{+}]$ , there is still a chance that the temperature goes out of this interval due to the Wiener process $W_{t}$ . Define a random variable $Y=\max\left\{\theta_{t},\,t\in[0,s]\right\}$ . We aim to estimate the probability $\mathbb{P}(Y\leq\theta_{+}+0.1\cdot\delta_{d})$ .

Analytic solution of Problems 1-3 is infeasible for the class of ct-SHS. Numerical computation of the solution has been investigated for restrictive subclasses of ct-SHS [32, 1]. In this work, we propose an approximate computation technique with a confidence bound. Our technique, based on MLMC, substantially improves on the computational complexity of the standard Monte Carlo method. We first discuss the standard Monte Carlo (SMC) method in Section 3 and then present the MLMC method in Section 4.

3 Standard Monte Carlo Method

In order to compute the quantities of interest in Problems 1-2 we need to estimate

$$\mathbb{E}P=\mathbb{E}g(Y),$$

where $Y$ is a function of the execution of ct-SHS $\mathcal{H}$ , $g:\mathbb{R}\rightarrow\mathbb{R}$ is the indicator function over the interval $(-\infty,s]$ and $P:=g(Y)$ is a one-dimensional random variable. Exact executions of $\mathcal{H}$ , and thus exact samples of $Y$ , are not available in general, but it is possible to construct approximate executions and approximate samples that converge to the exact ones in a suitable sense.

Alg. 1 presents a state update routine based on the Euler-Maruyama method that can be used to construct approximate executions. Given the model $\mathcal{H}$ and the current approximate state $(q_{k},z_{k})$ , this algorithm computes the approximate state $(q_{k+1},z_{k+1})$ for the next time step of size $\Delta$ . Equation (8) in step 1 of the algorithm is the Euler-Maruyama approximation of the SDE (2). If $z_{\textsf{aux}}$ is still inside the invariant of the current mode $\mathcal{X}(q_{k})$ , then the mode remains unchanged and $z_{\textsf{aux}}$ will be the next state (steps 2-3). Otherwise, in steps 5-6 $z_{\textsf{aux}}$ is projected onto the boundary $\partial\mathcal{X}({q_{k}})$ of the invariant and the mode is updated according to the discrete kernel $r(q_{k},z_{k+1})$ .
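The update routine described above can be sketched as follows. This is an illustration under simplifying assumptions, not the paper's Alg. 1 verbatim: the continuous state is scalar, each invariant is an open interval `(lo, hi)`, and the kernel is passed as a callable that returns the next mode. All argument names are hypothetical.

```python
import math
from typing import Callable, Tuple


def update(
    b: Callable[[int, float], float],
    sigma: Callable[[int, float], float],
    invariant: Callable[[int], Tuple[float, float]],
    kernel: Callable[[int, float], int],
    q: int,
    z: float,
    delta: float,
    w: float,
) -> Tuple[int, float]:
    """One approximate step of size delta from the hybrid state (q, z).

    w is a draw from the standard normal distribution.
    """
    lo, hi = invariant(q)
    # Euler-Maruyama step for the SDE of the current mode.
    z_aux = z + b(q, z) * delta + sigma(q, z) * math.sqrt(delta) * w
    if lo < z_aux < hi:
        return q, z_aux  # still inside the invariant: mode unchanged
    # Forced jump: project onto the boundary, then choose the next mode.
    z_next = min(max(z_aux, lo), hi)
    return kernel(q, z_next), z_next
```

For a stochastic kernel, `kernel` would draw the next mode at random; a deterministic callable is enough for models such as the TCL, where the kernel is a Kronecker delta.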

Alg. 2 generates approximate executions of $\mathcal{H}$ and approximate samples of $Y$ using Alg. 1. The algorithm requires the model $\mathcal{H}$ , the definition of $Y$ as a function of the execution of $\mathcal{H}$ , and the time interval $[0,s]$ . The output of the algorithm $\theta^{\ell}$ is an approximate sample of random variable $Y$ . In steps 1-2 the number of time steps $n$ is selected and the discretization time step $\Delta$ is computed. In order to highlight the dependency of the algorithm on the parameter $n$ , we have opted to use $\ell$ in the representation $n=\kappa 2^{\ell}$ as the superscript of the variables. We call $\ell$ the level of approximation, which connects to the MLMC terminology discussed in Section 4.

Alg. 2 initializes the approximate execution in step 3 as $x^{\ell}_{0}:=(q_{0}^{\ell},z_{0}^{\ell})$ according to $x_{0}$ the initial state of $\mathcal{H}$ . Then the algorithm iteratively computes the next approximate state $(q_{k+1}^{\ell},z_{k+1}^{\ell})$ by sampling from the $m$ -dimensional standard normal distribution in step 5 and applying Alg. 1 to $(\mathcal{H},q_{k}^{\ell},z_{k}^{\ell},\Delta,W_{k}^{\ell})$ in step 6. Finally, step 9 constructs the continuous-time approximate execution $(q^{\ell}(\cdot),z^{\ell}(\cdot))$ as the piecewise constant version of the discrete execution $(q_{k}^{\ell},z_{k}^{\ell})$ , which enables the computation of $\theta^{\ell}$ by applying the definition of $Y$ to $(q^{\ell}(\cdot),z^{\ell}(\cdot))$ (step 10).
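The sampling loop just described can be sketched as below. It is a simplified illustration with a scalar continuous state; `sample_theta`, `toy_step` and `running_max` are hypothetical names, and `toy_step` is a single-mode stand-in for the update routine rather than a model from the paper.

```python
import math
import random
from typing import Callable, List, Optional, Tuple

State = Tuple[int, float]


def sample_theta(
    step: Callable[[int, float, float, float], State],
    x0: State,
    s: float,
    level: int,
    functional: Callable[[List[State], float], float],
    kappa: int = 1,
    rng: Optional[random.Random] = None,
) -> float:
    """Return one approximate sample theta^level of Y over [0, s]."""
    rng = rng or random.Random()
    n = kappa * 2 ** level          # number of time steps
    delta = s / n                   # discretization step
    q, z = x0
    path = [(q, z)]
    for _ in range(n):
        w = rng.gauss(0.0, 1.0)     # standard normal draw
        q, z = step(q, z, delta, w)
        path.append((q, z))
    # Y is evaluated on the piecewise-constant path given by the grid values.
    return functional(path, delta)


def toy_step(q: int, z: float, delta: float, w: float) -> State:
    """Stand-in update: single mode, dz = -z dt + 0.5 dW, no boundary."""
    return q, z - z * delta + 0.5 * math.sqrt(delta) * w


def running_max(path: List[State], delta: float) -> float:
    """Example functional: maximum of the continuous state along the path."""
    return max(z for _, z in path)
```

Each increment of `level` doubles the number of steps, which is the source of the exponential cost growth assumed later in Theorem 3.1.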

Alg. 2 is parameterized by $\ell$ . Due to the nature of the Euler-Maruyama method in (8), we expect that the approximate samples $\theta^{\ell}$ converge to $Y$ as $\ell\rightarrow\infty$ in a suitable way. In fact, it is an unbiased estimator in the limit: $\lim_{\ell\to\infty}\mathbb{E}g\left(\theta^{\ell}\right)=\mathbb{E}g\left(Y\right).$ The idea behind standard Monte Carlo (SMC) method is to use the empirical mean of $g\left(\theta^{\ell}\right)$ as an approximation of $\mathbb{E}g\left(Y\right)$ . The SMC estimator has the form

$$\hat{P}=\frac{1}{N}\sum_{i=1}^{N}g(\theta_{i}^{\ell}),$$

which is based on $N$ replications of $\theta^{\ell}$ . The replications $\{\theta_{i}^{\ell},i=1,\ldots,N\}$ can be generated by running Alg. 2 (with a fixed $\ell$ ) $N$ times, or running any other algorithm that generates such samples (cf. Alg. 4 in Section 4). The SMC method is summarized in Alg. 3, which approximates $\mathbb{E}g(Y)$ based on a general sampling algorithm ${\mathcal{A}}_{\ell}$ . Note that Alg. 3 can be used for estimating $\mathbb{E}g(Y)$ not only with $g(\cdot)$ being the indicator function but also any other functional that can be deterministically evaluated using the executions over the time interval $[0,s]$ .
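As a rough illustration of the SMC estimator, the sketch below averages $g$ over $N$ draws from a generic sampler. The sampler here is a stand-in chosen only because its distribution function is known in closed form (exact draws from an Exp(1) law); it is an assumption for the example and is not a ct-SHS simulation. The names `smc_estimate`, `indicator_up_to` and `toy_sampler` are hypothetical.

```python
import math
import random
from typing import Callable


def smc_estimate(
    sampler: Callable[[random.Random], float],
    g: Callable[[float], float],
    n_samples: int,
    rng: random.Random,
) -> float:
    """Empirical mean of g over n_samples independent draws of the sampler."""
    return sum(g(sampler(rng)) for _ in range(n_samples)) / n_samples


def indicator_up_to(s: float) -> Callable[[float], float]:
    """Indicator function of the interval (-inf, s]."""
    return lambda y: 1.0 if y <= s else 0.0


def toy_sampler(rng: random.Random) -> float:
    """Stand-in for the sampling algorithm: exact Exp(1) draws."""
    return rng.expovariate(1.0)


estimate = smc_estimate(toy_sampler, indicator_up_to(1.0), 20_000, random.Random(0))
exact = 1.0 - math.exp(-1.0)  # F_Y(1) for the Exp(1) law
```

Because the stand-in sampler is exact, only the Monte Carlo variance contributes to the error here; with an approximate sampler at level $\ell$ the estimate would also carry a discretization bias.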

Owing to the randomized nature of algorithm ${\mathcal{A}}_{\ell}$ embedded in Alg. 3, we quantify the quality of its outcome using the mean squared error, where, with a slight abuse of notation, $MSE({\mathcal{A}}_{\ell})$ denotes the mean squared error of Alg. 3 with the embedded sampling algorithm ${\mathcal{A}}_{\ell}$ :

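$$MSE(\mathcal{A}_{\ell})\equiv\mathbb{E}\left[(\hat{P}-\mathbb{E}P)^{2}\right]=\mathbb{E}\left[(\hat{P}-\mathbb{E}\hat{P})^{2}\right]+\left[\mathbb{E}\hat{P}-\mathbb{E}P\right]^{2}.\tag{10}$$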

The mean square error $MSE({\mathcal{A}}_{\ell})$ is decomposed into two parts: Monte Carlo variance and squared bias error. The latter is a systematic error arising from the fact that we might not sample our random variable exactly, but rather use a suitable approximation, while the former error comes from the randomized nature of the Monte Carlo algorithm. The Monte Carlo variance (first term in (10)) is proportional to $N^{-1}$ as

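$$\operatorname{Var}\hat{P}=\operatorname{Var}\left(\frac{1}{N}\sum_{i=1}^{N}g(\theta_{i}^{\ell})\right)=\frac{1}{N^{2}}\operatorname{Var}\left(\sum_{i=1}^{N}g(\theta_{i}^{\ell})\right)=\frac{1}{N}\operatorname{Var}\left(g(\theta^{\ell})\right).$$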
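For completeness, the decomposition (10) follows by adding and subtracting $\mathbb{E}\hat{P}$ inside the square. The step is spelled out below; the cross term vanishes because $\hat{P}-\mathbb{E}\hat{P}$ has zero mean. The $N^{-1}$ scaling of the variance additionally uses that the replications are independent and identically distributed.

```latex
\begin{aligned}
\mathbb{E}\big[(\hat{P}-\mathbb{E}P)^{2}\big]
&=\mathbb{E}\big[\big((\hat{P}-\mathbb{E}\hat{P})+(\mathbb{E}\hat{P}-\mathbb{E}P)\big)^{2}\big]\\
&=\mathbb{E}\big[(\hat{P}-\mathbb{E}\hat{P})^{2}\big]
 +2\,(\mathbb{E}\hat{P}-\mathbb{E}P)\,
 \underbrace{\mathbb{E}\big[\hat{P}-\mathbb{E}\hat{P}\big]}_{=0}
 +(\mathbb{E}\hat{P}-\mathbb{E}P)^{2}.
\end{aligned}
```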

The cost of Alg. 3 is typically taken to be the expected runtime in order to achieve a prescribed accuracy $\mathit{MSE}({\mathcal{A}}_{\ell})\leq\varepsilon$ . A more convenient approach for theoretical comparison between different methods is to consider the cost associated to sampling algorithm ${\mathcal{A}}_{\ell}$ ,

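$$C_{\ell}(\mathcal{A}_{\ell}):=\mathbb{E}\left[\#\,\text{operations and random number generations to calculate }g(\theta^{\ell})\right],$$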

which facilitates the definition of convergence rate of the algorithm.

Definition 3

We say that Alg. 3 based on sampling algorithm ${\mathcal{A}}_{\ell}$ converges with rate $\gamma>0$ if $\lim\limits_{\ell\to\infty}\sqrt{MSE\left({\mathcal{A}}_{\ell}\right)}=0$ and if there exist constants $c>0,\,\eta\geq 0$ such that

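$$C_{\ell}(\mathcal{A}_{\ell})\leq c\cdot\left(MSE(\mathcal{A}_{\ell})\right)^{-\gamma}\cdot\left(-\log MSE(\mathcal{A}_{\ell})\right)^{\eta}.\tag{11}$$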

Remark 3

The definition of convergence rate in (11) indicates that, for a desired accuracy $MSE\left({\mathcal{A}}_{\ell}\right)\leq\varepsilon$ , a smaller convergence rate $\gamma$ implies a lower computational cost $C_{\ell}\left({\mathcal{A}}_{\ell}\right)$ .

The following theorem presents the convergence rate of the SMC method presented in Alg. 3.

Theorem 3.1

Let $\theta^{\ell}$ denote the numerical approximation of the random variable $Y$ according to an algorithm ${\mathcal{A}}_{\ell}$ . Assume there exist positive constants $\alpha,\zeta,c_{1},c_{2}$ such that for all $\ell\in\mathbb{N}_{0}$

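$$\left|\mathbb{E}\left[g(\theta^{\ell})-g(Y)\right]\right|\leq c_{1}2^{-\alpha\cdot\ell},\quad\mathbb{E}[C_{\ell}]\leq c_{2}2^{\zeta\cdot\ell},\quad\text{and}\quad\operatorname{Var}g(\theta^{\ell})<\infty.\tag{12}$$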

Then the standard Monte Carlo method of Alg. 3 based on sampling algorithm ${\mathcal{A}}_{\ell}$ converges with rate $\gamma=2+\cfrac{\zeta}{\alpha}$ .
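A sketch of why the rate takes this form is given below. It rests on two assumptions that go beyond the statement above: the accuracy target is read as a root-mean-square error $\varepsilon$ (so that the mean squared error is at most $\varepsilon^{2}$), and $\operatorname{Var}g(\theta^{\ell})$ is bounded by a constant $v$ uniformly in $\ell$ (which holds with $v=1/4$ when $g$ is an indicator). Constants are not tracked sharply.

```latex
\begin{aligned}
&\text{bias: } c_{1}2^{-\alpha\ell}\leq\varepsilon/\sqrt{2}
 \;\Longleftarrow\;
 \ell=\left\lceil\tfrac{1}{\alpha}\log_{2}\!\big(\sqrt{2}\,c_{1}/\varepsilon\big)\right\rceil
 \;\Longrightarrow\;
 2^{\ell}\leq 2\,\big(\sqrt{2}\,c_{1}/\varepsilon\big)^{1/\alpha},\\
&\text{variance: } v/N\leq\varepsilon^{2}/2
 \;\Longleftarrow\;
 N=\left\lceil 2v/\varepsilon^{2}\right\rceil,\\
&\text{total cost: } N\cdot\mathbb{E}[C_{\ell}]\leq N\,c_{2}\,2^{\zeta\ell}
 =O\big(\varepsilon^{-2}\big)\cdot O\big(\varepsilon^{-\zeta/\alpha}\big)
 =O\big(\varepsilon^{-(2+\zeta/\alpha)}\big).
\end{aligned}
```

The exponent $2$ comes from the number of samples needed to control the variance, and $\zeta/\alpha$ from the per-sample cost of the level needed to control the bias.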

Remark 4

Recall the role of $\ell$ in step 2 of Alg. 2. Increasing $\ell$ results in an exponential increase in the number of time steps thus also in the number of samples. Therefore we have assumed in (12) an exponential bound on the increased cost and an exponential bound in the decreased bias as a function of $\ell$ .

Application to the TCL Case Study. We construct the approximate discrete-time executions as

[TABLE]

where $W^{\ell}_{k}$ is a sample from the standard normal distribution, $\Delta=s/n$ , $n=\kappa 2^{\ell}$ , and the discrete mode at any level $\ell$ is defined as $q^{\ell}_{k+1}:=f(q^{\ell}_{k},\theta_{k}^{\ell})$ with $f(\cdot)$ defined in (5). This discrete-time update differs slightly from the Update function of Alg. 1 and can be interpreted as follows: instead of updating the mode continuously, the control signal acts as a digital controller and updates the mode only at the discrete time steps. The cost of simulating one execution of (13) is proportional to the number of discretization steps, which sets the parameter $\zeta=1$ in Theorem 3.1.
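A discrete-time execution of this kind can be sketched as an Euler-type scheme in which the mode is refreshed only at grid points. The code below is an illustration under stated assumptions: the drift, the switching rule `mode_update`, and all numerical defaults (`rate`, `theta_amb`, `gain`, `sigma`, the thresholds) are hypothetical stand-ins, not the model (4)-(5) or the values of Table 1. Only the structure follows the text: $n=\kappa 2^{\ell}$ steps of size $\Delta=s/n$, standard normal noise, and $q_{k+1}=f(q_{k},\theta_{k})$.

```python
import math
import random

def mode_update(q, theta, theta_lo=19.5, theta_hi=20.5):
    """Hypothetical thermostat rule: 0 = OFF, 1 = ON (cooling)."""
    if theta < theta_lo:
        return 0
    if theta > theta_hi:
        return 1
    return q

def sample_max_temperature(level, rng, kappa=4, horizon=1.0, theta0=20.0, q0=0,
                           rate=0.5, theta_amb=32.0, gain=28.0, sigma=0.2):
    """Return max_k theta_k along one approximate execution with kappa * 2^level steps."""
    n = kappa * 2 ** level
    dt = horizon / n
    theta, q, running_max = theta0, q0, theta0
    for _ in range(n):
        w = rng.gauss(0.0, 1.0)
        # Placeholder drift: relaxation towards ambient, reduced when cooling is ON.
        drift = rate * (theta_amb - q * gain - theta)
        theta_next = theta + drift * dt + sigma * math.sqrt(dt) * w
        # The mode is updated from (q_k, theta_k), only at the grid point.
        q = mode_update(q, theta)
        theta = theta_next
        running_max = max(running_max, theta)
    return running_max
```

The loop length is $\kappa 2^{\ell}$, so the work per execution doubles with each level, which is the behaviour summarized by $\zeta=1$.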

The values of the constants $\alpha,\zeta,c_{1},c_{2}$ in Theorem 3.1 depend on the regularity of the functional $g$ , the sampling algorithm ${\mathcal{A}}_{\ell}$ , and other parameters. In the next section we propose to use the MLMC method, which improves the convergence rate and substantially reduces the computational complexity of the estimation. In Section 5 we discuss a smoothing step that replaces the indicator function $g(\cdot)$ with a smoothed function, together with its effect on the algorithm’s error.

4 Multilevel Monte Carlo Method

The multilevel Monte Carlo method (MLMC) relies on a simple telescoping-sum identity for the expectation:

[TABLE]

where $\theta^{0}$ and $\theta^{L}$ correspond respectively to the coarsest and finest levels of numerical approximation. While any of the approximations $\{\theta^{0},\theta^{1},\ldots,\theta^{L}\}$ can be used individually in Alg. 3 to approximate $Y$ , instead, the MLMC method independently estimates each of the expectations on the right-hand side of (14) such that the overall variance is minimized for a given computational cost. The estimator $\hat{P}$ of $\mathbb{E}g\left(\theta^{L}\right)$ can be seen as a sum of independent estimators

[TABLE]

where $P^{0}$ is an estimator for $\mathbb{E}g\left(\theta^{0}\right)$ based on $N_{0}$ samples, and $P^{\ell}$ are estimators for $\mathbb{E}\left[g\left(\theta^{\ell}\right)-g\left(\theta^{\ell-1}\right)\right]$ based on $N_{\ell}$ samples. As we saw in the SMC method of Section 3, the simplest forms for $P^{0}$ and $P^{\ell}$ are the empirical means over all samples:

[TABLE]

Using the assumption of independent estimators $\{P^{0},P^{1},P^{2},\ldots,P^{L}\}$ and employing the telescoping sum (14), we can compute the variance and the bias of $\hat{P}$ , respectively, as

[TABLE]

The computation of $P^{\ell}$ in (16) requires the samples $\theta^{\ell}_{i},\theta^{\ell-1}_{i}$ to be generated from a common probability space. We utilize the fact that a sum of independent normal random variables is again normally distributed. Alg. 4 presents the generation of approximate coupled samples $\theta^{\ell}_{i},\theta^{\ell-1}_{i}$ for the random variable $Y$ defined on the execution of a ct-SHS $\mathcal{H}$ . As can be seen in steps 6-7 and 11, the approximate execution for the finer level $\ell$ is constructed exactly as in Alg. 2 with $n_{f}=\kappa 2^{\ell}$ time steps. The construction of the approximate execution for the coarser level $(\ell-1)$ with $n_{c}=\kappa 2^{\ell-1}$ time steps is similar, except that the noise term in step 8 is obtained as the weighted sum $(W_{2k}^{\ell}+W_{2k+1}^{\ell})/\sqrt{2}$ of noise terms from the finer level. This choice preserves the properties of each approximation level while coupling the executions of levels $\ell-1$ and $\ell$ , and thus also the approximate samples $\theta^{\ell-1},\theta^{\ell}$ .
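The noise coupling described above can be checked in a few lines. This is a minimal sketch, with the function name `coupled_noise` and the 0-based indexing chosen for illustration: each coarse noise term is built from two consecutive fine ones, so it is again standard normal, and the coarse Brownian increment $\sqrt{\Delta_{c}}\,W^{c}_{k}$ with $\Delta_{c}=2\Delta_{f}$ equals the sum of the two fine increments.

```python
import math
import random

def coupled_noise(n_coarse, rng):
    """Draw 2 * n_coarse fine-level normals and derive the coupled coarse-level normals."""
    fine = [rng.gauss(0.0, 1.0) for _ in range(2 * n_coarse)]
    # The weighted sum of two independent N(0, 1) samples is again N(0, 1).
    coarse = [(fine[2 * k] + fine[2 * k + 1]) / math.sqrt(2.0)
              for k in range(n_coarse)]
    return fine, coarse
```

Because both levels are driven by the same underlying noise, the fine and coarse executions stay close to each other, which is what makes the variance of $g(\theta^{\ell})-g(\theta^{\ell-1})$ small.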

Now we are ready to present the MLMC method in Alg. 5. The method is parameterized by the number of levels $L$ , the number of samples $N_{\ell}$ for each level $\ell=0,1,\ldots,L$ (gathered in $\mathfrak{S}$ ), and the initial number of time steps $\kappa$ . Steps 2-3 perform the SMC method of Alg. 3 with the embedded sampling algorithm Alg. 2 in order to estimate $\mathbb{E}g(\theta^{0})$ with $N_{0}$ samples at the initial level $\ell=0$ . Then the algorithm iteratively estimates $\mathbb{E}[g(\theta^{\ell})-g(\theta^{\ell-1})]$ in steps 6-7 using Alg. 3 with number of samples $N=N_{\ell}$ and with the embedded coupled sampling algorithm Alg. 4. The sum of the estimated quantities is reported in step 10 as the estimate of $\mathbb{E}g(Y)$ .
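The structure of the estimator (one plain empirical mean at level 0, plus empirical means of coupled differences at the higher levels) can be sketched on a toy problem. The example below is not Alg. 5 itself: it uses a hypothetical scalar SDE $dX=-X\,dt+\sigma\,dW$ with the terminal value as the random variable, chosen only because the expected Euler value $x_{0}(1-\Delta)^{n}$ is known in closed form and the telescoping sum can therefore be checked. All function and parameter names are illustrative assumptions.

```python
import math
import random

def sample_level_term(level, rng, g, kappa=2, horizon=1.0, x0=1.0, sigma=0.5):
    """One sample of g(theta^0) at level 0, or of g(theta^l) - g(theta^(l-1)) otherwise."""
    n_fine = kappa * 2 ** level
    dt = horizon / n_fine
    if level == 0:
        x = x0
        for _ in range(n_fine):
            x += -x * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        return g(x)
    x_fine = x_coarse = x0
    for _ in range(n_fine // 2):
        w0 = rng.gauss(0.0, 1.0)
        w1 = rng.gauss(0.0, 1.0)
        # Two fine Euler steps of size dt.
        x_fine += -x_fine * dt + sigma * math.sqrt(dt) * w0
        x_fine += -x_fine * dt + sigma * math.sqrt(dt) * w1
        # One coarse Euler step of size 2 * dt driven by the combined noise.
        w_coarse = (w0 + w1) / math.sqrt(2.0)
        x_coarse += -x_coarse * 2.0 * dt + sigma * math.sqrt(2.0 * dt) * w_coarse
    return g(x_fine) - g(x_coarse)

def mlmc_estimate(g, n_samples, rng, **model):
    """Sum of independent per-level empirical means; n_samples[l] plays the role of N_l."""
    estimate = 0.0
    for level, n in enumerate(n_samples):
        estimate += sum(sample_level_term(level, rng, g, **model) for _ in range(n)) / n
    return estimate
```

A call such as `mlmc_estimate(lambda x: x, [4000, 2000, 1000, 500], random.Random(1))` uses fewer samples on the expensive fine levels, where the coupled differences have small variance, and more on the cheap coarse level.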

The next theorem gives the convergence rate of the MLMC method presented in Alg. 5.

Theorem 4.1

Let $\theta^{\ell}$ denote the level $\ell$ numerical approximation of the random variable $Y$ . Assume the independent estimators $P^{\ell}$ used in Alg. 5 satisfy

[TABLE]

for positive constants $\alpha,\beta,\zeta,c_{1},c_{2},c_{3}$ with $\alpha\!\geq\!{\textstyle\frac{1}{2}}\,\min(\beta,\zeta)$ . Then the MLMC method in Alg. 5 converges with rate $2+\cfrac{\max(\zeta\!-\!\beta,0)}{\alpha}.$

Assumptions in (17) are exactly the same as the ones used in Theorem 3.1. Assumptions in (20) restrict the statistical properties of the estimators $P^{\ell}$ : the first enables us to use the telescoping property (14), and the second ensures that the variance decays exponentially as a function of the level $\ell$ . Compared with the convergence rate of the SMC method in Theorem 3.1, the improvement is due to the non-zero factor $\beta$ , which is the decay rate of the variance of the estimators.
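The two rates can be compared directly. The helper functions below simply evaluate the expressions stated in Theorems 3.1 and 4.1; the function names and the numerical values in the usage note are illustrative and are not estimates from the paper.

```python
def smc_rate(alpha, zeta):
    """Convergence rate of the standard Monte Carlo method (Theorem 3.1)."""
    return 2.0 + zeta / alpha

def mlmc_rate(alpha, beta, zeta):
    """Convergence rate of the MLMC method (Theorem 4.1)."""
    # The theorem requires alpha >= min(beta, zeta) / 2.
    if alpha < 0.5 * min(beta, zeta):
        raise ValueError("Theorem 4.1 requires alpha >= min(beta, zeta) / 2")
    return 2.0 + max(zeta - beta, 0.0) / alpha
```

For example, with $\alpha=\beta=1/2$ and $\zeta=1$ the rate drops from $4$ to $3$, and whenever $\beta\geq\zeta$ the MLMC rate equals $2$, the rate of Monte Carlo sampling from an exactly simulable model.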

Application to the TCL Case Study. We construct the approximate discrete-time executions for the finer level $\ell$ as

[TABLE]

where $W^{\ell}_{k}$ is a sample from the standard normal distribution, $\Delta_{f}=s/n_{f}$ , $n_{f}=\kappa 2^{\ell}$ , and $f(\cdot)$ is defined in (5). The coupling, which means that we obtain the dynamics for $\theta^{\ell,c}$ from the increments for $\theta^{\ell,f}$ , is done in the following way:

[TABLE]

where $\Delta_{c}=s/n_{c}$ with $n_{c}=\kappa 2^{\ell-1}$ . Using the same Brownian increments $W^{\ell}_{2k-1},W^{\ell}_{2k}$ from the finer level (21) in the coarser level (22) is what yields a nonzero value of $\beta$ in Theorem 4.1. The cost of simulating one approximate execution in (21)-(22) is proportional to the number of discretization steps, which sets the parameter $\zeta=1$ in Theorems 3.1 and 4.1.

Now that we have set up the MLMC method and the coupling technique that improves the convergence rate of the estimation, we focus on the following important problems associated with the approach:

1. The discontinuity of the functional $g(Y)=1_{(-\infty,s]}(Y)$ leads to smaller values of $\alpha$ and $\beta$ in Theorem 4.1. This results in a larger convergence rate $\gamma$ and thus a larger computational cost for a given accuracy $\varepsilon$ .

2. The optimal choice of parameters $N_{\ell}$ , $L$ and the unknown constants in Theorem 4.1.

The first issue, discussed in Section 5, is resolved through smoothing, which replaces the discontinuous function $g$ with a smoothed function $g^{\delta}$ with Lipschitz constant proportional to $\delta^{-1}$ . The second issue, discussed in Section 6, is resolved through an adaptive algorithm. This adaptive algorithm follows [20], and combines the smoothing of discontinuous functionals and the MLMC method. Note that we require an updated set of assumptions and include the search for parameter $\delta$ into the adaptive algorithm.

5 MLMC with Smoothed Indicator Function

The smoothing is based on functions $g^{\delta}:\mathbb{R}\to\mathbb{R}$ , which are rescaled translates of a function $g^{0}:\mathbb{R}\to\mathbb{R}$ of the form

[TABLE]

Since we add a smoothing step, we need to update the MLMC estimator (15), derive a new MSE decomposition (instead of (10)) that incorporates the error due to the smoothing, and update Assumptions (17)-(20) in Theorem 4.1.

Note that function (23) is not the only possible choice for a smoothing function (see [20]), but in our experience it is the easiest to implement and is numerically stable, while still providing significant gains in computational cost.
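To make the idea of a rescaled, translated smoothing function concrete, the sketch below replaces the indicator $1_{(-\infty,s]}$ by a cubic "smoothstep" transition of width $2\delta$ centred at the threshold. This is a swapped-in surrogate for illustration only and is not the function (23); the names `smoothed_indicator` and `threshold` are assumptions. It does share the one property used in the text: its Lipschitz constant is proportional to $\delta^{-1}$ (here $0.75/\delta$).

```python
def smoothed_indicator(y, threshold, delta):
    """Cubic surrogate for the indicator of (-inf, threshold], transition width 2 * delta."""
    if delta <= 0.0:
        raise ValueError("delta must be positive")
    # Map the transition interval [threshold - delta, threshold + delta] onto [0, 1].
    t = (y - (threshold - delta)) / (2.0 * delta)
    if t <= 0.0:
        return 1.0
    if t >= 1.0:
        return 0.0
    # Smoothstep 3t^2 - 2t^3 rises from 0 to 1; subtract from 1 to get a falling edge.
    return 1.0 - (3.0 * t * t - 2.0 * t * t * t)
```

As $\delta\to 0$ the surrogate approaches the indicator, but its maximal slope grows like $\delta^{-1}$, which is the trade-off between smoothing error and variance that the adaptive choice of $\delta$ has to balance.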

Recall that the MLMC method is based on a sequence $(\theta^{\ell})_{\ell\in\mathbb{N}_{0}}$ of random variables, defined on a common probability space together with $Y$ . The new MLMC method that includes smoothing is defined by

[TABLE]

with an independent family of $\mathbb{R}^{2}$ -valued random variables $(\theta^{\ell,f}_{i},\theta^{\ell,c}_{i})$ for $i=1,\dots,N_{\ell}$ and $\ell=0,1,\dots,L$ such that equality in distribution holds for $(\theta^{\ell,f}_{i},\theta^{\ell,c}_{i})$ and $(\theta^{\ell},\theta^{\ell-1})$ , where we used the notation $(\theta^{0,f}_{i},\theta^{0,c}_{i})=(\theta^{0}_{i},0)$ for the initial level $\ell=0$ . Note that (24) is the same as the MLMC estimator (15) except that it uses the smoothing function $g^{\delta}(\cdot)$ instead of the indicator function $g(\cdot)$ . The next theorem gives the mean square error decomposition for (24).

Theorem 5.1

For $\delta>0$ , the error of ${\mathcal{M}}^{\delta,L}_{\mathfrak{S}}$ in (24) with smoothing function (23) can be decomposed as

[TABLE]

The error terms in (25) are related to smoothing, bias, and variance, respectively. Note that as $\delta$ goes to zero, the Lipschitz constant for $g^{\delta}(x)$ goes to infinity, which has to be taken into account. Hence the assumptions in Theorem 4.1 have to be updated. The theoretical analysis and updated assumptions are presented in [20].

6 Adaptive MLMC Algorithm

In this section we present an adaptive algorithm to find the optimal parameters for the MLMC method. For a given $\varepsilon>0$ we wish to select the parameters of the MLMC algorithm such that its error is at most $\varepsilon$ and its cost is as small as possible. Our approach to the selection of the replication numbers and of the maximal level follows [19].

The adaptive algorithm assumes no prior knowledge of the smoothing parameter $\delta$ , nor of how the bias and variance depend on it. The smoothing parameter $\delta$ is chosen from the discrete set of values $\delta_{m}=1/2^{m},$ where $m\in\mathbb{N}$ . With a slight abuse of notation we put $g^{m}=g^{\delta_{m}}.$ In order to achieve $MSE({\mathcal{M}}^{\delta,L}_{\mathfrak{S}})\leq\varepsilon$ we have to assign certain proportions of $\varepsilon$ to the three sources of the error introduced in (25). Specifically we wish to choose the parameters of our algorithm such that

[TABLE]

The MLMC algorithm is parameterized by the value $m$ for smoothing $\delta_{m}=1/2^{m}$ , the values of the maximal level $L$ , and the replication numbers $\mathfrak{S}=\left(N_{0},\ldots,N_{L}\right)$ . We always select $L\geq 2$ and $N_{\ell}\geq 100$ for $\ell=0,\dots,L$ . By the latter, we ensure a reasonable accuracy in certain estimates to be introduced below. We use $y_{i,0}$ to denote actual samples of the random variable $\theta^{0}$ and $(y_{i,\ell},y_{i,\ell-1})$ to denote the actual samples of the random vector $(\theta^{\ell},\theta^{\ell-1})$ for $\ell=1,\dots,L$ as opposed to $\theta_{i}^{\ell,f},\theta_{i}^{\ell,c}$ which were used previously for their respective random variables.

Assumptions. Theorem 4.1 relies on the assumption of exponential upper bounds in (17)-(20), which in general might be difficult to verify. Instead, in this section we study asymptotic upper bounds. For this purpose we use the following notation. For sequences of real numbers $u_{\ell}$ and positive real numbers $w_{\ell}$ we write $u_{\ell}\approx w_{\ell}$ if $\lim_{\ell\to\infty}u_{\ell}/w_{\ell}=1,$ and write $u_{\ell}{\scriptstyle\;\lesssim\;}w_{\ell}$ if $\limsup_{\ell\to\infty}u_{\ell}/w_{\ell}\leq 1.$ We also replace assumptions (17)-(20) with the requirement that for every $m$ there exist $c,\alpha>0$ such that

[TABLE]

This yields the following asymptotic upper bound for the bias at level $\ell$

[TABLE]

We put $C_{r}=2^{r+1}$ with $r=3$ , the degree of the polynomial in (23), and suppose that there exists $c>0$ such that $\bigl{|}\mathbb{E}(g^{m}(Y))-\mathbb{E}(g^{m-1}(Y))\bigr{|}\approx c\cdot\delta_{m}^{4}.$ This yields the asymptotic upper bound for the smoothing error with parameter $\delta_{m}$ ,

[TABLE]

Our adaptive MLMC algorithm is based on the intuition that the asymptotic bounds (28) and (29) can be replaced by their corresponding inequalities ( $\leq$ instead of ${\scriptstyle\;\lesssim\;}$ ), and estimators for means and variances can be assumed to be nearly exact.

Variance Estimation and Selection of the Replication Numbers. To estimate the expectations and variances we employ the empirical mean and variance

[TABLE]

We get that $\hat{v}(\mathfrak{S})=\sum_{\ell=0}^{L}\frac{1}{N_{\ell}}\cdot\hat{v}_{\ell}$ serves as an empirical upper bound for the variance of the MLMC algorithm with any choice of replication numbers $\mathfrak{S}=\left(N_{0},N_{1},\ldots,N_{L}\right)$ . If, for the present choice of replication numbers, this bound is too large compared to the upper bound for $\operatorname{Var}({\mathcal{M}}^{\delta,L}_{\mathfrak{S}})$ in (26), i.e., if the variance constraint

[TABLE]

is violated, we determine new values of $N^{\prime}_{0},\dots,N^{\prime}_{L}$ by minimizing $c(N_{0},\dots,N_{L})$ subject to the constraint $\hat{v}(\mathfrak{S})\leq a_{3}^{2}\cdot\varepsilon_{*}^{2}$ , which leads to

[TABLE]

and extra samples of $\theta^{0}$ and $(\theta^{\ell},\theta^{\ell-1})$ have to be generated accordingly.
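The constrained minimization has a closed-form solution when the cost is linear in the replication numbers. The sketch below assumes $c(N_{0},\dots,N_{L})=\sum_{\ell}N_{\ell}c_{\ell}$ with hypothetical per-sample costs $c_{\ell}$ (the cost function is not restated here, so this form is an assumption), takes the variance budget as an explicit argument, and enforces the lower bound $N_{\ell}\geq 100$ mentioned earlier. A Lagrange-multiplier argument then gives $N_{\ell}\propto\sqrt{\hat{v}_{\ell}/c_{\ell}}$.

```python
import math

def optimal_replications(variances, costs, variance_budget, n_min=100):
    """Minimize sum(N_l * c_l) subject to sum(v_l / N_l) <= variance_budget."""
    # Lagrangian optimum: N_l = sqrt(v_l / c_l) * sum_k sqrt(v_k * c_k) / budget.
    root_sum = sum(math.sqrt(v * c) for v, c in zip(variances, costs))
    return [
        # Rounding up (and the floor n_min) can only decrease the variance.
        max(n_min, math.ceil(math.sqrt(v / c) * root_sum / variance_budget))
        for v, c in zip(variances, costs)
    ]
```

With empirical variances that decay and costs that grow with the level, the resulting replication numbers decrease with $\ell$, so most samples are drawn on the cheap coarse levels.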

Bias Estimation and Selection of the Maximal Level. For estimating $|\mathbb{E}(g^{m}(\theta^{\ell}))-\mathbb{E}(g^{m}(\theta^{\ell-1}))|$ we can use the values of $|\hat{b}_{\ell}|$ already available from (30) for the levels $\ell=1,\dots,L$ . We estimate $\alpha$ and $c$ in (27) by a least-squares fit, i.e., we take $\hat{\alpha}$ and $\hat{c}$ to minimize

[TABLE]

While the value of $\hat{c}$ is irrelevant, an upper bound for $\left|\mathbb{E}(g^{m}(\theta^{L}))-\mathbb{E}(g^{m}(\theta^{L-1}))\right|$ is given by $|\hat{b}_{L}|$ , or, more generally, by $2^{(\ell-L)\cdot\hat{\alpha}}\cdot|\hat{b}_{\ell}|$ with $\ell\leq L$ . This geometric upper bound can be used to set the stopping criterion for increasing the maximal level. Let us define

[TABLE]

The present value of $L$ is accepted as the maximal level, if the bias constraint

[TABLE]

is satisfied. Otherwise, $L$ is increased by one, and new samples will be generated.
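The bias-related steps can be sketched as follows, under stated assumptions. The fit is carried out in the $\log_{2}$ domain, i.e. $\log_{2}|\hat{b}_{\ell}|\approx\log_{2}c-\alpha\ell$, which is one common way to set up the least-squares problem and may differ from the exact objective used in the paper. The bias bound uses the geometric tail $\sum_{k>L}c\,2^{-\alpha k}=c\,2^{-\alpha L}/(2^{\alpha}-1)$ implied by the assumed decay, and taking the maximum over all available levels is an illustrative choice. The function names are hypothetical.

```python
import math

def fit_bias_decay(abs_b):
    """Least-squares fit of log2|b_l| = log2(c) - alpha * l for levels l = 1..L (L >= 2)."""
    n = len(abs_b)
    levels = range(1, n + 1)
    ys = [math.log2(b) for b in abs_b]
    mean_l = sum(levels) / n
    mean_y = sum(ys) / n
    sxx = sum((l - mean_l) ** 2 for l in levels)
    sxy = sum((l - mean_l) * (y - mean_y) for l, y in zip(levels, ys))
    slope = sxy / sxx
    alpha_hat = -slope
    c_hat = 2.0 ** (mean_y - slope * mean_l)
    return alpha_hat, c_hat

def bias_bound(abs_b, alpha_hat):
    """Geometric-tail bound on the bias at the maximal level L = len(abs_b)."""
    top = len(abs_b)
    # Extrapolate every level to L with the fitted decay and keep the largest value.
    last_increment = max(2.0 ** ((l - top) * alpha_hat) * b
                         for l, b in enumerate(abs_b, start=1))
    return last_increment / (2.0 ** alpha_hat - 1.0)
```

In an adaptive loop, the current $L$ would be accepted once `bias_bound` falls below the share of the error assigned to the bias, and increased by one otherwise.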

Selection of the Smoothing Parameter. We wish to determine the smallest value of $m$ , i.e., the largest value of $\delta_{m}$ , such that

[TABLE]

is satisfied, which corresponds to the upper bound for $e_{1}$ in (26) together with (29). Initially we try $m=2$ . In practice, $Y$ is approximated by $\theta^{L}$ , so the present value is accepted if

[TABLE]

The Adaptive Algorithm. We combine the above results and sum them up in Alg. 6, where the desired accuracy $\varepsilon$ is the input.

7 Simulation Results

Recall Problem 3, where the goal is to estimate the probability $\mathbb{P}(Y\leq\theta_{+}+0.1\cdot\delta_{d})$ . The random variable $Y$ is defined as $Y=\max\{\theta_{t},\,t\in[0,s]\}$ . We set the parameters of the TCL model (4)-(5) according to Table 1 and select the time horizon $s=1$ hour. We implement the MLMC Alg. 6 for target accuracies $\varepsilon=2^{-k}$ , where $k\in\{3,\ldots,8\}.$ We set the parameters $a_{1}=4$ , $a_{2}=a_{3}=2$ in (26). With this choice we put less pressure on the smoothing error because the influence of the smoothing parameter $\delta$ on the variance and thus on the overall cost is severe. Due to the smoothing step we have to sample executions for a time duration of at least $(s+\delta)$ in order to evaluate the functional $g(Y)$ . With the selected values of $s$ and $\varepsilon$ , sampling executions for $1.5$ hours is sufficient.

The results of the experiments are presented in Figure 1. The left and center plots show the impact of the smoothing coefficient on the variance and mean decays, respectively, based on $10^{6}$ runs of the algorithm. The data points of the plots with $\ell=1$ and with the indicator function are related to the SMC method. These plots indicate that the adaptive MLMC method is advantageous over the SMC method owing to the strong variance and mean decay with respect to the level $\ell$ as well as the use of the smoothing function instead of the indicator function.

The computational gain of MLMC over SMC is presented in the right plot, based on $100$ runs. The plot compares the expected cost of the SMC method with the estimated cost of the adaptive MLMC method. The cost of the SMC method is given by $\varepsilon^{-2-\frac{1}{\bar{\alpha}}}$ (see Theorem 3.1), which bounds the cost of generating executions and evaluating functionals. We estimate the parameter $\bar{\alpha}$ through a precalculation and do not take into account the cost of estimating $\bar{\alpha}$ . In this way we assume the parameter $\bar{\alpha}$ is known in advance, which makes the comparison more favorable to the SMC method. The plot indicates larger computational gains for higher target accuracies (smaller $\varepsilon$ ). Note that the curve in the right plot is not monotone because there is an additional cost of updating the smoothing coefficient, and hence of re-evaluating the functionals with the new value of $\delta$ . At the corresponding accuracies, this additional cost is compensated by the MLMC gains to a lesser degree than at the neighboring accuracies.

8 Conclusions

In this paper we studied the problem of statistical model checking of continuous-time hybrid systems that do not admit exact simulations. We employed the multilevel Monte Carlo method and presented a smoothing step with tunable precision that replaces the desired discontinuous functional with a continuous approximation, thus decreasing the overall computational effort of the approach. An adaptive algorithm was designed which balances the errors due to the bias, variance, and smoothing. The approach was demonstrated on the model of thermostatically controlled loads.

Bibliography

[1] A. Abate, L. Bortolussi, M. Kwiatkowska, L. Cardelli, M. Ceska, and L. Laurenti. Reachability computation for switching diffusions: finite abstractions with certifiable and tuneable precision. In Hybrid Systems: Computation and Control, HSCC ’17, New York, NY, USA, 2017. ACM.
[2] A. Abate, M. Prandini, J. Lygeros, and S. Sastry. Probabilistic reachability and safety for controlled discrete time stochastic hybrid systems. Automatica, 44(11):2724–2734, Nov 2008.
[3] C. Baier, B. Haverkort, H. Hermanns, and J.-P. Katoen. Model-checking algorithms for continuous-time Markov chains. Trans. on Software Engineering, 29(6):524–541, June 2003.
[4] C. Baier and J.-P. Katoen. Principles of Model Checking. MIT Press, 2008.
[5] M.L. Bujorianu and J. Lygeros. Reachability questions in piecewise deterministic Markov processes. In O. Maler and A. Pnueli, editors, Hybrid Systems: Computation and Control, volume 2623 of Lecture Notes in Computer Science, pages 126–140. Springer Verlag, 2003.
[6] M.L. Bujorianu and J. Lygeros. General stochastic hybrid systems: Modelling and optimal control. In Proc. 43rd IEEE Conf. Decision Control, pages 1872–1877, 2004.
[7] P. Bulychev, A. David, K.G. Larsen, A. Legay, M. Mikučionis, and D.B. Poulsen. Checking and Distributing Statistical Model Checking, pages 449–463. Springer, Berlin, Heidelberg, 2012.
[8] C.G. Cassandras and J. Lygeros (Eds.). Stochastic Hybrid Systems. Number 24 in Control Engineering. CRC Press, Boca Raton, 2006.