An analytical safe approximation to joint chance-constrained programming with additive Gaussian noises

Nan Li; Ilya Kolmanovsky; Anouck Girard

arXiv:1903.00643·math.OC·March 5, 2019·IEEE Trans. Autom. Control.

TL;DR

This paper introduces an analytical safe approximation for joint chance-constrained programming with additive Gaussian noise. The approximation requires neither numerical integration nor sampling, and in the paper's two examples it is less conservative than the approximation based on Boole's inequality.

Contribution

The paper presents a new analytical safe approximation for joint chance constraints with additive Gaussian noise that is shown to be less conservative than the approximation based on Boole's inequality.

Findings

01

The new approximation is less conservative than Boole's inequality-based methods.

02

It is formulated as a standard nonlinear program under mild assumptions.

03

The approach is validated through control of linear Gaussian-Markov models.

Abstract

We propose a safe approximation to joint chance-constrained programming where the constraint functions are additively dependent on a normally-distributed random vector. The approximation is analytical, meaning that it requires neither numerical integrations nor sampling-based probability approximations. Under mild assumptions, the approximation is a standard nonlinear program. We compare this new safe approximation to another analytical safe approximation for joint chance-constrained programming based on Boole's inequality through two examples representing the constrained control of linear Gaussian-Markov models. It is shown that our proposed safe approximation has a lower degree of conservatism compared to the one based on Boole's inequality.

Tables

Table I: Comparison results of Example 1.

$\beta$	Metric	Sol. of (5)	Sol. of (40)
0.6	$J$	597.7	729.7
0.6	$\overline{\beta}$	0.7737	0.9577
0.8	$J$	695.9	788.4
0.8	$\overline{\beta}$	0.9107	0.9782

Equations

\min_{x\in X} J(x)

\text{s.t.}\quad\mathbb{P}\big(F(x,\xi)\leq 0\big)\geq\beta,

\mathbb{P}\Big(\bigcap_{i=1}^{n_{1}}\big(\sum_{j=1}^{n_{2}}\alpha_{ij}\leq\gamma_{i}\big)\Big)\geq\beta

\mathbb{P}\Big(\bigcap_{i=1}^{n_{1}}\big(\bigcap_{j=1}^{n_{2}}(\alpha_{ij}\leq\eta_{ij})\big)\Big)\geq\beta,

\sum_{j=1}^{n_{2}}\eta_{ij}\leq\gamma_{i},\quad i=1,\dots,n_{1}.

\min_{x} J(x)

\text{s.t.}\quad\mathbb{P}\big(M\phi(x)\leq m\big)\geq\beta,

\min_{x,\,\{\beta_{j}^{1},\beta_{j}^{2}\}_{j=1}^{n_{\phi}}} J(x),

\prod_{j=1}^{n_{\phi}}\big(\beta_{j}^{1}+\beta_{j}^{2}-1\big)\geq\beta,

\beta_{j}^{1}+\beta_{j}^{2}\geq 1,\quad j=1,\dots,n_{\phi},

0\leq\beta_{j}^{\sigma}\leq 1,\quad j=1,\dots,n_{\phi},\ \sigma=1,2,

\sum_{j=1}^{n_{\phi}}\Big(\sqrt{2\lambda_{j}}\,\big|\overline{M}_{ij}\big|\,\text{erf}^{-1}\big(2\beta_{j}^{\sigma_{ij}}-1\big)\Big)+\overline{M}_{i}\,\overline{\mu}(x)\leq m_{i},\quad i=1,\dots,n_{m},

\overline{\mu}(x)=\theta^{\top}\mu(x),\quad\overline{\Sigma}=\theta^{\top}\Sigma\theta=\text{diag}\big(\lambda_{1},\cdots,\lambda_{n_{\phi}}\big).

\mathbb{P}\big(\overline{M}\psi(x)\leq m\big)=\mathbb{P}\Big(\bigcap_{i=1}^{n_{m}}\big(\sum_{j=1}^{n_{\phi}}\overline{M}_{ij}\psi_{j}(x)\leq m_{i}\big)\Big)\geq\beta,

\mathbb{P}\Big(\bigcap_{i=1}^{n_{m}}\big(\bigcap_{j=1}^{n_{\phi}}(\overline{M}_{ij}\psi_{j}(x)\leq z_{ij})\big)\Big)=\mathbb{P}\Big(\bigcap_{j=1}^{n_{\phi}}\big(\bigcap_{i=1}^{n_{m}}(\overline{M}_{ij}\psi_{j}(x)\leq z_{ij})\big)\Big)\geq\beta,

\sum_{j=1}^{n_{\phi}}z_{ij}\leq m_{i},\quad i=1,\dots,n_{m},

\prod_{j=1}^{n_{\phi}}\mathbb{P}\Big(\bigcap_{i=1}^{n_{m}}\big(\overline{M}_{ij}\psi_{j}(x)\leq z_{ij}\big)\Big)\geq\beta,

\mathbb{P}\Big(\bigcap_{i=1}^{n_{m}}\big(\overline{M}_{ij}\psi_{j}(x)\leq z_{ij}\big)\Big)\geq\beta_{j},\quad j=1,\dots,n_{\phi},

\prod_{j=1}^{n_{\phi}}\beta_{j}\geq\beta,

i\in\begin{cases}I_{j}^{1}&\text{if }\overline{M}_{ij}>0,\\ I_{j}^{2}&\text{if }\overline{M}_{ij}<0,\\ I_{j}^{3}&\text{if }\overline{M}_{ij}=0.\end{cases}

\mathbb{P}\Big(\max_{i\in I_{j}^{2}}\frac{z_{ij}}{\overline{M}_{ij}}\leq\psi_{j}(x)\leq\min_{i\in I_{j}^{1}}\frac{z_{ij}}{\overline{M}_{ij}}\Big)=\mathbb{P}\Big(\max_{i\in I_{j}^{2}}\frac{z_{ij}-\overline{M}_{ij}\overline{\mu}_{j}(x)}{\overline{M}_{ij}\sqrt{\lambda_{j}}}\leq\frac{\psi_{j}(x)-\overline{\mu}_{j}(x)}{\sqrt{\lambda_{j}}}\leq\min_{i\in I_{j}^{1}}\frac{z_{ij}-\overline{M}_{ij}\overline{\mu}_{j}(x)}{\overline{M}_{ij}\sqrt{\lambda_{j}}}\Big)\geq\beta_{j},

z_{ij}\geq 0,\quad i\in I_{j}^{3},

F\Big(\min_{i\in I_{j}^{1}}\frac{z_{ij}-\overline{M}_{ij}\overline{\mu}_{j}(x)}{\overline{M}_{ij}\sqrt{\lambda_{j}}}\Big)-F\Big(\max_{i\in I_{j}^{2}}\frac{z_{ij}-\overline{M}_{ij}\overline{\mu}_{j}(x)}{\overline{M}_{ij}\sqrt{\lambda_{j}}}\Big)=F\Big(\min_{i\in I_{j}^{1}}\frac{z_{ij}-\overline{M}_{ij}\overline{\mu}_{j}(x)}{\overline{M}_{ij}\sqrt{\lambda_{j}}}\Big)+F\Big(\min_{i\in I_{j}^{2}}-\frac{z_{ij}-\overline{M}_{ij}\overline{\mu}_{j}(x)}{\overline{M}_{ij}\sqrt{\lambda_{j}}}\Big)-1\geq\beta_{j},

F\Big(\min_{i\in I_{j}^{1}}\frac{z_{ij}-\overline{M}_{ij}\overline{\mu}_{j}(x)}{\overline{M}_{ij}\sqrt{\lambda_{j}}}\Big)\geq\beta_{j}^{1},

F\Big(\min_{i\in I_{j}^{2}}-\frac{z_{ij}-\overline{M}_{ij}\overline{\mu}_{j}(x)}{\overline{M}_{ij}\sqrt{\lambda_{j}}}\Big)\geq\beta_{j}^{2},

\beta_{j}^{1}+\beta_{j}^{2}-1\geq\beta_{j}.

\min_{i\in I_{j}^{1}}\frac{z_{ij}-\overline{M}_{ij}\overline{\mu}_{j}(x)}{\overline{M}_{ij}\sqrt{\lambda_{j}}}\geq F^{-1}\big(\beta_{j}^{1}\big)=\sqrt{2}\,\text{erf}^{-1}\big(2\beta_{j}^{1}-1\big),

\frac{z_{ij}-\overline{M}_{ij}\overline{\mu}_{j}(x)}{\overline{M}_{ij}\sqrt{\lambda_{j}}}\geq\sqrt{2}\,\text{erf}^{-1}\big(2\beta_{j}^{1}-1\big),

z_{ij}\geq\sqrt{2\lambda_{j}}\,\overline{M}_{ij}\,\text{erf}^{-1}\big(2\beta_{j}^{1}-1\big)+\overline{M}_{ij}\overline{\mu}_{j}(x),

z_{ij}\geq-\sqrt{2\lambda_{j}}\,\overline{M}_{ij}\,\text{erf}^{-1}\big(2\beta_{j}^{2}-1\big)+\overline{M}_{ij}\overline{\mu}_{j}(x),

z_{ij}\geq\sqrt{2\lambda_{j}}\,\big|\overline{M}_{ij}\big|\,\text{erf}^{-1}\big(2\beta_{j}^{\sigma_{ij}}-1\big)+\overline{M}_{ij}\overline{\mu}_{j}(x),

\prod_{j=1}^{n_{\phi}}\beta_{j}\geq\beta,

\beta_{j}^{1}+\beta_{j}^{2}-1\geq\beta_{j},\quad j=1,\dots,n_{\phi},

\sum_{j=1}^{n_{\phi}}z_{ij}\leq m_{i},\quad i=1,\dots,n_{m},

Full text

Nan Li, Ilya Kolmanovsky, and Anouck Girard

The authors are with the Department of Aerospace Engineering, University of Michigan, 1320 Beal Avenue, Ann Arbor, MI 48109-2140, USA ({nanli,ilya,anouck}@umich.edu). This research has been supported by the National Science Foundation award CNS 1544844.

I Introduction

Let us consider an optimization problem of the form,

$$\min_{x\in X}\; J(x)\quad\text{s.t.}\quad\mathbb{P}\big(F(x,\xi)\leq 0\big)\geq\beta, \tag{1}$$

where $x\in X\subset\mathbb{R}^{n_{x}}$ is the vector of optimization variables, $J:X\to\mathbb{R}$ is the cost function, $\xi$ is a random vector with probability distribution $\mathcal{P}$ supported on $\Xi\subset\mathbb{R}^{n_{\xi}}$ , $F:X\times\Xi\to\mathbb{R}^{n_{m}}$ defines a set of constraints, $\mathbb{P}(A)$ denotes the probability of an event $A$ , and $\beta\in[0,1]$ defines a required confidence level of constraint satisfaction. Problems in this form, introduced in [1, 2, 3, 4], are typically called chance-constrained programming problems. Chance-constrained programming has a wide range of applications, e.g., in finance [5, 6], operational science [1, 7], and control [8, 9].

In general, problem (1) with $n_{m}\geq 2$ , called the joint chance-constrained programming (JCCP) problem, is difficult to solve. The major difficulty lies in that evaluations of $\mathbb{P}\big{(}F(x,\xi)\leq 0\big{)}$ involve numerical integrations of multivariate distributions, which are, in general, computationally intractable.

Tractable approaches to JCCP problems can be classified into two groups [10]: sampling-based approximations (also called scenario-based, simulation-based, or Monte Carlo-based approximations) [11, 12, 13, 14, 15] and analytical safe approximations [16, 17, 18, 19, 20, 21]. The former approximate a probability using a finite number of samples drawn from the distribution $\mathcal{P}$. They can be applied to an arbitrary probability distribution $\mathcal{P}$ and constraint function $F$, but have two drawbacks: 1) they can at most provide a probabilistic guarantee of chance-constraint satisfaction, as the approximation itself is random; and 2) the number of samples required for a given level of probabilistic guarantee blows up as $\beta$ approaches one [8, 13].

The latter approaches, analytical safe approximations, typically exploit probability inequalities [22] to derive a deterministically constrained problem whose feasible set is contained in the feasible set of the JCCP problem (1), so that an optimal solution to the new problem is a feasible, generally suboptimal, solution to (1). Popular choices include Boole's inequality and Chebyshev-Markov type inequalities [18, 19, 20, 21]. Because inequalities are used, conservatism is introduced, and the degree of conservatism largely determines the quality of an analytical safe approximation.

One of the most extensively investigated cases of JCCP problems is with $\mathcal{P}=\mathcal{N}$ , i.e., the random vector $\xi$ follows a normal distribution, and with $F$ being additively dependent on $\xi$ . Such a JCCP formulation has a broad range of applications, for instance, in the constrained control of linear Gaussian-Markov models [23, 24, 25, 19]. In this paper, we also focus on JCCP problems for such a case. We propose a new analytical safe approximation to such JCCP problems, which can be efficiently solved using standard nonlinear programming solvers without involving any numerical integrations or sampling-based probability approximations.

The notation used in this paper is standard. In particular, for a vector $\mu\in\mathbb{R}^{n}$, $\mu_{j}$ represents its $j$th element; for a matrix $M\in\mathbb{R}^{m\times n}$, $M_{i}\in\mathbb{R}^{1\times n}$ represents its $i$th row and $M_{ij}$ represents the entry in its $i$th row and $j$th column. We use $I_{n}$ to represent the $n\times n$ identity matrix. Also, "iff" stands for "if and only if" and "i.i.d." stands for "independent and identically distributed."

II Preliminaries

In this section, we list all of the preliminary lemmas that are used to prove the main result of this paper. Although some of the lemmas listed here may be well-known, we include them for the sake of completeness and point the reader to references for their proofs.

Lemma 1 (Spectral Theorem): Let $\Sigma\in\mathbb{R}^{n\times n}$ . Then, $\Sigma^{\top}=\Sigma$ (symmetric) iff there exists $\theta\in\mathbb{R}^{n\times n}$ such that $\theta^{\top}\theta=I_{n}$ (orthogonal) and $\theta^{\top}\Sigma\theta=\text{diag}\big{(}\lambda_{1},\cdots,\lambda_{n}\big{)}$ (diagonalizing $\Sigma$ ), where $\lambda_{i}\in\mathbb{R}$ for all $i=1,\cdots,n$ .

Proof: See Theorem 7.13 of [26]. $\blacksquare$

Lemma 2: Let $\phi\sim\mathcal{N}(\mu,\Sigma)$ and set $\psi=\theta^{\top}\phi$ , where the orthogonal matrix $\theta$ is such that $\theta^{\top}\Sigma\theta=\text{diag}\big{(}\lambda_{1},\cdots,\lambda_{n}\big{)}$ . Then, (i) $\psi\sim\mathcal{N}(\theta^{\top}\mu,\theta^{\top}\Sigma\theta)$ , and (ii) the components of $\psi$ are independent.

Proof: See Theorems 7.1 and 8.1 of [27]. $\blacksquare$
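As an illustration of Lemmas 1 and 2, the following minimal sketch diagonalizes a symmetric $2\times 2$ covariance matrix with a single rotation, using only the Python standard library. The function names `diagonalize_2x2` and `congruence` and the sample matrix are hypothetical choices made for this example; for larger matrices a general symmetric eigenvalue routine would be used instead.

```python
import math


def diagonalize_2x2(a, b, d):
    """Return (theta, lambdas) for the symmetric matrix Sigma = [[a, b], [b, d]].

    theta is a rotation matrix (hence orthogonal) chosen so that
    theta^T * Sigma * theta = diag(lambdas).
    """
    # Rotation angle that zeroes the off-diagonal entry of theta^T Sigma theta.
    angle = 0.5 * math.atan2(2.0 * b, a - d)
    c, s = math.cos(angle), math.sin(angle)
    theta = [[c, -s], [s, c]]
    # Diagonal entries of theta^T Sigma theta (the eigenvalues of Sigma).
    lam1 = c * c * a + 2.0 * c * s * b + s * s * d
    lam2 = s * s * a - 2.0 * c * s * b + c * c * d
    return theta, (lam1, lam2)


def congruence(theta, sigma):
    """Compute theta^T * sigma * theta for 2x2 matrices given as nested lists."""
    n = 2
    tmp = [[sum(sigma[i][k] * theta[k][j] for k in range(n)) for j in range(n)]
           for i in range(n)]
    return [[sum(theta[k][i] * tmp[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Example covariance (an assumption for illustration only).
sigma = [[2.0, 1.0], [1.0, 3.0]]
theta, lambdas = diagonalize_2x2(2.0, 1.0, 3.0)
diag = congruence(theta, sigma)  # off-diagonal entries vanish up to rounding
```

With $\psi=\theta^{\top}\phi$, the entries of `lambdas` play the role of the variances $\lambda_{j}$ of the independent components of $\psi$.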

Lemma 3: Let $\big{(}\Omega,\mathcal{F},\mathbb{P}\big{)}$ be a probability space. Let $\alpha_{ij}:\Omega\to\mathbb{R}$ , $i=1,\cdots,n_{1}$ , $j=1,\cdots,n_{2}$ , be random variables, and $\gamma_{i}\in\mathbb{R}$ , $i=1,\cdots,n_{1}$ , be constants. Then,

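$$\mathbb{P}\Big(\bigcap_{i=1}^{n_{1}}\big(\sum_{j=1}^{n_{2}}\alpha_{ij}\leq\gamma_{i}\big)\Big)\geq\beta \tag{2}$$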

if there exist constants $\eta_{ij}\in\mathbb{R}$ , $i=1,\cdots,n_{1}$ , $j=1,\cdots,n_{2}$ , such that

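$$\begin{aligned}&\mathbb{P}\Big(\bigcap_{i=1}^{n_{1}}\big(\bigcap_{j=1}^{n_{2}}(\alpha_{ij}\leq\eta_{ij})\big)\Big)\geq\beta,\\ &\sum_{j=1}^{n_{2}}\eta_{ij}\leq\gamma_{i},\quad i=1,\dots,n_{1}.\end{aligned} \tag{3}$$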

Proof: For any $\omega\in\Omega$ such that $\alpha_{ij}(\omega)\leq\eta_{ij}$ for all $i=1,\cdots,n_{1}$ and $j=1,\cdots,n_{2}$ , it holds that $\sum_{j=1}^{n_{2}}\alpha_{ij}(\omega)\leq\sum_{j=1}^{n_{2}}\eta_{ij}\leq\gamma_{i}$ for all $i=1,\cdots,n_{1}$ . Thus, $\bigcap_{i=1}^{n_{1}}\big{(}\bigcap_{j=1}^{n_{2}}(\alpha_{ij}\leq\eta_{ij})\big{)}\subset\bigcap_{i=1}^{n_{1}}\big{(}\sum_{j=1}^{n_{2}}\alpha_{ij}\leq\gamma_{i}\big{)}$ . Therefore, $\mathbb{P}\big{(}\bigcap_{i=1}^{n_{1}}(\sum_{j=1}^{n_{2}}\alpha_{ij}\leq\gamma_{i})\big{)}\geq\mathbb{P}\big{(}\bigcap_{i=1}^{n_{1}}(\bigcap_{j=1}^{n_{2}}(\alpha_{ij}\leq\eta_{ij}))\big{)}\geq\beta$ . $\blacksquare$

III Main result

We consider the following JCCP problem,

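\begin{align}
\min_{x}\quad & J(x), \tag{4a}\\
\text{s.t.}\quad & \mathbb{P}\big(M\phi(x)\leq m\big)\geq\beta, \tag{4b}
\end{align}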

where $x\in\mathbb{R}^{n_{x}}$ is the vector of optimization variables, $J:\mathbb{R}^{n_{x}}\to\mathbb{R}$ is a continuously differentiable function of $x$, $\phi(x)$ is a random vector taking values in $\mathbb{R}^{n_{\phi}}$ whose distribution depends on $x$, the pair $(M,m)$, with $M\in\mathbb{R}^{n_{m}\times n_{\phi}}$ and $m\in\mathbb{R}^{n_{m}}$, defines the constraint set, and $\beta\in[0,1)$ is the required confidence level of constraint satisfaction. In particular, $\phi(x)\sim\mathcal{N}\big(\mu(x),\Sigma\big)$, i.e., $\phi(x)$ follows a multivariate normal distribution with mean $\mu(x)\in\mathbb{R}^{n_{\phi}}$ (a continuously differentiable function of $x$) and covariance $\Sigma\in\mathbb{R}^{n_{\phi}\times n_{\phi}}$ (with $\Sigma^{\top}=\Sigma\succeq 0$, independent of $x$).

Note that (4b) is equivalent to $\mathbb{P}\big{(}M\mu(x)+M(\phi(x)-\mu(x))\leq m\big{)}\geq\beta$ , where $\phi(x)-\mu(x)\sim\mathcal{N}\big{(}0,\Sigma\big{)}$ , i.e., a zero-mean additive Gaussian noise. Note also that, without loss of generality, $n_{m}\geq n_{\phi}$ , since otherwise we can redefine $\phi(x)\leftarrow M\phi(x)$ and $M\leftarrow I_{n_{m}}$ so that $n_{m}=n_{\phi}$ . We consider the form (4b) because the number of constraints, $n_{m}$ , can be much larger than the dimension of $\phi(x)$ , $n_{\phi}$ , in many problems, and the analytical safe approximation introduced in what follows involves a set of slack variables, whose number depends only on $n_{\phi}$ .

Theorem 1: Any feasible solution to the following deterministically constrained problem,

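\begin{align}
\min_{x,\,\{\beta_{j}^{1},\beta_{j}^{2}\}_{j=1}^{n_{\phi}}}\quad & J(x), \tag{5a}\\
\text{s.t.}\quad & \prod_{j=1}^{n_{\phi}}\big(\beta_{j}^{1}+\beta_{j}^{2}-1\big)\geq\beta, \tag{5b}\\
& \beta_{j}^{1}+\beta_{j}^{2}\geq 1,\quad j=1,\dots,n_{\phi}, \tag{5c}\\
& 0\leq\beta_{j}^{\sigma}\leq 1,\quad j=1,\dots,n_{\phi},\ \sigma=1,2, \tag{5d}\\
& \sum_{j=1}^{n_{\phi}}\Big(\sqrt{2\lambda_{j}}\,\big|\overline{M}_{ij}\big|\,\text{erf}^{-1}\big(2\beta_{j}^{\sigma_{ij}}-1\big)\Big)+\overline{M}_{i}\,\overline{\mu}(x)\leq m_{i},\quad i=1,\dots,n_{m}, \tag{5e}
\end{align}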

is a feasible solution to the JCCP problem (4), where $\big{\{}\beta_{j}^{1},\beta_{j}^{2}\big{\}}_{j=1}^{n_{\phi}}$ are slack variables; $\overline{M}=M\theta$ , $\overline{\mu}(x)=\theta^{\top}\mu(x)$ , and $\lambda_{j}=\big{(}\theta^{\top}\Sigma\theta\big{)}_{jj}$ , in which $\theta\in\mathbb{R}^{n_{\phi}\times n_{\phi}}$ is such that $\theta^{\top}\theta=I_{n_{\phi}}$ and $\theta^{\top}\Sigma\theta=\text{diag}\big{(}\lambda_{1},\cdots,\lambda_{n_{\phi}}\big{)}$ ; and $\sigma_{ij}=1$ if $\overline{M}_{ij}\geq 0$ and $\sigma_{ij}=2$ if $\overline{M}_{ij}<0$ .

Proof: Set $\psi(x)=\theta^{\top}\phi(x)$ . By Lemma 2(i), $\psi(x)\sim\mathcal{N}\big{(}\overline{\mu}(x),\overline{\Sigma}\big{)}$ , where

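$$\overline{\mu}(x)=\theta^{\top}\mu(x),\quad\overline{\Sigma}=\theta^{\top}\Sigma\theta=\text{diag}\big(\lambda_{1},\cdots,\lambda_{n_{\phi}}\big). \tag{6}$$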

By Lemma 2(ii), the components of $\psi(x)$ , denoted by $\big{\{}\psi_{1}(x),\cdots,\psi_{n_{\phi}}(x)\big{\}}$ , are independent.

Using $\psi(x)$ , the chance constraint (4b) can be written as

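$$\mathbb{P}\big(\overline{M}\psi(x)\leq m\big)=\mathbb{P}\Big(\bigcap_{i=1}^{n_{m}}\big(\sum_{j=1}^{n_{\phi}}\overline{M}_{ij}\psi_{j}(x)\leq m_{i}\big)\Big)\geq\beta, \tag{7}$$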

where $\overline{M}=M\theta$ .

By Lemma 3, if there exists a matrix $Z\in\mathbb{R}^{n_{m}\times n_{\phi}}$ such that

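\begin{align}
&\mathbb{P}\Big(\bigcap_{i=1}^{n_{m}}\big(\bigcap_{j=1}^{n_{\phi}}(\overline{M}_{ij}\psi_{j}(x)\leq z_{ij})\big)\Big)=\mathbb{P}\Big(\bigcap_{j=1}^{n_{\phi}}\big(\bigcap_{i=1}^{n_{m}}(\overline{M}_{ij}\psi_{j}(x)\leq z_{ij})\big)\Big)\geq\beta, \tag{8}\\
&\sum_{j=1}^{n_{\phi}}z_{ij}\leq m_{i},\quad i=1,\dots,n_{m}, \tag{9}
\end{align}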

then (7), and hence (4b), are satisfied.

Because $\big{\{}\psi_{1}(x),\cdots,\psi_{n_{\phi}}(x)\big{\}}$ are independent, the events $\big{\{}\bigcap_{i=1}^{n_{m}}(\overline{M}_{ij}\psi_{j}(x)\leq z_{ij})\big{\}}_{j=1}^{n_{\phi}}$ are independent. Thus, (8) can be written as

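$$\prod_{j=1}^{n_{\phi}}\mathbb{P}\Big(\bigcap_{i=1}^{n_{m}}\big(\overline{M}_{ij}\psi_{j}(x)\leq z_{ij}\big)\Big)\geq\beta, \tag{10}$$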

which holds iff there exists a set of probability values $\big{\{}\beta_{1},\cdots,\beta_{n_{\phi}}\big{\}}$ such that

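\begin{align}
&\mathbb{P}\Big(\bigcap_{i=1}^{n_{m}}\big(\overline{M}_{ij}\psi_{j}(x)\leq z_{ij}\big)\Big)\geq\beta_{j},\quad j=1,\dots,n_{\phi}, \tag{11}\\
&\prod_{j=1}^{n_{\phi}}\beta_{j}\geq\beta. \tag{12}
\end{align}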

For each $j$ , we categorize $\big{\{}\overline{M}_{ij}\big{\}}_{i=1}^{n_{m}}$ into three groups:

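$$i\in\begin{cases}I_{j}^{1}&\text{if }\overline{M}_{ij}>0,\\ I_{j}^{2}&\text{if }\overline{M}_{ij}<0,\\ I_{j}^{3}&\text{if }\overline{M}_{ij}=0.\end{cases} \tag{13}$$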

Here we assume $\lambda_{j}>0$ to simplify the exposition. In general, $\lambda_{j}\geq 0$; the case $\lambda_{j}=0$ can be treated separately and is straightforward. It will become clear that the derivations from (22) on, and hence the final expression (32), hold for $\lambda_{j}\geq 0$. Then, (11) can be written as

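\begin{align}
&\mathbb{P}\Big(\max_{i\in I_{j}^{2}}\frac{z_{ij}}{\overline{M}_{ij}}\leq\psi_{j}(x)\leq\min_{i\in I_{j}^{1}}\frac{z_{ij}}{\overline{M}_{ij}}\Big)= \nonumber\\
&\mathbb{P}\Big(\max_{i\in I_{j}^{2}}\frac{z_{ij}-\overline{M}_{ij}\overline{\mu}_{j}(x)}{\overline{M}_{ij}\sqrt{\lambda_{j}}}\leq\frac{\psi_{j}(x)-\overline{\mu}_{j}(x)}{\sqrt{\lambda_{j}}}\leq\min_{i\in I_{j}^{1}}\frac{z_{ij}-\overline{M}_{ij}\overline{\mu}_{j}(x)}{\overline{M}_{ij}\sqrt{\lambda_{j}}}\Big)\geq\beta_{j}, \tag{14}\\
&z_{ij}\geq 0,\quad i\in I_{j}^{3}, \tag{15}
\end{align}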

where $\frac{\psi_{j}(x)-\overline{\mu}_{j}(x)}{\sqrt{\lambda_{j}}}\sim\mathcal{N}(0,1)$ .

Then, (14) can be expressed using the cumulative distribution function of the standard normal distribution, $F(\zeta)=\mathbb{P}\big{(}z\leq\zeta\big{)}$ , $z\sim\mathcal{N}(0,1)$ , as

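\begin{align}
&F\Big(\min_{i\in I_{j}^{1}}\frac{z_{ij}-\overline{M}_{ij}\overline{\mu}_{j}(x)}{\overline{M}_{ij}\sqrt{\lambda_{j}}}\Big)-F\Big(\max_{i\in I_{j}^{2}}\frac{z_{ij}-\overline{M}_{ij}\overline{\mu}_{j}(x)}{\overline{M}_{ij}\sqrt{\lambda_{j}}}\Big) \nonumber\\
&=F\Big(\min_{i\in I_{j}^{1}}\frac{z_{ij}-\overline{M}_{ij}\overline{\mu}_{j}(x)}{\overline{M}_{ij}\sqrt{\lambda_{j}}}\Big)+F\Big(\min_{i\in I_{j}^{2}}-\frac{z_{ij}-\overline{M}_{ij}\overline{\mu}_{j}(x)}{\overline{M}_{ij}\sqrt{\lambda_{j}}}\Big)-1\geq\beta_{j}, \tag{16}
\end{align}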

where we have used the property $F(\zeta)=1-F(-\zeta)$ .

The constraint (16) holds iff there exist probability values $\beta_{j}^{1}$ and $\beta_{j}^{2}$ such that

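\begin{align}
&F\Big(\min_{i\in I_{j}^{1}}\frac{z_{ij}-\overline{M}_{ij}\overline{\mu}_{j}(x)}{\overline{M}_{ij}\sqrt{\lambda_{j}}}\Big)\geq\beta_{j}^{1}, \tag{17}\\
&F\Big(\min_{i\in I_{j}^{2}}-\frac{z_{ij}-\overline{M}_{ij}\overline{\mu}_{j}(x)}{\overline{M}_{ij}\sqrt{\lambda_{j}}}\Big)\geq\beta_{j}^{2}, \tag{18}\\
&\beta_{j}^{1}+\beta_{j}^{2}-1\geq\beta_{j}. \tag{19}
\end{align}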

Using the inverse error function $\text{erf}^{-1}(\cdot)$ , (17) is almost surely equivalent to

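$$\min_{i\in I_{j}^{1}}\frac{z_{ij}-\overline{M}_{ij}\overline{\mu}_{j}(x)}{\overline{M}_{ij}\sqrt{\lambda_{j}}}\geq F^{-1}\big(\beta_{j}^{1}\big)=\sqrt{2}\,\text{erf}^{-1}\big(2\beta_{j}^{1}-1\big), \tag{20}$$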

which is equivalent to the set of constraints

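\begin{align}
&\frac{z_{ij}-\overline{M}_{ij}\overline{\mu}_{j}(x)}{\overline{M}_{ij}\sqrt{\lambda_{j}}}\geq\sqrt{2}\,\text{erf}^{-1}\big(2\beta_{j}^{1}-1\big), \tag{21}\\
&z_{ij}\geq\sqrt{2\lambda_{j}}\,\overline{M}_{ij}\,\text{erf}^{-1}\big(2\beta_{j}^{1}-1\big)+\overline{M}_{ij}\overline{\mu}_{j}(x), \tag{22}
\end{align}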

for all $i\in I_{j}^{1}$ , where in restating (21) as (22) we have used the fact that $\overline{M}_{ij}>0$ for all $i\in I_{j}^{1}$ .
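The quantile identity $F^{-1}(\beta)=\sqrt{2}\,\text{erf}^{-1}(2\beta-1)$ used in this step can be checked numerically. The sketch below is an illustration only and uses the Python standard library, where `statistics.NormalDist().inv_cdf` returns the standard normal quantile; the helper names and the sample numbers are hypothetical. Since $\sqrt{2\lambda_{j}}\,\text{erf}^{-1}(2\beta-1)=\sqrt{\lambda_{j}}\,F^{-1}(\beta)$, the bound for an entry with $\overline{M}_{ij}>0$ can be evaluated without an explicit inverse error function.

```python
import math
from statistics import NormalDist


def std_normal_cdf(zeta):
    """F(zeta) for the standard normal, written with the error function."""
    return 0.5 * (1.0 + math.erf(zeta / math.sqrt(2.0)))


def std_normal_quantile(beta):
    """F^{-1}(beta), equal to sqrt(2) * erfinv(2*beta - 1); needs 0 < beta < 1."""
    return NormalDist().inv_cdf(beta)


def z_lower_bound(m_bar_ij, mu_bar_j, lam_j, beta_j1):
    """Lower bound on z_ij for an entry with m_bar_ij > 0.

    Uses sqrt(2*lam) * erfinv(2*beta - 1) == sqrt(lam) * F^{-1}(beta).
    """
    return (math.sqrt(lam_j) * m_bar_ij * std_normal_quantile(beta_j1)
            + m_bar_ij * mu_bar_j)


# Round trip: applying F to F^{-1}(beta) recovers beta up to rounding.
beta = 0.9
round_trip = std_normal_cdf(std_normal_quantile(beta))
```

Note that `inv_cdf` raises an error at $\beta=0$ and $\beta=1$, where the quantile is infinite, so a solver would need to keep the slack variables strictly inside $(0,1)$ when this routine is used.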

Similarly, (18) is equivalent to the set of constraints

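$$z_{ij}\geq-\sqrt{2\lambda_{j}}\,\overline{M}_{ij}\,\text{erf}^{-1}\big(2\beta_{j}^{2}-1\big)+\overline{M}_{ij}\overline{\mu}_{j}(x), \tag{23}$$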

for all $i\in I_{j}^{2}$ . Note that $\overline{M}_{ij}<0$ for all $i\in I_{j}^{2}$ .

Combining the cases of $i\in I_{j}^{1}$ , $i\in I_{j}^{2}$ , and $i\in I_{j}^{3}$ , we obtain

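$$z_{ij}\geq\sqrt{2\lambda_{j}}\,\big|\overline{M}_{ij}\big|\,\text{erf}^{-1}\big(2\beta_{j}^{\sigma_{ij}}-1\big)+\overline{M}_{ij}\overline{\mu}_{j}(x), \tag{24}$$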

for all $i=1,\cdots,n_{m}$ , where $\sigma_{ij}=1$ if $\overline{M}_{ij}\geq 0$ and $\sigma_{ij}=2$ if $\overline{M}_{ij}<0$ .

Based on (6) to (24), we have shown that the joint chance constraint (4b) is satisfied if there exist a matrix $Z\in\mathbb{R}^{n_{m}\times n_{\phi}}$ and a set of probability values $\big{\{}\beta_{j},\beta_{j}^{1},\beta_{j}^{2}\big{\}}_{j=1}^{n_{\phi}}$ such that

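\begin{align}
&\prod_{j=1}^{n_{\phi}}\beta_{j}\geq\beta, \tag{25}\\
&\beta_{j}^{1}+\beta_{j}^{2}-1\geq\beta_{j},\quad j=1,\dots,n_{\phi}, \tag{26}\\
&\sum_{j=1}^{n_{\phi}}z_{ij}\leq m_{i},\quad i=1,\dots,n_{m}, \tag{27}\\
&z_{ij}\geq\sqrt{2\lambda_{j}}\,\big|\overline{M}_{ij}\big|\,\text{erf}^{-1}\big(2\beta_{j}^{\sigma_{ij}}-1\big)+\overline{M}_{ij}\overline{\mu}_{j}(x),\quad i=1,\dots,n_{m},\ j=1,\dots,n_{\phi}, \tag{28}
\end{align}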

where $\sigma_{ij}=1$ if $\overline{M}_{ij}\geq 0$ and $\sigma_{ij}=2$ if $\overline{M}_{ij}<0$ .

Furthermore, the existence of $\big{\{}\beta_{j},\beta_{j}^{1},\beta_{j}^{2}\big{\}}_{j=1}^{n_{\phi}}$ satisfying (25) and (26) is equivalent to the existence of $\big{\{}\beta_{j}^{1},\beta_{j}^{2}\big{\}}_{j=1}^{n_{\phi}}$ satisfying

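\begin{align}
&\prod_{j=1}^{n_{\phi}}\big(\beta_{j}^{1}+\beta_{j}^{2}-1\big)\geq\beta, \tag{29}\\
&\beta_{j}^{1}+\beta_{j}^{2}\geq 1,\quad j=1,\dots,n_{\phi}, \tag{30}
\end{align}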

so that the variables $\big{\{}\beta_{1},\cdots,\beta_{n_{\phi}}\big{\}}$ are dropped. In restating (25), (26) as (29), (30) we have used the fact that the variables $\big{\{}\beta_{1},\cdots,\beta_{n_{\phi}}\big{\}}$ are probability values, thus, must take non-negative values.
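To make the role of the slack variables concrete, the sketch below constructs one simple choice of $\{\beta_{j}^{1},\beta_{j}^{2}\}$ that satisfies the product and sum conditions on the slack variables: every component receives the same level, so that each factor $\beta_{j}^{1}+\beta_{j}^{2}-1$ equals $\beta^{1/n_{\phi}}$. This uniform allocation is an assumption made here for illustration (for instance, as a starting point for a solver); it is not a choice prescribed by the paper, which leaves the slack variables to the optimizer. The function names are hypothetical.

```python
def uniform_allocation(beta, n_phi):
    """Return n_phi pairs (beta_j1, beta_j2) whose factors multiply to beta."""
    per_component = beta ** (1.0 / n_phi)  # target value of each factor
    level = 0.5 * (1.0 + per_component)    # level + level - 1 == per_component
    return [(level, level) for _ in range(n_phi)]


def satisfies_product_and_sum(pairs, beta, tol=1e-12):
    """Check prod_j (b1 + b2 - 1) >= beta and b1 + b2 >= 1 for every pair."""
    product = 1.0
    for b1, b2 in pairs:
        if b1 + b2 < 1.0:
            return False
        product *= b1 + b2 - 1.0
    return product >= beta - tol


pairs = uniform_allocation(0.8, 3)
feasible = satisfies_product_and_sum(pairs, 0.8)
```

A non-uniform allocation can shift probability mass toward components or sides that are cheap to tighten, which is what optimizing over the slack variables exploits.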

Similarly, (27) and (28) are equivalent to

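$$\sum_{j=1}^{n_{\phi}}\Big(\sqrt{2\lambda_{j}}\,\big|\overline{M}_{ij}\big|\,\text{erf}^{-1}\big(2\beta_{j}^{\sigma_{ij}}-1\big)\Big)+\overline{M}_{i}\,\overline{\mu}(x)\leq m_{i},\quad i=1,\dots,n_{m}, \tag{31}$$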

where $\sigma_{ij}=1$ if $\overline{M}_{ij}\geq 0$ and $\sigma_{ij}=2$ if $\overline{M}_{ij}<0$ , so that the variables $Z\in\mathbb{R}^{n_{m}\times n_{\phi}}$ are dropped.

To sum up, if $\big{(}x,\{\beta_{j}^{1},\beta_{j}^{2}\}_{j=1}^{n_{\phi}}\big{)}$ satisfies the following set of deterministic constraints,

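\begin{align}
&\prod_{j=1}^{n_{\phi}}\big(\beta_{j}^{1}+\beta_{j}^{2}-1\big)\geq\beta, \tag{32a}\\
&\beta_{j}^{1}+\beta_{j}^{2}\geq 1,\quad j=1,\dots,n_{\phi}, \tag{32b}\\
&0\leq\beta_{j}^{\sigma}\leq 1,\quad j=1,\dots,n_{\phi},\ \sigma=1,2, \tag{32c}\\
&\sum_{j=1}^{n_{\phi}}\Big(\sqrt{2\lambda_{j}}\,\big|\overline{M}_{ij}\big|\,\text{erf}^{-1}\big(2\beta_{j}^{\sigma_{ij}}-1\big)\Big)+\overline{M}_{i}\,\overline{\mu}(x)\leq m_{i},\quad i=1,\dots,n_{m}, \tag{32d}
\end{align}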

where $\sigma_{ij}=1$ if $\overline{M}_{ij}\geq 0$ and $\sigma_{ij}=2$ if $\overline{M}_{ij}<0$ , then $x$ satisfies the joint chance constraint

$$\mathbb{P}\big(M\phi(x)\leq m\big)\geq\beta. \tag{33}$$

Note that the constraints (32c) come from the fact that the variables $\big{\{}\beta_{j}^{1},\beta_{j}^{2}\big{\}}_{j=1}^{n_{\phi}}$ are probability values. This completes the proof. $\blacksquare$

Note that $\theta$ , $\overline{M}$ , $\big{\{}\lambda_{j}\big{\}}_{j=1}^{n_{\phi}}$ , and the set of indices $\big{\{}\sigma_{ij}\big{\}}_{i=1,\cdots,n_{m},\,j=1,\cdots,n_{\phi}}$ are independent of $x\in\mathbb{R}^{n_{x}}$ and $\big{\{}\beta_{j}^{1},\beta_{j}^{2}\big{\}}_{j=1}^{n_{\phi}}$ , and thus can be determined before solving the optimization problem (5).

The significance of Theorem 1 is as follows. On the one hand, the deterministically constrained problem (5) is an analytical safe approximation to the original joint chance-constrained problem (4). On the other hand, (5) is a standard nonlinear programming problem with continuously differentiable cost and constraint functions, which does not require numerical integrations or sampling-based probability approximations and can hence be solved using standard nonlinear programming solvers.
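The deterministic inequality constraint of (5) that involves $\text{erf}^{-1}$ is cheap to evaluate. The sketch below computes its left-hand side under the simplifying assumption that $\Sigma$ is already diagonal, so that one may take $\theta=I$, $\overline{M}=M$, $\overline{\mu}=\mu$, and $\lambda_{j}=\Sigma_{jj}$; for a general $\Sigma$, the quantities $\theta$, $\overline{M}$, and $\lambda_{j}$ would first be obtained from an eigendecomposition. The function name and the numerical data are hypothetical, and the standard-library quantile `NormalDist().inv_cdf` replaces $\sqrt{2}\,\text{erf}^{-1}(2\beta-1)$.

```python
import math
from statistics import NormalDist


def constraint_lhs(M, mu, lambdas, betas):
    """Left-hand sides of the erf^{-1} constraint of (5), one per row of M.

    Assumes Sigma = diag(lambdas), hence theta = I and M_bar = M.
    betas[j] = (beta_j^1, beta_j^2), each strictly between 0 and 1.
    """
    quantile = NormalDist().inv_cdf
    lhs = []
    for row in M:
        total = 0.0
        for j, m_ij in enumerate(row):
            # sigma_ij = 1 if M_ij >= 0 and 2 otherwise (tuple index 0 or 1).
            beta = betas[j][0] if m_ij >= 0 else betas[j][1]
            # sqrt(2*lambda) * erfinv(2*beta - 1) == sqrt(lambda) * F^{-1}(beta)
            total += math.sqrt(lambdas[j]) * abs(m_ij) * quantile(beta)
            total += m_ij * mu[j]
        lhs.append(total)
    return lhs


# Illustrative data (assumptions): two components, two constraints, beta = 0.8.
M = [[1.0, 1.0], [1.0, -1.0]]
mu = [0.0, 0.0]
lambdas = [1.0, 4.0]
level = 0.5 * (1.0 + math.sqrt(0.8))  # equal split, product of factors is 0.8
betas = [(level, level), (level, level)]
m = [5.0, 5.0]
lhs = constraint_lhs(M, mu, lambdas, betas)
feasible = all(value <= bound for value, bound in zip(lhs, m))
```

In a nonlinear programming solver, `constraint_lhs` would be re-evaluated with $\mu(x)$ and the current slack variables at each iterate, while `M` and `lambdas` stay fixed.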

We note that the derivation of (5) relies on the diagonalization of a multivariate normal distribution in (6) and the exploitation of independent events in (8) $\implies$ (10). An alternative approach could be considered, which is to directly diagonalize $M\phi(x)$ as $M\phi(x)=\theta^{\prime}\psi^{\prime}(x)$ with $\psi^{\prime}(x)\sim\mathcal{N}\big(\mu^{\prime}(x),\text{diag}(\lambda_{1}^{\prime},\cdots,\lambda_{n_{m}}^{\prime})\big)$ so that (4b) can be written as $\mathbb{P}\big(\theta^{\prime}\psi^{\prime}(x)\leq m\big)\geq\beta$. One may hope to work with $\mathbb{P}\big(\psi^{\prime}(x)\leq(\theta^{\prime})^{-1}m\big)=\prod_{j=1}^{n_{m}}\mathbb{P}\big(\psi_{j}^{\prime}(x)\leq((\theta^{\prime})^{-1}m)_{j}\big)\geq\beta$. However, in general $\theta^{\prime}\psi^{\prime}(x)\leq m\nLeftrightarrow\psi^{\prime}(x)\leq(\theta^{\prime})^{-1}m$, even if $\theta^{\prime}$ is orthogonal. A straightforward counterexample is $\begin{bmatrix}-\frac{\sqrt{3}}{2}&\frac{1}{2}\\ \frac{1}{2}&\frac{\sqrt{3}}{2}\end{bmatrix}\begin{bmatrix}1\\ -1\end{bmatrix}\leq\begin{bmatrix}0\\ 0\end{bmatrix}$: the matrix is orthogonal and the inequality holds with $m=0$, yet $\psi^{\prime}=\begin{bmatrix}1&-1\end{bmatrix}^{\top}\leq(\theta^{\prime})^{-1}m=0$ fails because the first entry of $\psi^{\prime}$ is positive.
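The counterexample can be verified numerically. The short check below, written with the Python standard library only, confirms that the matrix is orthogonal, that $\theta^{\prime}\psi^{\prime}\leq m$ holds for $m=0$, and that $\psi^{\prime}\leq(\theta^{\prime})^{-1}m$ does not; the helper names are hypothetical.

```python
import math

theta = [[-math.sqrt(3) / 2, 0.5],
         [0.5, math.sqrt(3) / 2]]
psi = [1.0, -1.0]
m = [0.0, 0.0]


def matvec(A, v):
    """Matrix-vector product for nested lists."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def leq(u, v):
    """Componentwise u <= v."""
    return all(a <= b for a, b in zip(u, v))


# theta is orthogonal, so its inverse equals its transpose.
gram = [matvec(transpose(theta), col) for col in transpose(theta)]
lhs_holds = leq(matvec(theta, psi), m)             # theta' psi' <= m
rhs_holds = leq(psi, matvec(transpose(theta), m))  # psi' <= theta'^{-1} m
```

Here `lhs_holds` evaluates to true and `rhs_holds` to false, which is exactly the failure of the equivalence.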

IV Risk allocation using Boole’s inequality

Two alternative approaches to obtain analytical safe approximations to the joint chance-constrained problem (4) that have been extensively studied in the literature are: 1) to separate the joint constraint into multiple elementary constraints and exploit Boole’s inequality [25, 19, 21], and 2) to ensure that the $\beta$ -level confidence ellipsoid of the random vector $\phi(x)$ is contained in the constraint admissible set [28, 29]. It is possible to show that, in general, the second method is more conservative than the first method [19, 30]. Thus, we choose to use the first method as a comparison benchmark. It is briefly reviewed in this section.

The constraint (4b) can be written as

[TABLE]

which holds if there exists a set of probability values $\big{\{}\beta_{1},\cdots,\beta_{n_{m}}\big{\}}$ such that

[TABLE]

where $\frac{M_{i}\,\phi(x)-M_{i}\,\mu(x)}{\sqrt{M_{i}\,\Sigma M_{i}^{\top}}}\sim\mathcal{N}(0,1)$ , which is based on Boole’s inequality:

[TABLE]

Using the inverse error function $\text{erf}^{-1}(\cdot)$ , (36) is almost surely equivalent to

[TABLE]

To sum up, any feasible solution to the following deterministically constrained problem,

[TABLE]

is a feasible solution to the JCCP problem (4), where $\{\beta_{i}\}_{i=1}^{n_{m}}$ are slack variables.

Note that, similar to (5), (40) is also a nonlinear programming problem with continuously differentiable cost and constraint functions.
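One elementary way to see why a product rule for independent events can be less demanding than a union bound is to compare the per-event confidence levels each rule requires under an equal split. The sketch below is an illustration under stated assumptions, not the comparison carried out in the paper: it takes the common form of the Boole-based risk allocation, $\sum_{i}(1-\beta_{i})\leq 1-\beta$, uses the same number $n$ of events in both rules, and ignores the two-sided split into $\beta_{j}^{1}$ and $\beta_{j}^{2}$. The function names are hypothetical.

```python
def product_level(beta, n):
    """Equal per-event level so that the product of n levels equals beta."""
    return beta ** (1.0 / n)


def boole_level(beta, n):
    """Equal per-event level so that the n risks (1 - level) sum to 1 - beta."""
    return 1.0 - (1.0 - beta) / n


# The product rule never asks for a higher per-event level than the union bound.
comparison = [(beta, n, product_level(beta, n), boole_level(beta, n))
              for beta in (0.6, 0.8, 0.99)
              for n in (1, 2, 5, 20)]
```

The ordering follows from Bernoulli's inequality, $(1-\epsilon/n)^{n}\geq 1-\epsilon$ for $0\leq\epsilon\leq 1$. The gap between the two levels shrinks as $\beta$ approaches one.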

V Application to constrained control of linear Gaussian-Markov models

In this section, we use examples representing the constrained control of linear Gaussian-Markov models to illustrate the effectiveness of the analytical safe approximation (5) to the joint chance-constrained programming problem (4).

Consider a discrete-time linear Gaussian-Markov model,

[TABLE]

where 1) $\overline{x}_{0}$ , $\Sigma_{x}$ , and $\Sigma_{w}$ are given, 2) $\{w_{t}\}_{t\in\mathbb{Z}_{\geq 0}}$ are i.i.d., and 3) $\{w_{t}\}_{t\in\mathbb{Z}_{\geq 0}}$ are independent of $x_{0}$ .

The control objective is to minimize a quadratic cost function,

[TABLE]

where $Q^{\top}=Q\succeq 0$ , $R^{\top}=R\succ 0$ , $N$ is the prediction horizon, and $const.$ represents constant terms independent of $\{u_{0},\cdots,u_{N-1}\}$ , subject to the joint chance constraint,

[TABLE]

We let $\phi=\begin{bmatrix}y_{1}^{\top}&\cdots&y_{N}^{\top}\end{bmatrix}^{\top}$ . The covariance of $y_{i}$ and $y_{j}$ , $i\leq j$ , is given by

[TABLE]

which is used to construct the covariance matrix $\Sigma$ in the JCCP formulation (4). Once $\Sigma$ is obtained, we solve a standard eigenvalue problem to obtain the eigenvalues $\{\lambda_{1},\cdots,\lambda_{n}\}$ and a set of corresponding orthonormal eigenvectors $\{\nu_{1},\cdots,\nu_{n}\}$ of $\Sigma$ , after which the orthogonal matrix $\theta$ is constructed by $\theta=\begin{bmatrix}\nu_{1},\cdots,\nu_{n}\end{bmatrix}$ .

After transforming the control problem, i.e., the quadratic cost above subject to the joint chance constraint (43), into the form of (4), and further into its analytical safe approximation in the form of (5), we use the MATLAB function $\texttt{fmincon}$ with the interior-point method [31], a standard nonlinear programming solver, to solve for $\{u_{0},\cdots,u_{N-1}\}$.

To evaluate the effectiveness of our analytical safe approximation (5), once an optimal solution $\{u_{0},\cdots,u_{N-1}\}$ is obtained, we apply it to the open-loop system (41), where the disturbance signals $\{w_{0},\cdots,w_{N-1}\}$ are randomly generated according to (41d), and repeat the simulation $10{,}000$ times.
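The measured rate of constraint satisfaction from such repeated simulations can be estimated with a few lines of code. The sketch below uses a hypothetical scalar system $x_{t+1}=a\,x_{t}+b\,u_{t}+w_{t}$ with $w_{t}\sim\mathcal{N}(0,\texttt{noise\_std}^{2})$, a deterministic initial state, and the constraint $x_{t}\leq y_{\max}$ for $t=1,\cdots,N$. These are assumptions made for illustration and do not reproduce the model (41) or either example of the paper.

```python
import random


def measured_satisfaction_rate(a, b, x0, inputs, noise_std, y_max,
                               runs=10_000, seed=0):
    """Fraction of simulated runs in which x_t <= y_max holds for all t = 1..N."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    satisfied = 0
    for _ in range(runs):
        x = x0
        ok = True
        for u in inputs:
            # One step of the toy dynamics with additive Gaussian noise.
            x = a * x + b * u + rng.gauss(0.0, noise_std)
            if x > y_max:
                ok = False
                break
        satisfied += ok
    return satisfied / runs


# Open-loop input sequence applied to every run (illustrative values).
inputs = [0.5, 0.2, 0.0, -0.1, 0.0]
rate = measured_satisfaction_rate(a=0.9, b=1.0, x0=0.0, inputs=inputs,
                                  noise_std=0.3, y_max=1.5)
```

Comparing `rate` with the required level $\beta$ indicates how conservative a given input sequence is: a rate far above $\beta$ means that the constraints were tightened more than necessary.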

We note that here we consider open-loop control. Using the approach within receding-horizon optimal control, e.g., in stochastic model predictive control [9], to achieve closed-loop operation is a natural extension, which is left for future research.

To compare the performance of our analytical safe approximation (5) and the one (40) based on the exploitation of Boole’s inequality, we also use (40) to solve for $\{u_{0},\cdots,u_{N-1}\}$ and run the same experiment.

Comparison results are based on the following two examples:

Example 1: We consider the following model representing the double mass-spring-damper system shown in Fig. 1,

[TABLE]

We discretize the continuous-time model above using the MATLAB function $\texttt{c2d}$ with a sample time of $\Delta t=0.5$ [sec] to obtain the corresponding discrete-time model in the form of (41a). We consider

[TABLE]

and

[TABLE]

We test two cases: $\beta=0.6$ and $\beta=0.8$ . The responses of $y$ under the control input sequences $\{u_{0},\cdots,u_{N-1}\}$ solved based on (5) and (40) are shown in Fig. 2. It can be observed that, for both cases, $y_{1}$ and $y_{2}$ corresponding to the solutions of (5) get closer to the constraint boundaries compared to those corresponding to the solutions of (40).

We use two metrics to compare the relative degree of conservatism between (5) and (40): the cost value $J$ and the measured rate of constraint satisfaction $\overline{\beta}$, i.e., the proportion of simulation runs in which the constraint $\bigcap_{t=1}^{N}(y_{t}\leq y_{\max})$ is satisfied, corresponding to the solutions of (5) and (40). Note that since both solutions are feasible for the original control problem with the joint chance constraint (43), their cost values reflect their relative degree of conservatism. The comparison results for both cases are summarized in Table I. For both cases, the solution of (5) has a lower cost value and a measured rate of constraint satisfaction closer to the required value $\beta$ than the solution of (40).

Example 2: We consider the following model,

[TABLE]

which describes the short-period pitch attitude dynamics augmented by control actuator dynamics (elevator and flaperons) of an AFTI/F-16 aircraft at the flight condition of altitude $3000$ [feet] and Mach number $0.6$ . A continuous-time model is taken from [32] and discretized with sample time of $\Delta t=0.1$ [sec] to obtain the discrete-time model (50). Also, we consider

[TABLE]

and

[TABLE]

Similar to Example 1, we use the cost values $J$ and the measured rates of constraint satisfaction $\overline{\beta}$ corresponding to the solutions of (5) and (40) for different values of required confidence level $\beta\in[0.5,0.99]$ to compare their relative degree of conservatism. The comparison results are plotted in Fig. 3.

It can be observed that the solutions of (5) have significantly lower cost values than the solutions of (40). This can be explained by the fact that the measured rates of constraint satisfaction $\overline{\beta}$ corresponding to the solutions of (5) are much closer to the required values $\beta$ than those corresponding to the solutions of (40). As the solutions of (5) are capable of getting closer to the constraint boundaries, they lead to lower cost values.

Therefore, the above two examples illustrate that our analytical safe approximation (5) can have a considerably lower degree of conservatism than the analytical safe approximation (40) based on the exploitation of Boole’s inequality.

VI Discussions

Our analytical safe approximation (5) faces two challenges compared to the analytical safe approximation (40) based on Boole's inequality: 1) Under additional assumptions, including that the cost function is convex in $x$, the mean $\mu(x)$ is affine in $x$, and the required confidence level satisfies $\beta\geq 0.5$, it is possible to show that (40) is convex [19, 21]; in contrast, (5) is not convex in general due to the constraint (5b). 2) The formulation (5) involves more slack variables than (40) when $2n_{\phi}>n_{m}$. Both 1) and 2) result in a higher computational complexity of (5) than of (40). However, the computational effort of solving (5) is usually manageable and much lower than that of solving the original JCCP problem (4) using an approach based on numerical or sampling-based integration of multivariate distributions. For instance, the average and worst-case computation times for solving (5) in Example 2 are $4.933$ [sec] and $5.515$ [sec] using the MATLAB function $\texttt{fmincon}$ with the interior-point method in uncompiled code on a PC with an Intel Core i7-4790 3.60 GHz processor and 16.0 GB RAM. For comparison, the average and worst-case computation times for solving (40) in Example 2 in the same computing environment are $1.766$ [sec] and $2.666$ [sec].

VII Conclusions

In this paper, we proposed a new analytical safe approximation to joint chance-constrained programming problems with constraint functions additively dependent on normally-distributed random vectors. The approximation is a standard nonlinear program with continuously differentiable cost and constraint functions. Two examples representing the constrained control of linear Gaussian-Markov models were used to illustrate the effective application of our proposed analytical safe approximation and that our approximation can have a considerably lower degree of conservatism compared to a popularly used analytical safe approximation based on the exploitation of Boole’s inequality.

Bibliography

[1] A. Charnes, W. W. Cooper, and G. H. Symonds, “Cost horizons and certainty equivalents: an approach to stochastic programming of heating oil,” Management Science, vol. 4, no. 3, pp. 235–263, 1958.
[2] B. L. Miller and H. M. Wagner, “Chance constrained programming with joint constraints,” Operations Research, vol. 13, no. 6, pp. 930–945, 1965.
[3] A. Prekopa, “On probabilistic constrained programming,” in Proceedings of the Princeton symposium on mathematical programming, vol. 113. Princeton, NJ, 1970, p. 138.
[4] J. R. Birge and F. Louveaux, Introduction to stochastic programming. Springer Science & Business Media, 2011.
[5] D. Dentcheva, B. Lai, and A. Ruszczyński, “Dual methods for probabilistic optimization problems,” Mathematical Methods of Operations Research, vol. 60, no. 2, pp. 331–346, 2004.
[6] B. K. Pagnoncelli, S. Ahmed, and A. Shapiro, “Computational study of a chance constrained portfolio selection problem,” J. Optim. Theory Appl., vol. 142, no. 2, pp. 399–416, 2009.
[7] S. Talluri, R. Narasimhan, and A. Nair, “Vendor performance with supply risk: A chance-constrained DEA approach,” International Journal of Production Economics, vol. 100, no. 2, pp. 212–222, 2006.
[8] G. C. Calafiore and M. C. Campi, “The scenario approach to robust control design,” IEEE Transactions on Automatic Control, vol. 51, no. 5, pp. 742–753, 2006.