Computing Probabilistic Controlled Invariant Sets
Yulong Gao, Karl H. Johansson, and Lihua Xie

TL;DR
This paper introduces probabilistic controlled invariant sets (PCISs) for stochastic control systems, providing algorithms for their computation in discrete and continuous spaces, with applications demonstrated through motion planning simulations.
Contribution
It proposes finite- and infinite-horizon PCISs, explores their relation to robust invariant sets, and develops computational algorithms for practical control system applications.
Findings
Algorithms for PCIS computation are computationally tractable.
Finite-horizon PCISs converge with space discretization.
Infinite-horizon PCISs relate to stochastic backward reachable sets.
Table I: Comparison between this work and the most relevant literature.

| Reference | System | Invariant set | Control | Horizon | Computation |
| --- | --- | --- | --- | --- | --- |
| This paper | Markov controlled process | PCIS | Yes | Finite and infinite horizons | Iteration based on stochastic backward reachable set |
| [15] | Nonlinear stochastic system | PCIS | Yes | Finite and infinite horizons | No |
| [16] | Linear stochastic system | PCIS | Yes | One step | Ellipsoidal approximation |
| [17] | Linear stochastic system | Probabilistic invariant set | No | Infinite horizon | Polyhedral approximation based on Chebyshev's inequality |
Computing Probabilistic Controlled Invariant Sets
Yulong Gao, Karl H. Johansson, Fellow, IEEE, and Lihua Xie, Fellow, IEEE
The work of Y. Gao and K. H. Johansson is supported by the Knut and Alice Wallenberg Foundation, the Swedish Strategic Research Foundation, and the Swedish Research Council. Y. Gao and K. H. Johansson are with the Division of Decision and Control Systems, KTH Royal Institute of Technology, Stockholm 10044, Sweden. Y. Gao and L. Xie are with the School of Electrical and Electronic Engineering, Nanyang Technological University, 639798, Singapore.
Abstract
This paper investigates stochastic invariance for control systems through probabilistic controlled invariant sets (PCISs). As a natural complement to robust controlled invariant sets (RCISs), we propose finite- and infinite-horizon PCISs, and explore their relation to RCISs. We design iterative algorithms to compute the PCIS within a given set. For systems with discrete spaces, the computations of the finite- and infinite-horizon PCISs at each iteration are based on linear programming and mixed integer linear programming, respectively. The algorithms are computationally tractable and terminate in a finite number of steps. For systems with continuous spaces, we show how to discretize the spaces and prove the convergence of the approximation when computing the finite-horizon PCISs. In addition, it is shown that an infinite-horizon PCIS can be computed by the stochastic backward reachable set from the RCIS contained in it. These PCIS algorithms are applicable to practical control systems. Simulations are given to illustrate the effectiveness of the theoretical results for motion planning.
Index Terms:
stochastic control systems, reachability analysis, probabilistic controlled invariant set (PCIS)
I Introduction
I-A Motivation and Related Work
Invariance is a fundamental concept in systems and control [1, 2, 3]. A controlled invariant set captures the region where the states can be maintained by some admissible control inputs. Robust controlled invariant sets (RCISs) are defined for control systems with bounded external disturbances and address the invariance despite any realization of the disturbances. Over the past decades, RCISs and their computation have been studied extensively [4, 5, 6]. This paper studies probabilistic controlled invariant sets (PCISs), which are a natural complement to RCISs and are suitable for many applications. A PCIS is a set within which the controller is able to keep the system state with a certain probability. Such sets not only alleviate the inherent conservatism of RCISs by allowing probabilistic violations but also broaden the applicability of RCISs by being able to address unbounded disturbances. The study of PCISs is motivated by safety-critical control [7], stochastic model predictive control (MPC) [8, 9], reliable control [10, 11], and relevant applications, e.g., air traffic management systems [12, 13] and motion planning [14].
A question at the heart of this paper is
Given a set $\mathbb{Q}$ and a parameter $0 \le \epsilon \le 1$, how to compute a set $\tilde{\mathbb{Q}} \subseteq \mathbb{Q}$ that is invariant with probability $\epsilon$?
To the best of our knowledge, this question has not been explored up to now. One essential component in iterative approaches on computing RCISs is to compute the robust backward reachable set, in which each state can be steered to the current set by an admissible input for all possible uncertainties [4, 5, 6]. The PCIS computation in this paper follows the same idea, but the robust backward reachable set is replaced with the stochastic backward reachable sets which require different mathematical tools. Some challenges related to such an approach should be highlighted: (i) how to make it tractable to compute the stochastic backward reachable set, in particular for systems with continuous spaces; (ii) how to mitigate the conservatism when characterizing the stochastic backward reachable set subject to the prescribed probability; (iii) how to guarantee convergence of the iterations.
Controlled invariant sets have recently been extended to stochastic systems. In [18], a target set, which is similar to the PCIS of this paper, is used to define stabilization in probability. In [10], a reliable control set, another similar notion to a PCIS, is used to guarantee the reliability of Markov-jump linear systems. The reliability is further studied for such systems with bounded disturbances in [11]. A definition of PCIS for nonlinear systems is provided in [15] by using reachability analysis. It is later applied to portfolio optimization [19]. Another definition of probabilistic invariance originates from stochastic MPC [16] and captures one-step invariance. In [16], an ellipsoidal approximation is given for linear systems with specific uncertainty structure. Similar invariant sets are used in [20] to construct a convex lifting function for linear stochastic control systems. A definition of a probabilistic invariant set is proposed in [17, 21] for linear stochastic systems without control inputs. This definition captures the probabilistic inclusion of the state at each time instant. A recent work [22] explores the correspondence between probabilistic and robust invariant sets for linear systems. In [17, 21], polyhedral probabilistic invariant sets are approximated by using Chebyshev’s inequality for linear systems with Gaussian noise. Recursive satisfaction is usually computationally intractable for general stochastic control systems.
The results of this paper build on the above work but make significant additions and improvements. Table I summarizes the comparison between our work and the most relevant literature. (i) All the above references focus on some specific stochastic systems (e.g., linear or one-dimensional affine nonlinear systems) or on some specific class of stochastic disturbances (e.g., Gaussian or state-independent noise). In our model, we consider general Markov controlled processes, which include general system dynamics and stochastic disturbances. (ii) Different from [17, 21], our invariant sets are defined based on trajectory inclusion as in [15] and, particularly, incorporate control inputs constrained by a compact set. An accompanying question is how to find an admissible control input when verifying or computing a PCIS. (iii) The PCISs in this paper are different from the maximal probabilistic safe sets in [23]. Every trajectory in a PCIS is required by our definition to admit the same probability level, which does not hold for the maximal probabilistic safe set. (iv) The stochastic reachability analysis studied in [23] provides an important tool for maximizing the probability of staying in a set. Based on this, we compute a PCIS within a set with a prescribed probability level. This extends the results of [15, 23, 24].
I-B Main Contributions and Organization
The objective of this paper is to provide a novel tool to analyze invariance in stochastic control systems. The contributions are summarized as follows.
As the first contribution, we propose two novel definitions of PCIS: $N$-step $\epsilon$-PCIS and infinite-horizon $\epsilon$-PCIS (Definitions 3 and 4). An $N$-step $\epsilon$-PCIS is a set within which the state can stay for $N$ steps with probability $\epsilon$ under some admissible controller, while an infinite-horizon $\epsilon$-PCIS is a set within which the state can stay forever with probability $\epsilon$ under some admissible controller. These invariant sets are different from the ones proposed in [16, 17], which address probabilistic set invariance at each time step. Our definitions are applicable to general discrete-time stochastic control systems. We provide fundamental properties of PCISs and explore their relation to RCISs. Furthermore, we propose conditions for the existence of an infinite-horizon $\epsilon$-PCIS (Theorem 3).
The second contribution is that we design iterative algorithms to compute the largest finite- and infinite-horizon PCIS within a given set for systems with discrete and continuous spaces. The PCIS computation is based on the stochastic backward reachable set. For discrete state and control spaces, it is shown that at each iteration, the stochastic backward reachable set computation of an $N$-step $\epsilon$-PCIS can be reformulated as a linear program (LP) (Theorem 1 and Corollary 1) and that of an infinite-horizon $\epsilon$-PCIS as a computationally tractable mixed-integer linear program (MILP) (Theorem 4). Furthermore, we prove that these algorithms terminate in a finite number of steps. For continuous state and control spaces, we present a discretization procedure. Under weaker assumptions than [25], we prove the convergence of such approximations for $N$-step $\epsilon$-PCISs (Theorem 2). The approximations generalize the case in [23], which only discretizes the state space for a given discrete control space. Furthermore, in order to compute an infinite-horizon $\epsilon$-PCIS, we propose an algorithm based on the fact that an infinite-horizon PCIS always contains an RCIS.
The remainder of the paper is organized as follows. Section II provides the system model and some preliminaries. Section III presents the definition, properties, and computation algorithms of finite-horizon PCISs. Section IV extends the results to the infinite-horizon case. Examples in Section V illustrate the effectiveness of our approach. Section VI concludes this paper.
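Notation. Let $\mathbb{N}$ denote the set of nonnegative integers and $\mathbb{R}$ the set of real numbers. For some $q, s \in \mathbb{N}$ and $q < s$, let $\mathbb{N}_{\ge q}$ and $\mathbb{N}_{[q,s]}$ denote the sets $\{r \in \mathbb{N} \mid r \ge q\}$ and $\{r \in \mathbb{N} \mid q \le r \le s\}$, respectively. For two sets $\mathbb{X}$ and $\mathbb{Y}$, $\mathbb{X} \setminus \mathbb{Y} = \{x \mid x \in \mathbb{X}, x \notin \mathbb{Y}\}$ and $\mathbb{X} \triangle \mathbb{Y} = (\mathbb{X} \setminus \mathbb{Y}) \cup (\mathbb{Y} \setminus \mathbb{X})$. When $\le$, $\ge$, $<$, and $>$ are applied to vectors, they are interpreted element-wise. $\Pr$ denotes the probability. For a set $\mathbb{X}$, $\mathcal{B}(\mathbb{X})$ and $\mathcal{P}(\mathbb{X})$ denote the Borel $\sigma$-algebra generated by $\mathbb{X}$ and the space of probability distributions on $\mathbb{X}$, respectively. The indicator function of a set $\mathbb{X}$ is denoted by $\mathbf{1}_{\mathbb{X}}(x)$, that is, $\mathbf{1}_{\mathbb{X}}(x) = 1$ if $x \in \mathbb{X}$, and otherwise, $\mathbf{1}_{\mathbb{X}}(x) = 0$.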
II System Description and Preliminaries
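Consider a stochastic control system described by a Markov controlled process $\mathcal{S} = (\mathbb{X}, \mathbb{U}, T)$, where
- $\mathbb{X}$ is a state space endowed with a Borel $\sigma$-algebra $\mathcal{B}(\mathbb{X})$;
- $\mathbb{U}$ is a compact control space endowed with a Borel $\sigma$-algebra $\mathcal{B}(\mathbb{U})$;
- $T$ is a Borel-measurable stochastic kernel given $\mathbb{X} \times \mathbb{U}$, which assigns to each $x \in \mathbb{X}$ and $u \in \mathbb{U}$ a probability measure on the Borel space $(\mathbb{X}, \mathcal{B}(\mathbb{X}))$: $T(\cdot \mid x, u)$.
Let us denote by $\mathbb{U}_x$ the set of the admissible control actions for each $x \in \mathbb{X}$. Assume that $\mathbb{U}_x$ is nonempty for each $x \in \mathbb{X}$.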
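Consider a finite horizon $N \in \mathbb{N}$. A policy is said to be a Markov policy if the control inputs depend only on the current state, i.e., $u_k = \mu_k(x_k)$.
Definition 1
(Markov Policy) A Markov policy $\boldsymbol{\mu} = (\mu_0, \mu_1, \ldots, \mu_{N-1})$ for system $\mathcal{S}$ is a sequence of universally measurable maps
$$\mu_k : \mathbb{X} \to \mathbb{U}, \quad \mu_k(x) \in \mathbb{U}_x, \quad k \in \mathbb{N}_{[0,N-1]}.$$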
Remark 1
Given a space $\mathbb{Y}$, a subset $\mathbb{A}$ in this space is universally measurable if it is measurable with respect to every complete probability measure on $\mathbb{Y}$ that measures all Borel sets in $\mathcal{B}(\mathbb{Y})$. A function $\mu : \mathbb{Y} \to \mathbb{W}$ is universally measurable if $\mu^{-1}(\mathbb{A})$ is universally measurable in $\mathbb{Y}$ for every $\mathbb{A} \in \mathcal{B}(\mathbb{W})$. As stated in [23, 26], the condition of universal measurability is weaker than the condition of Borel measurability for showing the existence of a solution to a stochastic optimal control problem. Roughly speaking, this is because the projections of measurable sets are analytic sets and analytic sets are universally measurable but not always Borel measurable [26, 27].
Remark 2
For a large class of stochastic optimal control problems, Markov policies are sufficient to characterize the optimal policy [26]. Furthermore, since a randomized Markov policy does not increase the largest probability that the states remain in a set, we focus on deterministic Markov policies in the following.
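We denote the set of Markov policies as $\mathcal{M}$. Consider a set $\mathbb{Q} \in \mathcal{B}(\mathbb{X})$. Given an initial state $x_0 \in \mathbb{X}$ and a Markov policy $\boldsymbol{\mu} \in \mathcal{M}$, an execution is a sequence of states $(x_0, x_1, \ldots, x_N)$. Introduce the probability with which the state $x_k$ will remain within $\mathbb{Q}$ for all $k \in \mathbb{N}_{[0,N]}$:
$$p^{\boldsymbol{\mu}}_{N,\mathbb{Q}}(x_0) = \Pr\{x_k \in \mathbb{Q},\ \forall k \in \mathbb{N}_{[0,N]}\}.$$
Let $p^*_{N,\mathbb{Q}}(x) = \sup_{\boldsymbol{\mu} \in \mathcal{M}} p^{\boldsymbol{\mu}}_{N,\mathbb{Q}}(x)$, $\forall x \in \mathbb{X}$. We call $p^*_{N,\mathbb{Q}}(x)$ the $N$-step invariance probability at $x$ in the set $\mathbb{Q}$. Following the dynamic program (DP) in [23], define the value function $V^*_{k,\mathbb{Q}} : \mathbb{X} \to [0,1]$, $k \in \mathbb{N}_{[0,N]}$, by the backward recursion:
$$V^*_{k,\mathbb{Q}}(x) = \sup_{u \in \mathbb{U}_x} \mathbf{1}_{\mathbb{Q}}(x) \int_{\mathbb{Q}} V^*_{k+1,\mathbb{Q}}(y)\, T(dy \mid x, u), \quad k \in \mathbb{N}_{[0,N-1]}, \tag{1}$$
with initialization $V^*_{N,\mathbb{Q}}(x) = \mathbf{1}_{\mathbb{Q}}(x)$.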
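For a finite state and control space the integral in the backward recursion becomes a sum, and the recursion can be evaluated directly. The following is a minimal sketch under that assumption; the nested-dictionary layout `T[x][u][y]`, the function name `finite_horizon_values`, and the three-state example are illustrative choices and not part of the paper.

```python
from typing import Dict, List, Set

# Hypothetical encoding: T[x][u][y] is the probability of moving
# from state x to state y under the admissible action u.
Kernel = Dict[int, Dict[str, Dict[int, float]]]


def finite_horizon_values(T: Kernel, Q: Set[int], N: int) -> List[Dict[int, float]]:
    """Return [V_0, ..., V_N] computed by the backward recursion on the set Q."""
    V: List[Dict[int, float]] = [dict() for _ in range(N + 1)]
    # Initialization: V_N is the indicator function of Q.
    V[N] = {x: 1.0 if x in Q else 0.0 for x in T}
    for k in range(N - 1, -1, -1):
        for x in T:
            if x not in Q:
                V[k][x] = 0.0  # the indicator factor vanishes outside Q
                continue
            # Maximize over admissible actions; only successors inside Q count.
            V[k][x] = max(
                sum(p * V[k + 1][y] for y, p in T[x][u].items() if y in Q)
                for u in T[x]
            )
    return V


# Illustrative system: state 0 is absorbing, state 2 is unsafe.
T_example: Kernel = {
    0: {"a": {0: 1.0}},
    1: {"a": {0: 0.5, 2: 0.5}, "b": {1: 0.9, 2: 0.1}},
    2: {"a": {2: 1.0}},
}
V_short = finite_horizon_values(T_example, {0, 1}, 2)
V_long = finite_horizon_values(T_example, {0, 1}, 10)
```

In this example, action `b` is preferable at state 1 over a short horizon, while over a long horizon the value settles at the probability of jumping to the absorbing state 0 with action `a`, which illustrates why the maximizing action may depend on the time index.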
Assumption 1
The set
[TABLE]
is compact for all , , and .
Lemma 1
[23] For all $x \in \mathbb{Q}$, $p^*_{N,\mathbb{Q}}(x) = V^*_{0,\mathbb{Q}}(x)$. If Assumption 1 holds, the optimal Markov policy exists and is given by
[TABLE]
Extending the finite horizon to infinite horizon, we need to introduce stationary Markov policies.
Definition 2
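(Stationary Markov Policy) A Markov policy $\boldsymbol{\mu} \in \mathcal{M}$ is said to be stationary if $\boldsymbol{\mu} = (\bar{\mu}, \bar{\mu}, \ldots)$ with $\bar{\mu} : \mathbb{X} \to \mathbb{U}$ universally measurable.
Given an initial state $x_0 \in \mathbb{X}$ and a stationary Markov policy $\boldsymbol{\mu}$, an execution is denoted by a sequence of states $(x_0, x_1, \ldots)$. We introduce the probability with which the state $x_k$ will remain within $\mathbb{Q}$ for all $k \in \mathbb{N}$:
$$p^{\boldsymbol{\mu}}_{\infty,\mathbb{Q}}(x_0) = \Pr\{x_k \in \mathbb{Q},\ \forall k \in \mathbb{N}\}.$$
Denote $p^*_{\infty,\mathbb{Q}}(x) = \sup_{\boldsymbol{\mu} \in \mathcal{M}} p^{\boldsymbol{\mu}}_{\infty,\mathbb{Q}}(x)$. We call $p^*_{\infty,\mathbb{Q}}(x)$ the infinite-horizon invariance probability at $x$ in the set $\mathbb{Q}$. Define the value function $G^*_{k,\mathbb{Q}} : \mathbb{X} \to [0,1]$, $k \in \mathbb{N}$, through the forward recursion:
$$G^*_{k+1,\mathbb{Q}}(x) = \sup_{u \in \mathbb{U}_x} \mathbf{1}_{\mathbb{Q}}(x) \int_{\mathbb{Q}} G^*_{k,\mathbb{Q}}(y)\, T(dy \mid x, u), \tag{2}$$
initialized with $G^*_{0,\mathbb{Q}}(x) = \mathbf{1}_{\mathbb{Q}}(x)$.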
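On a finite system the forward recursion can be iterated until successive value functions agree up to a tolerance. The sketch below assumes such a finite system; the stopping rule, the name `infinite_horizon_values`, and the example kernel are assumptions made for illustration, and the result is in general only a numerical approximation of the limit.

```python
from typing import Dict, Set

# Hypothetical encoding: T[x][u][y] = transition probability x -> y under u.
Kernel = Dict[int, Dict[str, Dict[int, float]]]


def infinite_horizon_values(T: Kernel, Q: Set[int],
                            tol: float = 1e-10,
                            max_iter: int = 10_000) -> Dict[int, float]:
    """Iterate the forward recursion from the indicator of Q until it stalls."""
    G = {x: 1.0 if x in Q else 0.0 for x in T}
    for _ in range(max_iter):
        G_next = {}
        for x in T:
            if x not in Q:
                G_next[x] = 0.0
                continue
            G_next[x] = max(
                sum(p * G[y] for y, p in T[x][u].items() if y in Q)
                for u in T[x]
            )
        # The iterates are nonincreasing; stop once the change is negligible.
        if max(abs(G_next[x] - G[x]) for x in T) < tol:
            return G_next
        G = G_next
    return G


T_example: Kernel = {
    0: {"a": {0: 1.0}},
    1: {"a": {0: 0.5, 2: 0.5}, "b": {1: 0.9, 2: 0.1}},
    2: {"a": {2: 1.0}},
}
G_inf = infinite_horizon_values(T_example, {0, 1})
```

For this example the limit at state 1 lies strictly between zero and one: the best the controller can do is to jump to the absorbing state 0 with probability one half.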
Assumption 2
There exists a such that the set
[TABLE]
is compact for all , , and .
Lemma 2
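[23] Suppose that Assumption 2 holds. Then, for all $x \in \mathbb{Q}$, the limit $G^*_{\infty,\mathbb{Q}}(x) = \lim_{k \to \infty} G^*_{k,\mathbb{Q}}(x)$ exists and satisfies
$$G^*_{\infty,\mathbb{Q}}(x) = \sup_{u \in \mathbb{U}_x} \mathbf{1}_{\mathbb{Q}}(x) \int_{\mathbb{Q}} G^*_{\infty,\mathbb{Q}}(y)\, T(dy \mid x, u), \tag{3}$$
and $p^*_{\infty,\mathbb{Q}}(x) = G^*_{\infty,\mathbb{Q}}(x)$. Furthermore, an optimal stationary Markov policy exists and is given by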
[TABLE]
In the following two sections, we explore finite- and infinite-horizon PCISs and how to compute them.
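III Finite-Horizon $\epsilon$-PCIS
In this section, we first define the finite-horizon $\epsilon$-PCIS for the system $\mathcal{S}$ and provide the properties of this set. Then, we explore how to compute the finite-horizon $\epsilon$-PCIS within a given set.
Definition 3
($N$-step $\epsilon$-PCIS) Consider a stochastic control system $\mathcal{S}$. Given a confidence level $0 \le \epsilon \le 1$, a set $\mathbb{Q} \in \mathcal{B}(\mathbb{X})$ is an $N$-step $\epsilon$-PCIS for $\mathcal{S}$ if for any $x \in \mathbb{Q}$, there exists at least one Markov policy $\boldsymbol{\mu} \in \mathcal{M}$ such that $p^{\boldsymbol{\mu}}_{N,\mathbb{Q}}(x) \ge \epsilon$.
We define the stochastic backward reachable set $\mathbb{S}^*_{\epsilon,N}(\mathbb{Q})$ by collecting all the states $x \in \mathbb{Q}$ at which the $N$-step invariance probability satisfies $p^*_{N,\mathbb{Q}}(x) \ge \epsilon$, i.e.,
$$\mathbb{S}^*_{\epsilon,N}(\mathbb{Q}) = \{x \in \mathbb{Q} \mid p^*_{N,\mathbb{Q}}(x) \ge \epsilon\}.$$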
If , it yields from that is also Borel-measurable. If , the following lemma addresses the measurability of the set .
Lemma 3
For any , the set is universally measurable.
Proof:
See Appendix A. ∎
Let us denote by $\mathcal{P}(\mathbb{X})$ the set of all probability measures on $\mathbb{X}$. The following proposition shows that, although the stochastic backward reachable set is only universally measurable, for any probability measure on $\mathbb{X}$ one can find a Borel-measurable set whose difference from the stochastic backward reachable set has measure zero.
Proposition 1
For any and any , there exists a set with such that .
Proof:
It follows from the universal measurability of as shown in Lemma 3, the Borel measurability of , , and Lemma 7.26 in [26]. ∎
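From Lemma 1 and the definition of the stochastic backward reachable set $\mathbb{S}^*_{\epsilon,N}(\mathbb{Q})$, we can verify whether a set $\mathbb{Q}$ is an $N$-step $\epsilon$-PCIS by checking whether either $\mathbb{S}^*_{\epsilon,N}(\mathbb{Q}) = \mathbb{Q}$, or $V^*_{0,\mathbb{Q}}(x) \ge \epsilon$, $\forall x \in \mathbb{Q}$, where $V^*_{0,\mathbb{Q}}$ is defined in (1).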
Remark 3
The stochastic backward reachable set $\mathbb{S}^*_{\epsilon,N}(\mathbb{Q})$ is called the maximal probabilistic safe set in [23]. The $N$-step $\epsilon$-PCIS in Definition 3 refines the maximal probabilistic safe set by requiring that for any initial state $x_0 \in \mathbb{Q}$, the $N$-step invariance probability is no less than $\epsilon$.
In the following, we show that finite-horizon PCISs are closed under union.
Proposition 2
Consider a collection of sets , . If each is an -step -PCIS for the same system , then the union is an -step -PCIS, where and .
Proof:
The result follows from the following two facts:
(i) for any with , , and ;
(ii) for any with , , and . ∎
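III-A Finite-horizon $\epsilon$-PCIS computation
This subsection will address the following problem.
Problem 1
Given a set $\mathbb{Q} \in \mathcal{B}(\mathbb{X})$ and a prescribed probability $0 \le \epsilon \le 1$, compute an $N$-step $\epsilon$-PCIS $\tilde{\mathbb{Q}} \subseteq \mathbb{Q}$.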
To handle this problem, our basic idea is to iteratively compute stochastic backward reachable sets until convergence. A general procedure is presented in the following algorithm.
In Algorithm 1, we first compute the stochastic backward reachable set within and then update to be the corresponding Borel-measurable set , which is tailored by picking up a such that (see Proposition 1). The following theorem shows convergence of . The terminal condition guarantees that the resulting set by this algorithm is an -step -PCIS .
Theorem 1
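Let Assumption 1 hold. For any $\mathbb{Q} \in \mathcal{B}(\mathbb{X})$, Algorithm 1 converges, i.e., the limit of the sequence of sets it generates exists. If this limit is nonempty, it is the largest $N$-step $\epsilon$-PCIS within $\mathbb{Q}$.
Proof:
From Algorithm 1 and Lemma 1, we have that if the termination condition does not hold, the updated set is a strict subset of the current set. It follows that the sequence of sets is nonincreasing. Then,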
[TABLE]
which implies the existence of the limit. Furthermore, if the limit is nonempty, we conclude that it is the largest $N$-step $\epsilon$-PCIS within $\mathbb{Q}$ based on fixed-point theory. ∎
To facilitate the practical implementation of Algorithm 1, we need to address two important properties: the computational tractability of the $N$-step invariance probability at each state of the current set, and the finite-step convergence of Algorithm 1. In the following, we will derive these two properties for discrete and continuous spaces, respectively. It is shown that if the spaces are discrete, the properties are guaranteed and, in particular, at each iteration we only need to solve an LP to compute the exact value of the $N$-step invariance probability. If the spaces are continuous, we will design a discretization algorithm with a convergence guarantee, which enables us to preserve the above two properties.
III-A1 Discrete state and control spaces
If the state and control spaces are discrete, i.e., they are finite sets, the stochastic kernel $T(y \mid x, u)$ denotes the transition probability from state $x$ to state $y$ under control action $u$, which satisfies $\sum_{y \in \mathbb{X}} T(y \mid x, u) = 1$, $\forall x \in \mathbb{X}$ and $\forall u \in \mathbb{U}_x$.
In this case, according to Theorem 1 of [28], we can exactly compute the $N$-step invariance probability $p^*_{N,\mathbb{Q}}(x)$ via an LP. Moreover, the existence of the optimal Markov policy is always guaranteed.
Lemma 4
Given any set , the value functions in (1) can be obtained by solving an LP:
[TABLE]
which gives , and , where is the optimal solution of (4). The optimal Markov policy is given by where is such that
[TABLE]
Proof:
See Theorem 1 in [28] for the proof. ∎
Corollary 1
For discrete state and control spaces, Algorithm 1 converges in a finite number of iterations. Furthermore, at each iteration, the $N$-step invariance probability at every state of the current set can be computed via the LP (4) and the corresponding optimal policy is determined by (5).
Proof:
The finite-step convergence of Algorithm 1 follows from Theorem 1 and the finite cardinality of . The remaining part follows from Lemma 4. ∎
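The iteration described for discrete spaces can be sketched as follows: compute the $N$-step invariance probability on the current set, keep the states that meet the level $\epsilon$, and stop when nothing is removed. This is a sketch only. It evaluates the value function by the backward recursion rather than by the LP (4), and the names `invariance_probability`, `largest_n_step_pcis`, and the example kernel are hypothetical.

```python
from typing import Dict, Set

# Hypothetical encoding: T[x][u][y] = transition probability x -> y under u.
Kernel = Dict[int, Dict[str, Dict[int, float]]]


def invariance_probability(T: Kernel, P: Set[int], N: int) -> Dict[int, float]:
    """N-step invariance probability on P via backward recursion (not the LP)."""
    V = {x: 1.0 if x in P else 0.0 for x in T}
    for _ in range(N):
        V = {
            x: max(
                sum(p * V[y] for y, p in T[x][u].items() if y in P)
                for u in T[x]
            ) if x in P else 0.0
            for x in T
        }
    return V


def largest_n_step_pcis(T: Kernel, Q: Set[int], N: int, eps: float) -> Set[int]:
    """Shrink Q until every remaining state meets the probability level eps."""
    P = set(Q)
    while True:
        V0 = invariance_probability(T, P, N)
        # Stochastic backward reachable set within the current set P.
        P_next = {x for x in P if V0[x] >= eps}
        if P_next == P:  # fixed point reached (possibly the empty set)
            return P
        P = P_next


T_example: Kernel = {
    0: {"a": {0: 1.0}},
    1: {"a": {0: 0.5, 2: 0.5}, "b": {1: 0.9, 2: 0.1}},
    2: {"a": {2: 1.0}},
}
pcis_loose = largest_n_step_pcis(T_example, {0, 1}, N=2, eps=0.8)
pcis_tight = largest_n_step_pcis(T_example, {0, 1}, N=2, eps=0.85)
```

Because each non-terminating pass removes at least one state from a finite set, the loop ends after at most as many passes as there are states in the initial set.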
Remark 4
When implementing Algorithm 1 to a system with discrete spaces, the maximal number of iterations is . At each iteration, an LP is solved to compute the value of , . The number of the decision values in the LP is at most and the number of constraints is at most . It follows from [29] that Algorithm 1 can be implemented in time.
III-A2 Continuous state and control action spaces
In order to preserve the computational tractability of and the finite-step convergence of Algorithm 1, if the state and control spaces are both continuous, we first discretize the spaces with convergence guarantee. Then, we adapt Algorithm 1 to compute an approximate -step -PCIS within a given set.
Assume that and for some . For simplicity, we use Euclidean metric for the spaces and . For any , we define where denotes the Lebesgue measure of sets. We suppose that the stochastic kernel admits a density , which represents the probability density of given the current state and the control action .
Now we consider Problem 1, where we assume that the given set is compact, which implies that is bounded. We further suppose that the density function satisfies the following assumption.
Assumption 3
There exists a constant such that for any , and ,
[TABLE]
Discretization
We discretize the compact set into pair-wise disjoint nonempty Borel sets , , i.e., . We pick a representative state from each set , denoted by . Let , , and .
Similarly, the compact control space is divided into pair-wise disjoint nonempty Borel sets , , i.e., . We pick a representative element from the set , denoted by . Let , , and .
Let the grid size be a constant . For each , define the set of admissible discrete control actions as
[TABLE]
where is the representative state of to which belongs, i.e., if . Following [25], the following lemma shows that each has a nonempty admissible discretized control set.
Lemma 5
For each , the set is nonempty and , .
Proof:
Since the admissible control set is nonempty, , there exists such that , . Hence, by the definition of , we have that the set is nonempty for each . Furthermore, from (6), it is easy to obtain that , . ∎
As in [25], let us define the function
[TABLE]
From (7), we observe that all states enjoy the same stochastic kernel. An approximate stochastic control system is given by a triple . Here the transition probability is defined by , where with and , and .
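To make the construction of a finite approximate system concrete, the sketch below grids a one-dimensional set and evaluates cell-to-cell transition probabilities at the representative states. Everything specific here is an assumption for illustration: the scalar dynamics $x^+ = a x + u + w$ with Gaussian noise, the midpoint representatives, and the name `grid_kernel`. It is one common way to build such an approximation and is not claimed to coincide with the normalization used in (7).

```python
import math
from typing import List


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def grid_kernel(lo: float, hi: float, m: int, controls: List[float],
                a: float = 1.0, sigma: float = 0.5) -> List[List[List[float]]]:
    """Return P with P[i][j][l] = Pr(next state in cell l | cell i, control j).

    The interval [lo, hi] is split into m equal cells and each cell is
    represented by its midpoint. The assumed dynamics are x+ = a*x + u + w
    with w ~ N(0, sigma^2).
    """
    width = (hi - lo) / m
    reps = [lo + (i + 0.5) * width for i in range(m)]
    P = []
    for x in reps:
        per_control = []
        for u in controls:
            mean = a * x + u
            # Mass of the Gaussian density over each cell, evaluated at the
            # representative state; mass outside [lo, hi] is simply lost.
            row = [
                normal_cdf((lo + (l + 1) * width - mean) / sigma)
                - normal_cdf((lo + l * width - mean) / sigma)
                for l in range(m)
            ]
            per_control.append(row)
        P.append(per_control)
    return P


P_grid = grid_kernel(-1.0, 1.0, m=4, controls=[-0.5, 0.0, 0.5])
```

Each row sums to less than one because some probability mass leaves the gridded set, which matches the recursion (1) where the integral is taken over the set only.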
Approximation of PCISs
For the approximate system , the discretized version of the DP (1) is given by
[TABLE]
For each , . We define the discretized optimal Markov policy as
[TABLE]
For each , .
Remark 5
Since the state and control action spaces of the approximated system are finite, the value of can be computed via the LP (4) and the corresponding optimal policy can be determined by (5). In addition, all the states in each share the same approximate -step invariance probability and optimal policy as the representative state .
Lemma 6
Under Assumptions 1 and 3, the functions and satisfy that ,
[TABLE]
where
[TABLE]
Proof:
See Appendix B. ∎
Remark 6
Lemma 6 guarantees convergence as the grid size tends to zero and generalizes the case considered in [23], which only discretizes the state space for a given finite control space. To prove Lemma 6, we need to show that (i) the value functions in (1) are Lipschitz continuous (Lemma 8), which is similar to Theorem 8 in [23], and (ii) the difference between the approximate density function and the original density function is bounded (Lemma 9), which is different from that in [23].
Theorem 2
Let Assumptions 1 and 3 hold. Consider a compact set and a corresponding discretized set of . If is an -step -PCIS for the approximate system , and , the set is an -step -PCIS for the system , where .
Proof:
According to the construction of the discretized system , we have that , and , . Since is an -step -PCIS, it follows that , . By Lemma 6 and triangular inequality, we have
[TABLE]
Then, when , we conclude that the set is an -step -PCIS where . ∎
Remark 7
From Theorem 2, if , by choosing a suitable grid size , the problem of computing an -step -PCIS within for can be transformed into that of computing an approximate -step -PCIS with probability for .
Computation algorithm
Assume that a probability level is given. After discretizing the set and the control space , we modify Algorithm 1 to compute an -step -PCIS , as shown in the following.
In Algorithm 2, we first construct an approximate system with grid size . Then, following similar steps as in Algorithm 1, we compute the stochastic backward reachable set iteratively for the system . At each iteration, an LP is solved to obtain the -step invariance probability. One difference is that the stochastic backward reachable set is computed with respect to and the updated set for the system is the union of the subsets of corresponding to the stochastic backward reachable set. By Theorem 2, the resulting set by Algorithm 2 is an -step -PCIS.
Corollary 2
Let Assumptions 1 and 3 hold. For continuous state and control spaces, Algorithm 2 converges in a finite number of iterations and generates an -step -PCIS. Furthermore, at each iteration, the -step invariance probability , , can be computed via the LP (4) and the corresponding optimal policy is determined by (5).
Proof:
By Theorem 2 and the Borel measurability of the subsets , it follows that the set generated by Algorithm 2 is an -step -PCIS. The remaining part is similar to the proof of Corollary 1. ∎
Remark 8
When implementing Algorithm 2 to a system with continuous spaces, it follows from [29] that Algorithm 2 can be implemented in time, cf. Remark 4.
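IV Extension to Infinite-horizon $\epsilon$-PCIS
Now let us extend finite-horizon $\epsilon$-PCISs to infinite-horizon $\epsilon$-PCISs. In this section, we define the infinite-horizon $\epsilon$-PCIS and explore the conditions for its existence. Furthermore, we provide algorithms to compute an infinite-horizon $\epsilon$-PCIS within a given set.
Definition 4
(Infinite-horizon PCIS) Consider a stochastic control system $\mathcal{S}$. Given a confidence level $0 \le \epsilon \le 1$, a set $\mathbb{Q} \in \mathcal{B}(\mathbb{X})$ is an infinite-horizon $\epsilon$-PCIS for $\mathcal{S}$ if for any $x \in \mathbb{Q}$, there exists at least one stationary Markov policy $\boldsymbol{\mu} \in \mathcal{M}$ such that $p^{\boldsymbol{\mu}}_{\infty,\mathbb{Q}}(x) \ge \epsilon$.
We define the stochastic backward reachable set $\mathbb{S}^*_{\epsilon,\infty}(\mathbb{Q})$ by collecting all the states $x \in \mathbb{Q}$ at which the infinite-horizon invariance probability satisfies $p^*_{\infty,\mathbb{Q}}(x) \ge \epsilon$, i.e.,
$$\mathbb{S}^*_{\epsilon,\infty}(\mathbb{Q}) = \{x \in \mathbb{Q} \mid p^*_{\infty,\mathbb{Q}}(x) \ge \epsilon\}.$$
For the infinite-horizon case, Lemma 3 and Proposition 1 still hold. That is, the set $\mathbb{S}^*_{\epsilon,\infty}(\mathbb{Q})$ is universally measurable and, for any probability measure on $\mathbb{X}$, there exists a Borel-measurable set whose difference from $\mathbb{S}^*_{\epsilon,\infty}(\mathbb{Q})$ has measure zero.
Under Assumption 2, by Lemma 2 and the definition of $\mathbb{S}^*_{\epsilon,\infty}(\mathbb{Q})$, we can verify whether a set $\mathbb{Q}$ is an infinite-horizon $\epsilon$-PCIS by checking whether either $\mathbb{S}^*_{\epsilon,\infty}(\mathbb{Q}) = \mathbb{Q}$, or $G^*_{\infty,\mathbb{Q}}(x) \ge \epsilon$, $\forall x \in \mathbb{Q}$, where $G^*_{\infty,\mathbb{Q}}$ is defined by (2)–(3).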
Definition 5
Consider a stochastic control system $\mathcal{S}$. An RCIS for $\mathcal{S}$ is an $N$-step $\epsilon$-PCIS with $N = 1$ and $\epsilon = 1$.
Remark 9
Another interpretation of the RCIS in Definition 5 is that a set $\mathbb{Q}$ is an RCIS if for any $x \in \mathbb{Q}$, there exists at least one control input $u \in \mathbb{U}_x$ such that $T(\mathbb{Q} \mid x, u) = 1$. It is easy to verify that an RCIS is also an infinite-horizon $\epsilon$-PCIS with $\epsilon = 1$. It is called an absorbing set in [30], where there is no control input. In the following, we show that the RCIS plays an important role in the existence of infinite-horizon PCISs and show how to design an algorithm to compute such a PCIS based on an RCIS.
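For a finite system, the one-step characterization in Remark 9 suggests a simple fixed-point computation of the largest RCIS inside a given set: repeatedly discard states that have no action keeping all probability mass inside the current set. The sketch below assumes a finite system; the name `largest_rcis`, the tolerance, and the example kernel are illustrative and not taken from the paper.

```python
from typing import Dict, Set

# Hypothetical encoding: T[x][u][y] = transition probability x -> y under u.
Kernel = Dict[int, Dict[str, Dict[int, float]]]


def largest_rcis(T: Kernel, Q: Set[int], tol: float = 1e-12) -> Set[int]:
    """Largest subset R of Q such that every x in R has an action with T(R|x,u)=1."""
    R = set(Q)
    while True:
        keep = {
            x for x in R
            if any(
                sum(p for y, p in T[x][u].items() if y in R) >= 1.0 - tol
                for u in T[x]
            )
        }
        if keep == R:  # no state was discarded: R is robustly invariant
            return R
        R = keep


T_example: Kernel = {
    0: {"a": {0: 1.0}},
    1: {"a": {0: 0.5, 2: 0.5}, "b": {1: 0.9, 2: 0.1}},
    2: {"a": {2: 1.0}},
}
rcis_small = largest_rcis(T_example, {0, 1})
rcis_full = largest_rcis(T_example, {0, 1, 2})
```

In the example, state 1 is discarded from the set {0, 1} because both of its actions leak probability mass to state 2, whereas the whole state space is trivially an RCIS.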
Remark 10
Note that infinite-horizon -PCISs are also closed under union, as shown in Proposition 2 when is replaced by .
IV-A Existence of infinite-horizon PCIS
Intuitively, the monotone decrease of the value functions $G^*_{k,\mathbb{Q}}$ in (2) may suggest that the limit $G^*_{\infty,\mathbb{Q}}(x)$ is either one or zero. However, it is possible to get $0 < G^*_{\infty,\mathbb{Q}}(x) < 1$ in some cases (see Examples 1 and 2 in Section V). The following theorem provides necessary conditions and sufficient conditions for the existence of an infinite-horizon $\epsilon$-PCIS.
Theorem 3
Suppose that Assumption 2 holds and let be fixed. Given a nonempty set , let be the control input such that (3) holds for each . The set is an infinite-horizon -PCIS
- (i) only if there exists an RCIS such that ,
[TABLE]
where ;
- (ii) if there exists an RCIS such that ,
[TABLE]
Proof:
See Appendix C. ∎
Remark 11
The value of is the largest probability that the next state remains outside the RCIS from any under the optimal stationary Markov policy in Lemma 2. Note that is the gap between the necessary condition and the sufficient condition. In addition, the second item in (10)–(11) denotes the probability that the state is steered into the RCIS by two transitions from with an intermediate state outside .
Corollary 3
Suppose that Assumption 2 holds and let be fixed. A nonempty set is an infinite-horizon -PCIS
- (i) only if there exists an RCIS such that , for some ;
- (ii) if there exists an RCIS such that , for some .
Proof:
See Appendix D. ∎
Remark 12
A nonempty set is an infinite-horizon -PCIS if there exists an RCIS such that , for some . This implication will facilitate the design of an algorithm for an infinite-horizon -PCIS, see Algorithm 4.
Remark 13
Considering the similarity between the reliability defined in [11] and the infinite-horizon invariance probability in this paper, we can extend the results on infinite-horizon PCISs, including the existence condition above and the computational algorithms in the following, to the reliable control set in [10] for general stochastic systems.
IV-B Infinite-horizon $\epsilon$-PCIS computation
This subsection will address the following problem.
Problem 2
Given a set $\mathbb{Q} \in \mathcal{B}(\mathbb{X})$ and a prescribed probability $0 \le \epsilon \le 1$, compute an infinite-horizon $\epsilon$-PCIS $\tilde{\mathbb{Q}} \subseteq \mathbb{Q}$.
To handle this problem, the key point is to compute the infinite-horizon invariance probability $p^*_{\infty,\mathbb{Q}}(x)$. For discrete spaces, it is shown that a computationally tractable MILP can be used to compute the exact value of $p^*_{\infty,\mathbb{Q}}(x)$. In this case, we can compute the largest infinite-horizon $\epsilon$-PCIS by iteratively computing the stochastic backward reachable sets until convergence. For continuous spaces, it is in general computationally intractable to compute $p^*_{\infty,\mathbb{Q}}(x)$, and the discretization method fails to work since the approximation error in (8) increases with the horizon. In this case, we design another computational algorithm based on the sufficient conditions in Remark 12.
IV-B1 Discrete state and control spaces
If the state and control spaces are discrete, we adopt the same assumptions as in Section III-A1. We will first show how to compute the exact value of in (2)–(3) through an MILP. Then, we will adapt Algorithm 1 to compute the largest infinite-horizon -PCIS within a given set.
MILP reformulation
Since the zero function, $G^*_{\infty,\mathbb{Q}}(x) = 0$ for all $x \in \mathbb{Q}$, is a trivial solution of (3), we cannot directly reformulate (2)–(3) as an LP, which is the traditional way to deal with infinite-horizon stochastic optimal control problems [31].
The following lemma provides a computationally tractable MILP reformulation when computing .
Lemma 7
Given any set , the value of in (3) can be obtained by solving the MILP:
[TABLE]
where is a constant greater than one. That is, , , where is the optimal solution of the MILP (12). The optimal stationary Markov policy is where such that and is the optimal solution of the MILP (12).
Proof:
From the monotone decrease of the sequence and Lemma 2, is the maximum fixed point satisfying (3). Hence, the equivalent form of can be written as MILP (12), where the constraints (12b)–(12d) guarantee that there exists such that the equality in (3) holds. ∎
Computational algorithm
As an adaptation of Algorithm 1, the following algorithm provides a way to compute the largest infinite-horizon $\epsilon$-PCIS within $\mathbb{Q}$.
The difference between Algorithms 1 and 3 is that the infinite-horizon invariance probability, instead of the $N$-step invariance probability, is computed by the MILP (12) for the current set. Furthermore, the updated set is a stochastic backward reachable set within the current set with respect to the infinite horizon and the probability level $\epsilon$. The following theorem establishes the convergence of the algorithm and shows that the resulting set is an infinite-horizon $\epsilon$-PCIS.
Theorem 4
For discrete state and control spaces, Algorithm 3 converges in a finite number of iterations and generates the largest infinite-horizon $\epsilon$-PCIS within the given set. Furthermore, at each iteration, the infinite-horizon invariance probability can be computed via the MILP (12).
Proof:
The finite-step convergence of Algorithm 3 follows from the finite cardinality of the given set. As in Theorem 1, the generated infinite-horizon $\epsilon$-PCIS is the largest one within that set. The MILP reformulation follows from Lemma 7. ∎
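The iteration analyzed in Theorem 4 can be sketched as follows for a finite model. The sketch repeatedly evaluates the infinite-horizon invariance probability on the current set and discards states that fall below the level `eps`, stopping when the set no longer changes. As a loudly stated assumption, the probability is approximated here by value iteration rather than by the MILP (12), so comparisons close to `eps` may be unreliable; the function names, the array layout `T[u, x, y]`, and the four-state example are hypothetical.

```python
import numpy as np

def invariance_probability(T, mask, tol=1e-12, max_iter=100_000):
    """Value-iteration stand-in for the MILP (12); approximates from above."""
    m = mask.astype(float)
    v = m.copy()
    for _ in range(max_iter):
        v_new = m * (T @ v).max(axis=0)  # best action per state, zero outside set
        if np.max(np.abs(v_new - v)) < tol:
            break
        v = v_new
    return v_new

def largest_infinite_horizon_pcis(T, Q, eps):
    """Shrink Q until every remaining state has invariance probability >= eps."""
    mask = np.asarray(Q, dtype=bool).copy()
    while True:
        p = invariance_probability(T, mask)
        new_mask = mask & (p >= eps)     # stochastic backward reachable set
        if np.array_equal(new_mask, mask):
            return mask, p
        mask = new_mask

# Hypothetical single-action chain: state 3 is unsafe and absorbing.
T = np.zeros((1, 4, 4))
T[0, 0, 0] = 1.0
T[0, 1, 0], T[0, 1, 2] = 0.5, 0.5
T[0, 2, 0], T[0, 2, 3] = 0.8, 0.2
T[0, 3, 3] = 1.0
pcis, p = largest_infinite_horizon_pcis(T, [True, True, True, False], eps=0.85)
```

The example shows why the iteration is needed: removing state 2 in the first pass lowers the invariance probability of state 1, which is then removed in the second pass. Since the set can only shrink and is finite, the loop terminates.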
Remark 14
When applying Algorithm 3 to a system with discrete spaces, the maximal iteration number is . An MILP is solved at each iteration to compute the infinite-horizon invariance probability. The number of real-valued decision variables is at most , the number of binary decision variables is at most , and the number of constraints is at most . In general, MILPs are NP-hard and can be solved by cutting-plane or branch-and-bound algorithms [32]. Advanced software packages have been developed to solve large MILPs efficiently [33, 34].
IV-B2 Continuous state and control spaces
If the state and control spaces are continuous, it is computationally intractable to compute the exact value of the infinite-horizon invariance probability. Based on Remark 12, this subsection provides another way to compute an infinite-horizon $\epsilon$-PCIS within a given set.
Unlike Algorithm 3, which iteratively computes stochastic backward reachable sets, the following algorithm generates an infinite-horizon $\epsilon$-PCIS by computing a stochastic backward reachable set from the RCIS contained in the given set.
The first step in Algorithm 4 is the computation of an RCIS within the given set, which is a well-studied topic in the literature [4, 5, 6]. Then, based on this RCIS, the stochastic backward reachable set
[TABLE]
is an infinite-horizon $\epsilon$-PCIS within the given set. In comparison with Algorithms 1–3, Algorithm 4 avoids iteration and needs only two steps.
Remark 15
Note that the set resulting from Algorithm 4 is in general not the largest infinite-horizon $\epsilon$-PCIS within the given set. It may be possible to obtain a larger infinite-horizon $\epsilon$-PCIS by reformulating the existence conditions in Theorem 3 and Corollary 3 in a recursive form and thereby turning Algorithm 4 into a recursive algorithm.
Remark 16
The complexity of Algorithm 4 depends on the computation of the RCIS [3, 4, 5, 6] and on the computation of the stochastic backward reachable set. The latter can be reformulated as a chance-constrained problem and then solved approximately. Some results on the computation of stochastic backward reachable sets have been reported in [35]. The first example in Section V shows how to compute the stochastic backward reachable set.
V Examples
In this section, two examples are provided to illustrate the effectiveness of the proposed theoretical results. The first one is concerned with comparison between PCIS and RCIS. Then we consider an application to motion planning of a mobile robot in a partitioned space with obstacles.
V-A Example 1: Comparison between PCIS and RCIS
Consider the following example from [36]:
[TABLE]
where $A=\begin{bmatrix}1.6 & 1.1\\ -0.7 & 1.2\end{bmatrix}$ and $B=\begin{bmatrix}1\\ 1\end{bmatrix}$. The control input is constrained by . We consider the disturbance to be either non-stochastic or stochastic when computing the RCIS and the PCIS, respectively. The region of interest is . We will compare the largest RCIS and PCIS within .
To derive an RCIS for this system, we assume the disturbance belongs to the compact set . By using the methods in [1, 6], we obtain the largest RCIS, which is the blue region shown in Fig. 1. The gray region is an infinite-horizon $\epsilon$-PCIS described at the end of this example.
When computing a finite-horizon PCIS, assume that the elements of the disturbance are i.i.d. Gaussian random variables with zero mean and variance . This system can be represented as a triple :
[TABLE]
where is the density function of the standard normal distribution and . In this case, since the Lipschitz constant in Assumption 3 is small, we ignore the approximation error in (9). We discretize the continuous spaces and implement Algorithm 2 to compute the $N$-step $\epsilon$-PCIS . First consider and . Fig. 2(a) shows the evolution of the set in Algorithm 2. The color indicates the corresponding $N$-step invariance probability and the -axis indicates the iteration index . The algorithm converges in steps. Fig. 2(b) shows , which corresponds to the $N$-step $\epsilon$-PCIS for and .
When computing an infinite-horizon PCIS, we choose the same bound on the disturbance as for the RCIS. The elements of the disturbance are truncated i.i.d. Gaussian random variables with zero mean and variance . Denote the largest RCIS computed above by , where the matrix and the vector have appropriate dimensions. As stated in Algorithm 4, the one-step stochastic backward reachable set from the RCIS associated with probability $\epsilon$ is an infinite-horizon $\epsilon$-PCIS with , i.e.,
[TABLE]
This set can be represented as
[TABLE]
where is the optimal solution of the chance constrained program
[TABLE]
This program can be solved numerically by using the methods in [37, 38]. The resulting infinite-horizon $\epsilon$-PCIS with is the gray region shown in Fig. 1. This region is clearly a superset of the RCIS in blue.
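As a rough illustration of the membership test behind this one-step stochastic backward reachable set, the sketch below estimates by Monte Carlo sampling whether some admissible input keeps the successor state inside a polytope with probability at least `eps`. It is not the chance-constrained program solved with the methods of [37, 38]. Only the matrices `A` and `B` come from the example; the additive-disturbance dynamics, the polytope `H`, `h` standing in for the RCIS, the input grid, the noise level `SIGMA`, and the use of untruncated Gaussian samples are all hypothetical choices.

```python
import numpy as np

A = np.array([[1.6, 1.1], [-0.7, 1.2]])   # from Example 1
B = np.array([1.0, 1.0])                  # from Example 1

# Hypothetical stand-ins (not the paper's values):
H = np.vstack([np.eye(2), -np.eye(2)])    # polytope {z : H z <= h}, a box here
h = 0.5 * np.ones(4)
U_GRID = np.linspace(-1.0, 1.0, 41)       # assumed input constraint, gridded
SIGMA = 0.05                              # assumed disturbance standard deviation
rng = np.random.default_rng(0)
W = SIGMA * rng.standard_normal((2000, 2))  # disturbance samples, reused for all x

def best_one_step_probability(x):
    """Estimate max over gridded u of P(A x + B u + w lies in the polytope)."""
    best = 0.0
    for u in U_GRID:
        nxt = A @ x + B * u + W                    # one successor per sample
        inside = np.all(nxt @ H.T <= h, axis=1)    # polytope membership
        best = max(best, inside.mean())
    return best

def in_backward_reachable_set(x, eps):
    """Sampled test of membership in the one-step backward reachable set."""
    return best_one_step_probability(np.asarray(x, dtype=float)) >= eps

origin_ok = in_backward_reachable_set([0.0, 0.0], eps=0.9)
far_ok = in_backward_reachable_set([5.0, 5.0], eps=0.9)
```

Evaluating `in_backward_reachable_set` over a grid of states would trace out an approximation of the gray region in Fig. 1 for whatever polytope and noise model are supplied. Sampling error and the input grid both bias the estimate, so the result should be read as indicative only.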
V-B Example 2: Motion planning
The motion planning example in [39] is adapted to seek an infinite-horizon PCIS within the workspace for a mobile robot. The state of the robot is abstracted by its cell coordinate, i.e., , and its four possible orientations . Due to the actuation noise and drifting, the robot motion is stochastic. Here, we restrict the action space to be , under which the possible transitions are shown in Fig. 3. Specifically, action “” means driving forward for unit. As illustrated in the figure, the probability for that is . The probability of drifting forward to the left or the right by unit is . Action “” can be similarly defined. Action “” means turning right and driving forward for unit, of which the probability is . The probability of driving forward for unit without turning right is and the probability of turning right for and driving forward for unit is . Similarly, we can define the action “”.
Consider the partitioned workspace shown in Fig. 4, where the shaded cells are occupied by obstacles and the red cell is an absorbing region, i.e., once the robot enters this region it stays there forever. We construct an MDP with states and actions. The transition relation and probabilities can be defined based on the above description. We compute the largest infinite-horizon $\epsilon$-PCIS with within the safe state space, i.e., the state space excluding the states associated with the obstacles.
By implementing Algorithm 3, we obtain the sets and the corresponding infinite-horizon invariance probabilities shown in Fig. 5, where each subfigure corresponds to one orientation in . The first row of Fig. 5 shows the results after the first iteration, where we can see that the infinite-horizon invariance probability at and is less than . Algorithm 3 converges in steps and generates the largest infinite-horizon $\epsilon$-PCIS with shown in Fig. 5(e)–5(h). This invariant set provides a region where the admissible actions can drive the robot without colliding with the obstacles with probability . By implementing the optimal policy obtained in Lemma 7, we run a state trajectory starting from as shown in Fig. 4. We can see that this trajectory is collision-free and finally ends at the absorbing region .
VI Conclusion
We investigated the extension of set invariance in a stochastic sense for control systems. We proposed finite- and infinite-horizon $\epsilon$-PCISs, and provided some fundamental properties. We designed iterative algorithms to compute the PCIS within a given set. For systems with discrete state and control spaces, finite- and infinite-horizon $\epsilon$-PCISs can be computed by solving an LP and an MILP at each iteration, respectively. We proved that the iterative algorithms are computationally tractable and terminate in a finite number of steps. For systems with continuous state and control spaces, we established the approximation of stochastic control systems and proved its convergence when computing finite-horizon $\epsilon$-PCISs. In addition, thanks to the sufficient conditions for the existence of an infinite-horizon $\epsilon$-PCIS, we can compute an infinite-horizon $\epsilon$-PCIS as the stochastic backward reachable set from the RCIS contained in it. Numerical examples were given to illustrate the theoretical results.
One future direction is to apply the PCISs to safety-critical control and stochastic predictive control. In particular, how to characterize stability using PCISs is an important problem to consider. Another interesting future extension of PCISs is to study reliability and mean-time-to-failure for general stochastic systems.
Acknowledgment
The authors are grateful to Prof. Alessandro Abate for helpful discussions and feedback and to anonymous reviewers for their constructive comments.
Appendix A. Proof of Lemma 3
Define the functions , , as
[TABLE]
As shown in [23], the function is lower-semianalytic for any . From Definitions 7.20 and 7.21 in [26], we have that the function is also analytically measurable and thus is universally measurable for any . According to the definition of universal measurability, the set for is universally measurable.
Recalling the definition of the stochastic backward reachable set , we have that
[TABLE]
where . Thus, the set is universally measurable for any .
Appendix B. Proof of Lemma 6
Before proving Lemma 6, we need two auxiliary lemmas. Lemma 8 shows that the value functions in (1) are Lipschitz continuous. It is adapted from Theorem 8 in [23]. Lemma 9 shows that the difference between the approximate density function and the original density function is bounded.
Lemma 8
Under Assumptions 1 and 3, for any , the value functions in (1) satisfy
[TABLE]
Proof:
Similar to Theorem 8 in [23]. ∎
Lemma 9
Under Assumption 3, for all and ,
[TABLE]
Proof:
If , it follows from Assumption 3 that
[TABLE]
And if , we first have
[TABLE]
Furthermore, we have
[TABLE]
This completes the proof. ∎
Proof of Lemma 6: First of all, let us prove the inequality (8). It is easy to check it for since . By induction, we assume that , . For any , , we define and . According to the discretization procedure of the control space, we can choose some such that . Then, we have that
[TABLE]
and
[TABLE]
Thus, we have
[TABLE]
For any , , it follows that
[TABLE]
which completes the proof of the inequality (8).
Appendix C. Proof of Theorem 3
Let be the control input such that (3) holds for any .
Only-if-part: Under Assumption 2, the fact that the set is an infinite-horizon -PCIS is equivalent to . Let . Under Assumption 2, exists for all . The set collects all the states for which the value of is maximal over the set . Extending Lemma 3 to infinite-horizon case, we have that the set is universally measurable. By Lemma 7.16 in [26], we have that for any , there exists a Borel-measurable set such that .
Next we will show that the set is an RCIS. It follows from Assumption 2 and Lemma 2 that ,
[TABLE]
where Eq. (14) follows from and Eq. (15) follows from that . Furthermore, since , and , the equality in Eq. (15) holds if and only if and thereby . Based on the recursion in (2), we have . Hence, the set is an RCIS.
Next let us prove that , Eq.((i)) holds. That is to prove that
[TABLE]
By Theorem 7 in [23], the control input is also optimal to the recursion (2). For all , we have , and ,
[TABLE]
Let . Note that . Then, , we can follow the induction rule to prove that
[TABLE]
which, by taking the limit, yields that (16) holds.
If-part: The proof for the existence of an RCIS is the same as that of the only if part. As shown above, the condition is equivalent to . We can use induction to prove that ,
[TABLE]
which further implies that . One sufficient condition to guarantee is (11), i.e., . The proof is completed.
Appendix D. Proof of Corollary 3
By Lemma 2 and Theorem 3, the necessary condition in Corollary 3 can be proven by showing that , there exists a such that
[TABLE]
where Eq. (17) follows from .
The sufficient condition in Corollary 3 can be proven by showing that , there exists a
[TABLE]
where Eq. (18) follows from . One sufficient condition to guarantee is . The proof is completed.
- [1] D. Bertsekas, “Infinite time reachability of state-space regions by using feedback control,” IEEE Transactions on Automatic Control, vol. 17, no. 5, pp. 604–613, 1972.
- [2] F. Blanchini, “Set invariance in control,” Automatica, vol. 35, no. 11, pp. 1747–1767, 1999.
- [3] F. Blanchini and S. Miani, Set-theoretic methods in control. Springer, 2007.
- [4] S. V. Raković, E. C. Kerrigan, K. I. Kouramas, and D. Q. Mayne, “Invariant approximations of the minimal robust positively invariant set,” IEEE Transactions on Automatic Control, vol. 50, no. 3, pp. 406–410, 2005.
- [5] M. Rungger and P. Tabuada, “Computing robust controlled invariant sets of linear systems,” IEEE Transactions on Automatic Control, vol. 62, no. 7, pp. 3665–3670, 2017.
- [6] E. Gilbert and K. T. Tan, “Linear systems with state and control constraints: the theory and practice of maximal admissible sets,” IEEE Transactions on Automatic Control, vol. 36, no. 9, pp. 1008–1020, 1991.
- [7] I. M. Mitchell, S. Kaynama, M. Chen, and M. Oishi, “Safety preserving control synthesis for sampled data systems,” Nonlinear Analysis: Hybrid Systems, vol. 10, pp. 63–82, 2013.
- [8] A. Mesbah, “Stochastic model predictive control: an overview and perspectives for future research,” IEEE Control Systems, vol. 36, no. 6, pp. 30–44, 2016.
