Computing the projected reachable set of switched affine systems: an application to systems biology

Francesca Parise; Maria Elena Valcher; John Lygeros

arXiv:1705.00400·cs.SY·May 2, 2017


TL;DR

This paper introduces a novel method for approximating the reachable set of switched affine systems, enabling better analysis of biochemical networks and optimizing external signals to reach desired states.

Contribution

It presents a new approach to approximate the reachable set of switched affine systems and demonstrates its application to systems biology for improved accuracy.

Findings

1. More accurate estimates of protein mean and variance reachable sets.

2. Method efficiently computes projections without full set enumeration.

3. Validated results with experimental data.



Equations

$$\dot{x}(t)=f(x(t),\sigma(t)),\quad t\geq 0,$$

$$\mathcal{R}_{T}(x_{0}):=\{x\in\mathbb{R}^{n}\mid\exists\,\sigma\in\mathcal{S}:x=x(T;x_{0},\sigma)\}.$$

$$v_{T}(c):=\max_{x\in\mathcal{R}_{T}(x_{0})}c^{\top}x,$$

$$\textup{H}_{T}(c)$$

$$\mathcal{H}_{T}(c)$$

$$\mathcal{R}_{T}^{out}(x_{0}):=\cap_{d=1}^{D}\mathcal{H}_{T}(c^{d})$$

$$x_{T}^{\star}(c^{d})$$

$$\mathcal{R}_{T}^{in}(x_{0}):=\textup{conv}(\{x_{T}^{\star}(c^{d}),\ d=1,2,\dots,D\})$$

$$y(t)=Lx(t),$$

$$\mathcal{R}_{T}^{y}(x_{0}):=\{y\in\mathbb{R}^{p}\mid\exists\,x\in\mathcal{R}_{T}(x_{0}):y=Lx\}.$$

$$y(t)=Lx(t)=\begin{bmatrix}l_{1}^{\top}x(t)\\ l_{2}^{\top}x(t)\end{bmatrix}\in\mathbb{R}^{2},$$

$$\mathcal{H}_{T}^{y}(\gamma^{d})$$

$$\mathcal{R}_{T}^{y,out}(x_{0}):=\cap_{d=1}^{D}\mathcal{H}_{T}^{y}(\gamma^{d})$$

$$\mathcal{R}_{T}^{y,in}(x_{0}):=\textup{conv}(\{y_{T}^{\star}(\gamma^{d}),\ d=1,2,\dots,D\})$$

$$\bar{x}\in\mathcal{H}_{T}(c^{d})\Leftrightarrow(c^{d})^{\top}\bar{x}\leq v_{T}(c^{d})\Leftrightarrow(l_{2}-\gamma^{d}l_{1})^{\top}\bar{x}\leq v_{T}(c^{d})\Leftrightarrow l_{2}^{\top}\bar{x}\leq\gamma^{d}l_{1}^{\top}\bar{x}+v_{T}(c^{d}).$$

$$\begin{aligned}v_{T}(c):=\max_{\sigma\in\mathcal{S}}\ & c^{\top}x(T)\\ \text{s.t.}\ & \dot{x}(t)=f(x(t),\sigma(t)),\quad\forall t\in[0,T],\\ & x(0)=x_{0}.\end{aligned}$$

$$\dot{x}(t)=Ax(t)+B\sigma(t),$$

$$\sigma_{r}^{\star}(t)$$

$$v_{T}(c)$$

$$x_{T}^{\star}(c)$$

$$\dot{x}(t)=A_{\sigma(t)}x(t)+b_{\sigma(t)},$$

$$\mathcal{S}_{I}^{K}:=\{\sigma\mid\sigma(t)=i_{k}\in\mathbb{N}[1,I],\ \forall t\in[t_{k},t_{k+1}),\ k\in\mathbb{N}[0,K]\}.$$

$$\begin{aligned}v_{T}(c)=\max_{x_{k},z_{i}^{k},\gamma_{i}^{k}}\ & c^{\top}x_{K+1}\\ \text{s.t.}\ & z_{i}^{k+1}\leq(\bar{A}_{i}^{k}x_{k}+\bar{b}_{i}^{k})+{\bf M}(1-\gamma_{i}^{k}),\\ & z_{i}^{k+1}\geq(\bar{A}_{i}^{k}x_{k}+\bar{b}_{i}^{k})-{\bf M}(1-\gamma_{i}^{k}),\\ & z_{i}^{k+1}\geq-{\bf M}\gamma_{i}^{k},\quad z_{i}^{k+1}\leq{\bf M}\gamma_{i}^{k},\\ & z_{i}^{k}\in\mathbb{R}^{n},\quad\forall k\in\mathbb{N}[1,K+1],\ \forall i\in\mathbb{N}[1,I],\\ & \gamma_{i}^{k}\in\{0,1\},\quad\forall k\in\mathbb{N}[0,K],\ \forall i\in\mathbb{N}[1,I],\\ & x_{k}=\textstyle\sum_{i=1}^{I}z_{i}^{k}\in\mathbb{R}^{n},\quad\forall k\in\mathbb{N}[1,K+1],\\ & \textstyle\sum_{i=1}^{I}\gamma_{i}^{k}=1,\quad\forall k\in\mathbb{N}[0,K],\\ & x_{0}\in\mathbb{R}^{n}\ \text{assigned}.\end{aligned}$$

$$\begin{aligned}v_{T}(c)=\max_{i_{k}\in\{1,\dots,I\}}\ & c^{\top}x_{K+1}\\ \text{s.t.}\ & x_{k+1}=\bar{A}_{i_{k}}^{k}x_{k}+\bar{b}_{i_{k}}^{k},\quad\forall k\in\mathbb{N}[0,K].\end{aligned}$$


Taxonomy

Topics: Gene Regulatory Network Analysis · Microbial Metabolic Engineering and Bioproduction · Bioinformatics and Genomic Networks

Full text


Francesca Parise, Maria Elena Valcher and John Lygeros

F. Parise is with the Laboratory for Information and Decision Systems, MIT, Cambridge, MA; M.E. Valcher is with the Department of Information Engineering, University of Padova, Italy; and J. Lygeros is with the Automatic Control Laboratory, ETH, Zurich, Switzerland. We thank M. Khammash for allowing us to perform the experiments in the CTSB Laboratory, ETH, and J. Ruess and A.M. Argeitis for their help in collecting the data of Fig. 5a). This work was supported by the SNSF grant number P2EZP2 168812.

Abstract

A fundamental question in systems biology is what combinations of mean and variance of the species present in a stochastic biochemical reaction network are attainable by perturbing the system with an external signal. To address this question, we show that the moments evolution in any generic network can be either approximated or, under suitable assumptions, computed exactly as the solution of a switched affine system. Motivated by this application, we propose a new method to approximate the reachable set of switched affine systems. A remarkable feature of our approach is that it allows one to easily compute projections of the reachable set for pairs of moments of interest, without requiring the computation of the full reachable set, which can be prohibitive for large networks. As a second contribution, we also show how to select the external signal in order to maximize the probability of reaching a target set. To illustrate the method, we study a renowned model of controlled gene expression and we derive estimates of the reachable set, for the protein mean and variance, that are more accurate than those available in the literature and consistent with experimental data.

I Introduction

One of the most impressive results achieved by synthetic biology in the last decade is the introduction of externally controllable modules in biochemical reaction networks. These are biochemical circuits that react to external signals, such as light pulses [1, 2, 3] or concentration signals [4, 5], allowing researchers to influence and possibly control the behavior of cells in vivo. To fully exploit these tools, it is important to first understand what range of behaviors they can exhibit under different choices of the external signal. For deterministic systems, this amounts to computing the set of states that can be reached by the controlled system trajectories starting from a known initial configuration [6, 7]. Since chemical species are often present in low copy numbers inside the cell, biochemical reaction networks can, however, be inherently stochastic [8]. In other words, if we apply the same signal to a population of identical cells, then every cell will have a different evolution (with different likelihood), requiring a probabilistic analysis.

If we interpret each cell as an independent realization, we can then study the effect of the external signal on a population of cells by characterizing how such a signal influences the moments of the underlying stochastic process. Specifically, in this paper we pose the following question:

“What combinations of moments of the stochastic process can be achieved by applying the external signal?”

This approach is motivated for example by biotechnology applications, where one would like to control the average behavior of the cells in large populations, instead of each cell individually. More on the theoretical side, this perspective can be useful to investigate fundamental questions on noise suppression in biochemical reaction networks, as in [9].

The cornerstone of our approach is the observation that while the number of copies in each cell is stochastic, the evolution of the moments is deterministic and can either be described or approximated by a switched affine system. Consequently, the above question can be reformulated as a reachability problem in the moment space. Computing the exact reachable set of a switched affine system is in general far from trivial, see [10, 11]. We thus start our analysis by proposing a new method to approximate the reachable set of a switched affine system. This is an extension of the hyperplane method for linear systems suggested in [12] and is of interest on its own. We then show how to apply the proposed approach to biochemical reaction networks by distinguishing two cases:

1. If all the reactions follow the laws of mass action kinetics and are at most of order one, the system of moments equations is switched affine. Consequently, for this class of networks, the above question can be solved by directly applying the newly suggested hyperplane method in the moments space;

2. For all other reaction networks the moments equations are in general non-closed (i.e., the evolution of mean and variance depends on higher order moments). We show however that the evolution of the probability of being in a given state can be described by an infinite dimensional switched system and that the desired moments can be computed as the output of such system. We then show: i) how to approximate such an infinite dimensional system with a finite dimensional one, by extending the finite state projection method [13] to controllable networks, ii) how to compute the reachable set of the finite dimensional system by applying the newly suggested hyperplane method in the probability space, and iii) how to recover an approximation of the original reachable set from the reachable set of the finite dimensional system.

In the last part of the paper, we change perspective and, instead of focusing on population properties, we consider the behaviour of a single cell (i.e., a single realization of the process), given a fixed initial condition or an initial probability distribution. Such a perspective has been commonly employed for the case without external signals, see e.g. [14, 15, 16, 13]. Our objective is to show how the external signal can be used to control single cell realizations by posing the following question:

“What external signal should be applied to maximize the probability that the cell trajectory reaches a prespecified subset of the state space at the end of the experiment?”

We show that such a problem can be addressed by using similar tools as those derived for the population analysis.

Comparison with the literature

A vast literature has been devoted to the analysis of the reachable set of piecewise-affine systems in the context of hybrid systems, see e.g. [10, 17, 18, 19, 20, 21, 22] among many. Our results are different because we exploit the specific structure of the problem at hand, that is, the fact that the switching signal is a control variable and that the dynamics in each mode are autonomous and affine. In other words, we consider switched affine systems for which the switching signal is the only control action. We also note that many different methods have been proposed in the literature to compute the reachable set of generic nonlinear systems. Among these there are level set methods [23], ellipsoidal methods [24] and sensitivity based methods [25]. For example, we became aware at the time of submission that the authors of [26] extended our previous works [27, 28] by suggesting the use of ellipsoidal methods. It is important to stress that the choice of a method that scales well with the system size is essential in our context, since biochemical networks are typically very large. Moreover, biologists are often interested in analyzing the behavior of only a few chemical species of the possibly many involved in the network. Consequently, one is usually interested in computing the projection of the reachable set (which is a high-dimensional object) on some low-dimensional space of interest. The hyperplane method that we propose stands out in this respect since, by using a method tailored for switched systems, it allows one to compute directly the projections of the reachable set, without requiring the computation of the full high-dimensional reachable set first. We thus avoid the curse of dimensionality that characterises all the previously mentioned methods. We note that part of the results of this paper appeared in our previous works [28, 29]. Specifically, in [28] we first suggested the use of the hyperplane method to compute the reachable set of biochemical networks with linear moment equations, which we then adapted in [29] to the case of switched affine moment equations. As better detailed in Section IV-A, the assumptions made both in [28] and [29] do not allow for bimolecular reactions, which are instead present in the vast majority of biochemical networks. The key contribution of this paper is the generalisation of our analysis to any biochemical network by using the approach described in point 2) above. The analysis of single cell realizations is also entirely new.

Outline

In Section II we present the hyperplane method. In Section III-A we review how to compute the hyperplane constants for linear systems, while in Section III-B we propose a new procedure for switched affine systems. In Section IV we introduce stochastic biochemical reaction networks and the controlled chemical master equation (CME). Additionally, we recap how to derive the moments equations from the CME (Section IV-A) and we derive an extension of the finite state projection method to controlled biochemical networks (Section IV-B). In Section V we show how to compute the reachable set of biochemical networks and in Section VI we derive the results on single cell realizations. Section VII illustrates our theoretical results on a gene expression case study.

Notation

Given $a<b\in\mathbb{N}$ , we set $\mathbb{N}[a,b]:=\{a,a+1,\ldots,b\}$ . Given a set $\mathcal{S}$ , the symbol $\partial\mathcal{S}$ denotes its boundary, $\textup{conv}(\mathcal{S})$ its convex hull and $|\mathcal{S}|$ its cardinality. For a vector $x\in\mathbb{R}^{n}$ , $x_{p}:=\left[x\right]_{p}$ denotes its $p$ th component, $|x|:=[|x_{1}|^{\top},\ldots,|x_{n}|^{\top}]^{\top}$ and $\|x\|_{\infty}:=\max_{p=1,2,\dots,n}|x_{p}|$ denotes the infinity norm. $\mathbbm{1}$ denotes a vector of all ones. Given two random variables $Z_{1},Z_{2}$ , we denote by $\mathbb{V}[Z_{1}]$ and $\mathbb{V}[Z_{1},Z_{2}]$ their variance and covariance, respectively.

II Reachability tools

II-A The reachable set and the hyperplane method

Consider the $n$ -dimensional nonlinear control system

$$\dot{x}(t)=f(x(t),\sigma(t)),\quad t\geq 0, \tag{1}$$

where $x$ is the $n$ -dimensional state and $\sigma$ the $m$ -dimensional input function. Set a final time $T>0$ and let ${\mathcal{S}}$ be the set of admissible input functions, which we assume to be a subset of the set of all measurable functions that map $[0,T]$ into $\mathbb{R}^{m}$ . We assume that the function $f:\mathbb{R}^{n}\times\mathbb{R}^{m}\rightarrow\mathbb{R}^{n}$ is such that, for every initial condition $x(0)\in\mathbb{R}^{n}$ and every input function $\sigma\in\mathcal{S}$ , the solution of (1), denoted by $x(t;x(0),\sigma),t\geq 0,$ is well defined and unique at every time $t\geq 0$ . The reachable set of system (1) at time $T$ is defined as the set of all states $x\in\mathbb{R}^{n}$ that can be reached at time $T$ , starting from $x(0)$ , by using an admissible input function $\sigma\in{\mathcal{S}}$ .

Definition 1 (Reachable set at time $T$ ).

The reachable set at time $T>0$ from $x(0)=x_{0}$ , for system (1) with admissible input set ${\mathcal{S}}$ , is

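$$\mathcal{R}_{T}(x_{0}):=\{x\in\mathbb{R}^{n}\mid\exists\,\sigma\in\mathcal{S}:x=x(T;x_{0},\sigma)\}. \tag{2}$$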

From now on we will assume that the set $\mathcal{R}_{T}(x_{0})$ is compact, since this will be the case for all the systems of interest analysed in the following. Computing such a reachable set for nonlinear systems is in general a very difficult task. For the case of linear systems with bounded inputs a method to construct an outer approximation of $\mathcal{R}_{T}(x_{0})$ as the intersection of a family of half-spaces that are tangent to its boundary (see Fig. 1) was proposed in [12].

We present here a generalisation of this method to system (1). For a given direction $c\in\mathbb{R}^{n}$ , let us define

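$$v_{T}(c):=\max_{x\in\mathcal{R}_{T}(x_{0})}c^{\top}x, \tag{3}$$

where, for simplicity, we omitted the dependence of $v_{T}(c)$ on the initial condition $x_{0}$ . Let

$$\textup{H}_{T}(c):=\{x\in\mathbb{R}^{n}\mid c^{\top}x=v_{T}(c)\} \tag{4}$$

be the corresponding hyperplane. By definition of the constant $v_{T}(c)$ , the associated half-space

$$\mathcal{H}_{T}(c):=\{x\in\mathbb{R}^{n}\mid c^{\top}x\leq v_{T}(c)\} \tag{5}$$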

is a superset of $\mathcal{R}_{T}(x_{0}).$ We note that if $\partial\mathcal{R}_{T}(x_{0})$ is smooth, then $\textup{H}_{T}(c)$ is the tangent plane to $\partial\mathcal{R}_{T}(x_{0})$ . By evaluating the above hyperplanes and half-spaces for various directions, one can construct an outer approximation of the reachable set, as illustrated in the next theorem. If the reachable set is convex then an inner approximation can also be derived.

Theorem 1 (The hyperplane method [12]).

Given system (1), an initial condition $x_{0}\in{\mathbb{R}}^{n}$ , a fixed time $T>0$ , an integer number $D\geq 2$ , and a set of $D$ directions $\mathcal{C}:=\{c^{1},\ldots,c^{D}\}$ , define the half-spaces $\mathcal{H}_{T}(c^{d})$ as in (5), for $d=1,\ldots,D$ .

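1. The set

$$\mathcal{R}_{T}^{out}(x_{0}):=\cap_{d=1}^{D}\mathcal{H}_{T}(c^{d})$$

is an outer approximation of the reachable set $\mathcal{R}_{T}(x_{0})$ at time $T$ starting from $x_{0}$ .

2. If the set $\mathcal{R}_{T}(x_{0})$ is convex and, for each $d=1,2,\dots,D,$ we select a (tangent) point

$$x_{T}^{\star}(c^{d})\in\mathcal{R}_{T}(x_{0})\cap\textup{H}_{T}(c^{d}), \tag{6}$$

then the set

$$\mathcal{R}_{T}^{in}(x_{0}):=\textup{conv}(\{x_{T}^{\star}(c^{d}),\ d=1,2,\dots,D\})$$

is an inner approximation of the reachable set $\mathcal{R}_{T}(x_{0})$ at time $T$ starting from $x_{0}$ . ∎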

Remark 1.

We note that by construction the outer approximation $\mathcal{R}_{T}^{out}(x_{0})$ is a convex object. Specifically, when the number of hyperplanes tends to infinity $\mathcal{R}_{T}^{out}(x_{0})$ coincides with the convex hull of $\mathcal{R}_{T}(x_{0})$ . Similarly, for any set $\mathcal{R}_{T}(x_{0})$ , the set $\mathcal{R}_{T}^{in}(x_{0})$ is an inner approximation of the convex hull of $\mathcal{R}_{T}(x_{0})$ . However, the inner approximation of the convex hull of a set is an inner approximation of the set itself only if such set is convex, as assumed in the previous theorem. ∎
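The construction in Theorem 1 can be sketched numerically. The snippet below is a minimal illustration, assuming the reachable set is replaced by a small finite point cloud (a hypothetical stand-in, not a set computed from any dynamics); the helper names `support_value` and `outer_contains` are introduced here for illustration only. For each direction $c^{d}$ it evaluates $v_{T}(c^{d})$ as the maximum of $c^{\top}x$ over the points, stores a maximiser as the tangent point, and tests membership in the intersection of half-spaces.

```python
import numpy as np

def support_value(points, c):
    """Return v(c) = max over rows x of c^T x, together with one maximiser."""
    scores = points @ c
    best = int(np.argmax(scores))
    return float(scores[best]), points[best]

def outer_contains(x, directions, values, tol=1e-9):
    """True if x lies in every half-space {x : c^T x <= v(c)}."""
    return bool(np.all(directions @ x <= values + tol))

# Hypothetical stand-in for the reachable set: corners and centre of the unit square.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])

# D = 8 equally spaced unit directions c^d.
angles = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
directions = np.column_stack([np.cos(angles), np.sin(angles)])

values = np.empty(len(directions))
tangent_points = []
for d, c in enumerate(directions):
    values[d], x_star = support_value(points, c)
    tangent_points.append(x_star)

# Distinct tangent points; their convex hull is the inner approximation
# (of the convex hull of the point cloud, in the sense of Remark 1).
tangent_points = np.unique(np.array(tangent_points), axis=0)

print(outer_contains(np.array([0.5, 0.5]), directions, values))
print(outer_contains(np.array([2.0, 2.0]), directions, values))
```

In this toy case every point of the cloud passes the outer test, while a point far from the square fails it. The tangent points returned are corners of the square, consistent with the remark that the inner approximation describes the convex hull.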

The main advantage of this method is that hyperplanes are very easy objects to handle and visualise. The main disadvantage is that the higher the dimension $n$ of the state space, the higher in general is the number of directions $D$ required to obtain a good characterisation of the reachable set. In the next subsection we show how to avoid this curse of dimensionality, in cases when only the projection of the reachable set on a plane of interest is needed.

II-B The output reachable set

Let the output of system (1) be

$$y(t)=Lx(t), \tag{7}$$

for $L\in\mathbb{R}^{p\times n}$ , and the output reachable set be the set of all output values that can be generated at time $T$ from $x(0)=x_{0}$ , by using an admissible input function $\sigma\in{\mathcal{S}}$ .

Definition 2 (Output reachable set at time $T$ ).

The output reachable set $\mathcal{R}_{T}^{y}(x_{0})$ from $x_{0}$ at time $T>0$ , for system (1) with admissible input set ${\mathcal{S}}$ and output as in (7), is

$$\mathcal{R}_{T}^{y}(x_{0}):=\{y\in\mathbb{R}^{p}\mid\exists\,x\in\mathcal{R}_{T}(x_{0}):y=Lx\}.$$

For simplicity, in the following we restrict our discussion to the case of a two-dimensional output vector, that is

$$y(t)=Lx(t)=\begin{bmatrix}l_{1}^{\top}x(t)\\ l_{2}^{\top}x(t)\end{bmatrix}\in\mathbb{R}^{2}, \tag{8}$$

for some $l_{1},l_{2}\in\mathbb{R}^{n}$ ; the generalization to higher dimensions is, however, immediate. Note that, for any pair of indices $i,j\in\{1,\ldots,n\},i\neq j$ , one can recover the projection of the reachable set $\mathcal{R}_{T}(x_{0})$ onto an $(x_{i},x_{j})$ -plane of interest by imposing $l_{1}=e_{i}$ and $l_{2}=e_{j}$ , where $e_{i}$ denotes the $i$ th canonical basis vector. The two-dimensional output vector case can therefore be applied to study the relation between the mean behavior of two species or between mean and variance of a single species in large biochemical networks.

In the following theorem we show that inner and outer approximations of $\mathcal{R}_{T}^{y}(x_{0})$ can be efficiently computed by selecting only hyperplanes that are perpendicular to the plane of interest.

Theorem 2 (Projection on a two dimensional subspace).

Consider system (1), with output (8) and initial condition $x_{0}\in{\mathbb{R}}^{n}$ . Let $T>0$ be a fixed time, $D\geq 2$ an integer number and choose $D$ values $\gamma^{d}\in\mathbb{R}$ . Set $c^{d}:=l_{2}-\gamma^{d}l_{1}\in\mathbb{R}^{n}$ and

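$$\mathcal{H}_{T}^{y}(\gamma^{d}):=\{w\in\mathbb{R}^{2}\mid w_{2}\leq\gamma^{d}w_{1}+v_{T}(c^{d})\},$$

where $v_{T}(c^{d})$ is as in (3). Set $y_{T}^{\star}(\gamma^{d}):=Lx_{T}^{\star}(c^{d}),$ where $x_{T}^{\star}(c^{d})$ is defined as in (6). Then the set

$$\mathcal{R}_{T}^{y,out}(x_{0}):=\cap_{d=1}^{D}\mathcal{H}_{T}^{y}(\gamma^{d})$$

is an outer approximation of $\mathcal{R}_{T}^{y}(x_{0})$ . Moreover, if $\mathcal{R}_{T}(x_{0})$ is convex then the set

$$\mathcal{R}_{T}^{y,in}(x_{0}):=\textup{conv}(\{y_{T}^{\star}(\gamma^{d}),\ d=1,2,\dots,D\})$$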

is an inner approximation of $\mathcal{R}_{T}^{y}(x_{0})$ . ∎

Proof.

By definition, for any $\bar{y}\in\mathcal{R}_{T}^{y}(x_{0})$ there exists an $\bar{x}\in\mathcal{R}_{T}(x_{0})$ such that $\bar{y}^{\top}=[l_{1}^{\top}\bar{x},l_{2}^{\top}\bar{x}]$ . By Theorem 1, for any direction $c^{d}$ it holds that $\mathcal{R}_{T}(x_{0})\subset\mathcal{H}_{T}(c^{d})$ . Consequently, $\bar{x}\in\mathcal{R}_{T}(x_{0})$ implies $\bar{x}\in\mathcal{H}_{T}(c^{d})$ . By substituting the definition of $c^{d}$ given in the statement we get

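$$\begin{aligned}\bar{x}\in\mathcal{H}_{T}(c^{d})&\Leftrightarrow(c^{d})^{\top}\bar{x}\leq v_{T}(c^{d})\\ &\Leftrightarrow(l_{2}-\gamma^{d}l_{1})^{\top}\bar{x}\leq v_{T}(c^{d})\\ &\Leftrightarrow l_{2}^{\top}\bar{x}\leq\gamma^{d}l_{1}^{\top}\bar{x}+v_{T}(c^{d}).\end{aligned}$$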

The last inequality implies $\bar{y}^{\top}=[l_{1}^{\top}\bar{x},l_{2}^{\top}\bar{x}]\in\mathcal{H}^{y}_{T}(\gamma^{d})$ . Consequently, $\mathcal{R}_{T}^{y}(x_{0})\subseteq\mathcal{H}^{y}_{T}(\gamma^{d})$ for any $\gamma^{d}$ and therefore $\mathcal{R}_{T}^{y}(x_{0})\subseteq\mathcal{R}^{y,out}_{T}(x_{0})$ . If $\mathcal{R}_{T}(x_{0})$ is convex, then $\mathcal{R}_{T}^{y}(x_{0})$ is convex as well. The points $y^{\star}_{T}(\gamma^{d})$ belong to $\mathcal{R}_{T}^{y}(x_{0})$ by construction. Consequently, by convexity, it must hold that $\mathcal{R}^{y,in}_{T}(x_{0})\subseteq\mathcal{R}_{T}^{y}(x_{0})$ . ∎
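The projection argument can be illustrated with a short numerical sketch. It assumes, purely for illustration, that the reachable set is a random cloud of states in $\mathbb{R}^{5}$ (a hypothetical stand-in), and it uses zero-based indices for the plane of interest. For each slope $\gamma^{d}$ it forms $c^{d}=l_{2}-\gamma^{d}l_{1}$, evaluates $v_{T}(c^{d})$ over the cloud, and checks the half-plane inequality $w_{2}\leq\gamma^{d}w_{1}+v_{T}(c^{d})$ directly in the output plane, so no object of dimension $n$ is ever constructed.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
states = rng.normal(size=(200, n))    # hypothetical stand-in for the reachable set

i, j = 0, 3                            # plane of interest (zero-based indices)
l1 = np.eye(n)[i]
l2 = np.eye(n)[j]
L = np.vstack([l1, l2])
outputs = states @ L.T                 # each row is y = L x

gammas = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])   # slopes gamma^d
offsets = np.empty_like(gammas)                  # v_T(c^d) for c^d = l2 - gamma^d l1
tangent_outputs = np.empty((len(gammas), 2))     # y*(gamma^d) = L x*(c^d)
for d, gamma in enumerate(gammas):
    c = l2 - gamma * l1
    scores = states @ c
    best = int(np.argmax(scores))
    offsets[d] = scores[best]
    tangent_outputs[d] = L @ states[best]

def in_outer(y, tol=1e-9):
    """True if y satisfies y_2 <= gamma^d * y_1 + v_T(c^d) for every d."""
    return bool(np.all(y[1] <= gammas * y[0] + offsets + tol))

print(all(in_outer(y) for y in outputs))
```

Each tangent output lies on the boundary line of its own half-plane, since $(c^{d})^{\top}x^{\star}=v_{T}(c^{d})$ rewrites as $y_{2}^{\star}=\gamma^{d}y_{1}^{\star}+v_{T}(c^{d})$.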

III Computing the tangent hyperplanes

The success of the hyperplane method hinges on the possibility of efficiently evaluating, for any given direction $c$ , the constant $v_{T}(c)$ in (3). Note that this problem is equivalent to the following finite time optimal control problem

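$$\begin{aligned}v_{T}(c):=\max_{\sigma\in\mathcal{S}}\ & c^{\top}x(T)\\ \text{s.t.}\ & \dot{x}(t)=f(x(t),\sigma(t)),\quad\forall t\in[0,T],\\ & x(0)=x_{0}.\end{aligned} \tag{11}$$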

In the rest of this section, we aim at solving (11). To this end, we start by recalling the linear case, for which the hyperplane method was originally derived in [12].

III-A Linear systems with bounded input

The hyperplane method was originally proposed for linear systems with bounded inputs

$$\dot{x}(t)=Ax(t)+B\sigma(t), \tag{12}$$

where $x(t)\in\mathbb{R}^{n}$ , $A\in\mathbb{R}^{n\times n}$ , $B\in\mathbb{R}^{n\times m}$ and $\sigma(t)\in\mathbb{R}^{m}$ . Since biological signals are non-negative and bounded, we here make the following assumption on the input set $\mathcal{S}$ .

Assumption 1.

The input function $\sigma$ belongs to the admissible set $\mathcal{S}_{\Sigma}:=\{\sigma\mid\sigma(t)\in\Sigma,\forall t\in[0,T]\},$ where $\Sigma=\Sigma_{1}\times\ldots\times\Sigma_{m}$ . Moreover, there exist $\bar{\sigma}_{r}>0,r\in\mathbb{N}[1,m],$ such that either (a) every set $\Sigma_{r}$ is the interval $\Sigma^{c}_{r}:=[0,\bar{\sigma}_{r}]$ (continuous and bounded input set), or (b) for every set $\Sigma_{r}$ there exists $2\leq q_{r}<+\infty$ such that $\Sigma^{d}_{r}:=\left\{0=\sigma_{r}^{1}<\sigma_{r}^{2}<\ldots<\sigma_{r}^{q_{r}}=\bar{\sigma}_{r}\right\}\subset\mathbb{R}_{\geq 0}$ (finite input set). We set $\Sigma^{c}:=\Sigma^{c}_{1}\times\ldots\times\Sigma^{c}_{m}$ , $\Sigma^{d}:=\Sigma^{d}_{1}\times\ldots\times\Sigma^{d}_{m}$ , and denote by $\mathcal{S}_{\Sigma^{c}}$ and $\mathcal{S}_{\Sigma^{d}}$ the corresponding admissible sets.

In the case of a continuous and bounded input set, i.e. under Assumption 1-(a), it was shown in [12] that it is possible to solve the control problem in (11) in closed form by using the Maximum Principle [30].

Proposition 1 (Tangent hyperplanes for linear systems with bounded and continuous inputs).

Consider system (12) and suppose that Assumption 1-(a) holds. Define the following admissible input function, expressed component-wise for every $r$ th entry, $r=1,\ldots,m$ , as

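$$\sigma_{r}^{\star}(t):=\begin{cases}\bar{\sigma}_{r} & \text{if } c^{\top}e^{A(T-t)}b_{r}>0,\\ 0 & \text{otherwise},\end{cases} \tag{13}$$

where $b_{r}$ denotes the $r$ th column of $B$ . Then

$$v_{T}(c)=c^{\top}e^{AT}x_{0}+\sum_{r=1}^{m}\bar{\sigma}_{r}\int_{0}^{T}\left[c^{\top}e^{A(T-t)}b_{r}\right]_{+}dt, \tag{14}$$

where $[g(t)]_{+}$ denotes the positive part of the function, namely $[g(t)]_{+}=g(t)$ when $g(t)>0$ and zero otherwise. Suppose additionally that the pair $(A,b_{r})$ is reachable, for every $r\in\mathbb{N}[1,m].$ Then there exists no interval $[\tau_{1},\tau_{2}]$ , with $0\leq\tau_{1}<\tau_{2}\leq T$ , such that $c^{\top}e^{A(T-t)}b_{r}=0$ for every $t\in[\tau_{1},\tau_{2}]$ . Consequently, a tangent point can be obtained as

$$x_{T}^{\star}(c)=e^{AT}x_{0}+\int_{0}^{T}e^{A(T-t)}B\sigma^{\star}(t)\,dt. \tag{15}$$

$\square$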

The proof follows the same lines as [12, Lemma 2.1 and Theorem 2.1] and is omitted for the sake of brevity.
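As a numerical illustration of Proposition 1, the sketch below evaluates the hyperplane constant for system (12) under Assumption 1-(a). It assumes the closed form that follows from the variation of constants formula, namely $v_{T}(c)=c^{\top}e^{AT}x_{0}+\sum_{r}\bar{\sigma}_{r}\int_{0}^{T}[c^{\top}e^{A(T-t)}b_{r}]_{+}dt$, obtained by switching each input to $\bar{\sigma}_{r}$ whenever $c^{\top}e^{A(T-t)}b_{r}$ is positive. The helper `expm_taylor` is a simple stand-in for a library matrix exponential, and `support_value_linear` is a name introduced here; the integral is approximated with the trapezoidal rule.

```python
import numpy as np

def expm_taylor(A, terms=20, squarings=8):
    """Matrix exponential via scaling and squaring with a truncated Taylor series.
    A simple stand-in for a library routine, adequate for small well-scaled A."""
    scaled = np.asarray(A, dtype=float) / (2 ** squarings)
    result = np.eye(scaled.shape[0])
    term = np.eye(scaled.shape[0])
    for k in range(1, terms + 1):
        term = term @ scaled / k
        result = result + term
    for _ in range(squarings):
        result = result @ result
    return result

def support_value_linear(A, B, sigma_bar, x0, c, T, steps=1000):
    """Approximate v_T(c) for x' = A x + B sigma with sigma_r(t) in [0, sigma_bar_r]."""
    ts = np.linspace(0.0, T, steps + 1)
    # g[k, r] = c^T exp(A (T - t_k)) b_r, the switching function of input r
    g = np.array([c @ expm_taylor(A * (T - t)) @ B for t in ts])
    g_plus = np.maximum(g, 0.0)
    # trapezoidal rule for the integral of the positive part, one value per input
    integral = (T / steps) * (g_plus[:-1] + g_plus[1:]).sum(axis=0) / 2.0
    return float(c @ expm_taylor(A * T) @ x0 + sigma_bar @ integral)

# Scalar example: x' = -x + sigma, sigma in [0, 1], x(0) = 0, T = 1.
A = np.array([[-1.0]])
B = np.array([[1.0]])
sigma_bar = np.array([1.0])
x0 = np.array([0.0])
v_plus = support_value_linear(A, B, sigma_bar, x0, np.array([1.0]), T=1.0)
v_minus = support_value_linear(A, B, sigma_bar, x0, np.array([-1.0]), T=1.0)
print(v_plus, v_minus)
```

For this scalar example the integral can be computed by hand: in direction $c=1$ the optimal input is always on, giving $1-e^{-1}$, and in direction $c=-1$ it is always off, giving $0$. The reachable set is therefore the interval $[0,1-e^{-1}]$.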

By using the explicit characterisation given in Proposition 1 together with Theorems 1 and 2, one can efficiently construct both an inner and an outer approximation of the (output) reachable set for linear systems with continuous and bounded input set $\Sigma^{c}$ , as summarised in the next corollary. Therein we also show how the same result can be extended to finite input sets $\Sigma^{d}$ .

Corollary 1 (The hyperplane method for linear systems).

Consider system (12) and suppose that either Assumption 1-(a) or Assumption 1-(b) holds. Let $v_{T}(c^{d})$ and $x^{\star}_{T}(c^{d})$ be computed as in (14) and (15). Then $\mathcal{R}^{out}_{T}(x_{0})$ and $\mathcal{R}^{in}_{T}(x_{0})$ ( $\mathcal{R}^{y,out}_{T}(x_{0})$ and $\mathcal{R}^{y,in}_{T}(x_{0})$ , resp.) as defined in Theorem 1 (Theorem 2, resp.) are outer and inner approximations of $\mathcal{R}_{T}(x_{0})$ (of $\mathcal{R}^{y}_{T}(x_{0})$ , resp.).

Proof.

In the case of continuous and bounded input, that is, under Assumption 1-(a), the reachable set $\mathcal{R}_{T}(x_{0})$ is convex and the statement is a trivial consequence of Theorems 1 and 2 and Proposition 1. We here show that the same result holds also under Assumption 1-(b). The proof of this second part follows from the fact that the reachable set $\mathcal{R}_{T}^{c}(x_{0})$ , obtained by using the continuous input set $\Sigma^{c}$ , and the reachable set $\mathcal{R}^{d}_{T}(x_{0})$ , obtained by using the discrete input set $\Sigma^{d}$ , coincide. To prove this, let $\mathcal{R}^{bb}_{T}(x_{0})$ be the reachable set obtained using $\Sigma^{bb}_{r}:=\{0,\bar{\sigma}_{r}\}$ for any $r$ , that is, the set of vertices of $\Sigma^{c}$ . Consider now an arbitrary point $\bar{x}$ in the compact set $\mathcal{R}_{T}^{c}(x_{0})$ . By definition there exists an admissible input function taking values in $\Sigma^{c}$ that steers $x_{0}$ to $\bar{x}$ in time $T$ . Since $\Sigma^{c}$ is a convex polyhedron, by [31, Theorem 8.1.2], system (12) with input set $\Sigma^{c}$ has the bang-bang with bound on the number of switchings (BBNS) property. That is, for each $\bar{x}\in\mathcal{R}_{T}^{c}(x_{0})$ there exists a bang-bang input function taking values in $\Sigma^{bb}$ that reaches $\bar{x}$ in the same time $T$ with a finite number of discontinuities. Thus $\bar{x}\in\mathcal{R}_{T}^{bb}(x_{0})$ . Since this is true for any $\bar{x}\in\mathcal{R}_{T}^{c}(x_{0})$ , we get $\mathcal{R}_{T}^{c}(x_{0})\subseteq\mathcal{R}^{bb}_{T}(x_{0})$ . From $\Sigma^{bb}\subseteq\Sigma^{d}\subseteq\Sigma^{c}$ we get $\mathcal{R}^{bb}_{T}(x_{0})\subseteq\mathcal{R}^{d}_{T}(x_{0})\subseteq\mathcal{R}_{T}^{c}(x_{0})$ , concluding the proof. ∎

III-B Switched affine systems

In this section, we propose an extension of the hyperplane method to the case of a switched affine system of the form

$$\dot{x}(t)=A_{\sigma(t)}x(t)+b_{\sigma(t)}, \tag{16}$$

where the switching signal $\sigma(t)\in\mathbb{N}[1,I]$ is the input function, $I\geq 2$ is the number of modes, $x(t)\in\mathbb{R}^{n}$ and $A_{i}\in\mathbb{R}^{n\times n},b_{i}\in\mathbb{R}^{n}$ for all $i\in\mathbb{N}[1,I]$ . We make the following assumption.

Assumption 2.

The switching signal $\sigma(t)$ switches $K$ times within the finite set $\mathbb{N}[1,I]$ at fixed switching instants $0=t_{0}<\ldots<t_{K+1}=T$ , that is, $\sigma\in{\mathcal{S}}_{I}^{K}$ , where

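$$\mathcal{S}_{I}^{K}:=\{\sigma\mid\sigma(t)=i_{k}\in\mathbb{N}[1,I],\ \forall t\in[t_{k},t_{k+1}),\ k\in\mathbb{N}[0,K]\}.$$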

For every $k\in\mathbb{N}[0,K]$ and $i\in\mathbb{N}[1,I]$ we define $\bar{A}_{i}^{k}:=e^{A_{i}{(t_{k+1}-t_{k})}}$ and $\bar{b}_{i}^{k}:=[\int_{0}^{(t_{k+1}-t_{k})}e^{A_{i}\tau}d\tau]b_{i}$ . Moreover, we set $x_{k}:=x(t_{k})$ . Note that under Assumption 2 the reachable set of system (16) consists of a finite number of points that can be computed by solving the state equations for each possible switching signal. Since the cardinality of the set ${\mathcal{S}}_{I}^{K}$ grows exponentially with $K$ , this approach is however computationally infeasible even for small systems. We here show that, on the other hand, the hyperplane constants defined in (11) can be computed by solving a mixed integer linear program (MILP), thus allowing us to exploit the sophisticated software that has been developed in recent years to solve large MILPs.
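The enumeration described above can be sketched for very small instances. The snippet is illustrative only: it computes $\bar{A}_{i}^{k}$ and $\bar{b}_{i}^{k}$ through the exponential of the augmented matrix $\begin{bmatrix}A_{i}&b_{i}\\0&0\end{bmatrix}$ (a standard identity, used here as a design choice rather than taken from the text), then propagates the state along all $I^{K+1}$ mode sequences. The function names and the two-mode scalar example are assumptions made for this sketch, and mode indices are zero-based.

```python
import itertools
import numpy as np

def expm_taylor(A, terms=20, squarings=8):
    """Matrix exponential via scaling and squaring (simple stand-in for a library call)."""
    scaled = np.asarray(A, dtype=float) / (2 ** squarings)
    result = np.eye(scaled.shape[0])
    term = np.eye(scaled.shape[0])
    for k in range(1, terms + 1):
        term = term @ scaled / k
        result = result + term
    for _ in range(squarings):
        result = result @ result
    return result

def discretise(A, b, dt):
    """Return (A_bar, b_bar) with A_bar = exp(A dt), b_bar = (int_0^dt exp(A tau) dtau) b."""
    n = A.shape[0]
    aug = np.zeros((n + 1, n + 1))
    aug[:n, :n] = A
    aug[:n, n] = b
    E = expm_taylor(aug * dt)
    return E[:n, :n], E[:n, n]

def reachable_points(modes, x0, switch_times):
    """Enumerate x(T) for every mode sequence; modes is a list of (A_i, b_i)."""
    durations = np.diff(switch_times)
    maps = [[discretise(A, b, dt) for (A, b) in modes] for dt in durations]
    points = {}
    for seq in itertools.product(range(len(modes)), repeat=len(durations)):
        x = np.array(x0, dtype=float)
        for k, i in enumerate(seq):
            A_bar, b_bar = maps[k][i]
            x = A_bar @ x + b_bar
        points[seq] = x
    return points

# Hypothetical example: mode 0 is x' = -x + 1, mode 1 is x' = -x; K = 1 switch at t = 0.5.
modes = [(np.array([[-1.0]]), np.array([1.0])), (np.array([[-1.0]]), np.array([0.0]))]
points = reachable_points(modes, x0=[0.0], switch_times=[0.0, 0.5, 1.0])
c = np.array([1.0])
v = max(float(c @ x) for x in points.values())
print(len(points), v)
```

With two modes and two intervals there are only four sequences, but the count is $I^{K+1}$ in general, which is the exponential growth that motivates the MILP reformulation.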

Proposition 2 (Tangent hyperplanes for switched affine systems).

Consider system (16) and suppose that Assumption 2 holds. Take a vector ${\bf M}\in\mathbb{R}^{n}$ such that ${\bf M}\geq|x_{k}|$ component-wise for all $k\in\mathbb{N}[0,K]$ . Then

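$$\begin{aligned}v_{T}(c)=\max_{x_{k},z_{i}^{k},\gamma_{i}^{k}}\ & c^{\top}x_{K+1}\\ \text{s.t.}\ & z_{i}^{k+1}\leq(\bar{A}_{i}^{k}x_{k}+\bar{b}_{i}^{k})+{\bf M}(1-\gamma_{i}^{k}),\\ & z_{i}^{k+1}\geq(\bar{A}_{i}^{k}x_{k}+\bar{b}_{i}^{k})-{\bf M}(1-\gamma_{i}^{k}),\\ & z_{i}^{k+1}\geq-{\bf M}\gamma_{i}^{k},\quad z_{i}^{k+1}\leq{\bf M}\gamma_{i}^{k},\\ & z_{i}^{k}\in\mathbb{R}^{n},\quad\forall k\in\mathbb{N}[1,K+1],\ \forall i\in\mathbb{N}[1,I],\\ & \gamma_{i}^{k}\in\{0,1\},\quad\forall k\in\mathbb{N}[0,K],\ \forall i\in\mathbb{N}[1,I],\\ & x_{k}=\textstyle\sum_{i=1}^{I}z_{i}^{k}\in\mathbb{R}^{n},\quad\forall k\in\mathbb{N}[1,K+1],\\ & \textstyle\sum_{i=1}^{I}\gamma_{i}^{k}=1,\quad\forall k\in\mathbb{N}[0,K],\\ & x_{0}\in\mathbb{R}^{n}\ \text{assigned}.\end{aligned} \tag{17}$$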

Proof.

To prove the statement we follow a procedure similar to the one in [32, Section IV.A]. Under Assumption 2 the switching signal $\sigma(t)$ is such that $\sigma(t)=i_{k},\forall t\in[t_{k},t_{k+1}),\forall k\in\mathbb{N}[0,K]$ . Therefore, the finite time optimal control problem in (11) can be rewritten as

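$$\begin{aligned}v_{T}(c)=\max_{i_{k}\in\{1,\dots,I\}}\ & c^{\top}x_{K+1}\\ \text{s.t.}\ & x_{k+1}=\bar{A}_{i_{k}}^{k}x_{k}+\bar{b}_{i_{k}}^{k},\quad\forall k\in\mathbb{N}[0,K].\end{aligned} \tag{18}$$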

Let us introduce the binary variables $\gamma_{i}^{k}\in\{0,1\}$ defined so that, for each $i\in\mathbb{N}[1,I]$ and $k\in\mathbb{N}[0,K]$ , $\gamma_{i}^{k}=1$ if and only if the system is in mode $i$ in the time interval $[t_{k},t_{k+1})$ . Moreover, let us introduce a copy of the state vector for each possible update of the system in each possible mode: $z_{i}^{k+1}=(\bar{A}_{i}^{k}x_{k}+\bar{b}_{i}^{k})\gamma_{i}^{k}$ . Then (18) is equivalent to the following optimisation problem

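$$\begin{aligned}v_{T}(c)=\max_{x_{k},z_{i}^{k},\gamma_{i}^{k}}\ & c^{\top}x_{K+1}\\ \text{s.t.}\ & z_{i}^{k+1}=(\bar{A}_{i}^{k}x_{k}+\bar{b}_{i}^{k})\gamma_{i}^{k},\quad\forall k\in\mathbb{N}[0,K],\ \forall i\in\mathbb{N}[1,I],\\ & \gamma_{i}^{k}\in\{0,1\},\quad\forall k\in\mathbb{N}[0,K],\ \forall i\in\mathbb{N}[1,I],\\ & x_{k}=\textstyle\sum_{i=1}^{I}z_{i}^{k}\in\mathbb{R}^{n},\quad\forall k\in\mathbb{N}[1,K+1],\\ & \textstyle\sum_{i=1}^{I}\gamma_{i}^{k}=1,\quad\forall k\in\mathbb{N}[0,K],\\ & x_{0}\in\mathbb{R}^{n}\ \text{assigned}.\end{aligned} \tag{19}$$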

Finally, by using the big-M method in [33, Eq. (5b)], the first equality constraint in the optimization problem (19) can be equivalently replaced by

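$$\begin{aligned}& z_{i}^{k+1}\leq(\bar{A}_{i}^{k}x_{k}+\bar{b}_{i}^{k})+{\bf M}(1-\gamma_{i}^{k}),\\ & z_{i}^{k+1}\geq(\bar{A}_{i}^{k}x_{k}+\bar{b}_{i}^{k})-{\bf M}(1-\gamma_{i}^{k}),\\ & z_{i}^{k+1}\geq-{\bf M}\gamma_{i}^{k},\quad z_{i}^{k+1}\leq{\bf M}\gamma_{i}^{k},\end{aligned}$$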

leading to the equivalent reformulation given in (17). ∎
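The effect of the big-M step can be checked numerically. The sketch below assumes the four inequalities $z\leq(\bar{A}x+\bar{b})+{\bf M}(1-\gamma)$, $z\geq(\bar{A}x+\bar{b})-{\bf M}(1-\gamma)$, $z\geq-{\bf M}\gamma$ and $z\leq{\bf M}\gamma$, and it assumes that ${\bf M}$ bounds $|\bar{A}x+\bar{b}|$ component-wise. It computes the interval the inequalities leave for $z$ when the binary variable is fixed. The matrices, the state and the helper name `big_m_interval` are hypothetical values chosen for illustration.

```python
import numpy as np

def big_m_interval(target, gamma, M):
    """Component-wise bounds on z implied by the four big-M inequalities,
    where target = A_bar x + b_bar and gamma is 0 or 1."""
    lower = np.maximum(target - M * (1 - gamma), -M * gamma)
    upper = np.minimum(target + M * (1 - gamma), M * gamma)
    return lower, upper

# Hypothetical data for one mode and one time step.
A_bar = np.array([[0.5, 0.1], [0.0, 0.8]])
b_bar = np.array([1.0, -2.0])
x_k = np.array([3.0, -1.0])
target = A_bar @ x_k + b_bar          # candidate next state in this mode
M = np.array([10.0, 10.0])            # assumed to satisfy M >= |target| component-wise

lo_on, hi_on = big_m_interval(target, 1, M)    # mode selected
lo_off, hi_off = big_m_interval(target, 0, M)  # mode not selected
print(lo_on, hi_on)
print(lo_off, hi_off)
```

When $\gamma=1$ the lower and upper bounds coincide with the candidate next state, and when $\gamma=0$ they both collapse to zero, which is exactly the product $z=(\bar{A}x+\bar{b})\gamma$ expressed with linear constraints. If ${\bf M}$ were too small, the interval for $\gamma=0$ could become empty, which is why the bound on the state is needed.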

We summarize our results on the hyperplane method for switched affine systems in the next corollary, which is an immediate consequence of Proposition 2 and Theorems 1, 2.

Corollary 2 (The hyperplane method for switched affine systems).

Given system (16), let $x_{0}\in\mathbb{R}^{n}$ be the initial state and suppose that Assumption 2 holds. Let $v_{T}(c^{d})$ be computed as in (17). Then $\mathcal{R}^{out}_{T}(x_{0})$ and $\mathcal{R}^{y,out}_{T}(x_{0})$ as defined in Theorems 1 and 2 are outer approximations of $\mathcal{R}_{T}(x_{0})$ and $\mathcal{R}^{y}_{T}(x_{0})$ , respectively. $\square$

Note that in the case of switched affine systems it is not possible to recover an inner approximation, since there is no guarantee in general that the reachable set is convex. By computing the convex hull of the points $x_{K+1}$ in (17) for each direction $c$ one could however recover an inner approximation of the convex hull of $\mathcal{R}_{T}(x_{0})$ .
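The following minimal sketch illustrates why only an outer approximation is available when the reachable set is a finite, non-convex set of points. The point set and the number of directions are hypothetical, and the tangent constants are obtained here by direct enumeration rather than by the MILP (17).

```python
import math

def support(points, c):
    """v(c) = max over the point set of c^T x (the tangent hyperplane constant)."""
    return max(c[0] * x + c[1] * y for x, y in points)

def in_outer_approximation(q, points, directions, tol=1e-12):
    """True if q satisfies c^T q <= v(c) for every sampled direction c."""
    return all(c[0] * q[0] + c[1] * q[1] <= support(points, c) + tol
               for c in directions)

# Hypothetical finite reachable set: three isolated points in the plane.
reachable = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
directions = [(math.cos(2 * math.pi * d / 16), math.sin(2 * math.pi * d / 16))
              for d in range(16)]

inside = in_outer_approximation((0.4, 0.4), reachable, directions)
outside = in_outer_approximation((0.8, 0.8), reachable, directions)
print(inside, outside)
```

The point $(0.4,0.4)$ satisfies every halfspace constraint although it is not reachable, which is exactly the gap between the reachable set and its convex hull.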

IV Controlled stochastic biochemical reaction networks

A biochemical reaction network is a system comprising $S$ molecular species $Z_{1}$ , …, $Z_{S}$ that interact through $R$ reactions. Let $Z(t)=[Z_{1}(t),...,Z_{S}(t)]^{\top}$ be the vector describing the number of molecules present in the network for each species at time $t$ , that is, the state of the network at time $t$ . Since each reaction $r$ is a stochastic event [8], $Z(t)$ is a stochastic process. In the following, we always use the upper case to denote a process and the lower case to denote its realizations. For example, $z=[z_{1},...,z_{S}]^{\top}$ denotes a particular realization of the state $Z(t)$ of the stochastic process at time $t$ .

A typical reaction $r\in\mathbb{N}[1,R]$ can be expressed as

[TABLE]

where $\nu^{\prime}_{1r},\ldots,\nu^{\prime}_{Sr}\in\mathbb{N}$ and $\nu^{\prime\prime}_{1r},\ldots,\nu^{\prime\prime}_{Sr}\in\mathbb{N}$ are the coefficients that determine how many molecules for each species are respectively consumed and produced by the reaction. The net effect of each reaction can thus be summarized with the stoichiometric vector $\nu_{r}\in\mathbb{Z}^{S}$ , whose components are $\nu^{\prime\prime}_{sr}-\nu^{\prime}_{sr}$ for $s=1,\ldots,S$ (and can therefore be negative). We say that a reaction is of order $k$ if it involves $k$ reactant units (i.e., $\sum_{s=1}^{S}\nu^{\prime}_{sr}=k$ ) and we distinguish two classes of reactions:

- uncontrolled reactions that happen, in the infinitesimal interval $[t,t+dt]$ , with probability

[TABLE]

where $h_{r}(z)$ is a given function of the available molecules $z$ and $\theta_{r}\in\mathbb{R}_{\geq 0}$ is the so-called rate parameter;

- controlled reactions for which there exists an external signal $u_{r}(t)$ such that the reaction fires at time $t$ with probability

[TABLE]

In the following we refer to $\alpha_{r}(\theta_{r},z)$ as the propensity of the reaction and without loss of generality we assume that the controlled reactions are the first $Q$ ones. If $h_{r}(z):=\Pi_{s=1}^{S}\binom{z_{s}}{\nu^{\prime}_{sr}}$ we say that reaction $r$ follows the laws of mass action kinetics as derived in [8]. Our analysis can however be applied to generic functions $h_{r}(z)$ , allowing us to model different types of kinetics, such as Michaelis-Menten kinetics [34, Section 7.3].

To illustrate the following results, we consider a model of gene expression as a running example.

Example 1 (Gene expression reaction network).

Consider a biochemical network consisting of two species, the mRNA ( $M$ ) and the corresponding protein ( $P$ ), and the following reactions

[TABLE]

where the parameters $k_{r}$ and $k_{p}$ are the mRNA and protein production rates, while $\gamma_{r}$ and $\gamma_{p}$ are the mRNA and protein degradation rates, respectively. The empty set notation is used whenever a certain species is produced or degrades without involving the other species. In this context, $Z=[M,P]^{\top}$ , $z=[m,p]^{\top}$ , $\theta=[\theta_{1},\theta_{2},\theta_{3},\theta_{4}]^{\top}:=[k_{r},\gamma_{r},k_{p},\gamma_{p}]^{\top}$ and the stoichiometric matrix is

[TABLE]

In the case of mass action kinetics the propensities $\alpha_{r}(\theta_{r},z)$ can be further specified as $\alpha_{1}(k_{r},z)=k_{r},\ \alpha_{2}(\gamma_{r},z)=\gamma_{r}\cdot m,\ \alpha_{3}(k_{p},z)=k_{p}\cdot m,\ \alpha_{4}(\gamma_{p},z)=\gamma_{p}\cdot p.$ ∎
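The mass action propensities of Example 1 can be reproduced from the binomial formula for $h_{r}(z)$ with a short sketch. The reactant coefficients below are an assumption inferred from the propensities listed above (production of mRNA from nothing, mRNA degradation, translation, protein degradation), we assume $\alpha_{r}=\theta_{r}h_{r}(z)$, and the function names are hypothetical.

```python
from math import comb

# Reactant coefficients nu'_{sr} for species (M, P), assumed to be
# 0 -> M, M -> 0, M -> M + P, P -> 0 (inferred from the listed propensities).
REACTANTS = [(0, 0), (1, 0), (1, 0), (0, 1)]

def mass_action_h(z, reactants):
    """h_r(z) = prod_s binom(z_s, nu'_{sr})."""
    h = 1
    for z_s, nu in zip(z, reactants):
        h *= comb(z_s, nu)
    return h

def propensities(theta, z):
    """alpha_r(theta_r, z) = theta_r * h_r(z) for every reaction r."""
    return [th * mass_action_h(z, nu) for th, nu in zip(theta, REACTANTS)]

theta = [0.0236, 0.0503, 0.18, 0.0121]   # [k_r, gamma_r, k_p, gamma_p] from Section VII-A
alpha = propensities(theta, z=(3, 10))   # m = 3 mRNA copies, p = 10 proteins
print(alpha)
```

For reactions of order at most one the binomial coefficient reduces to $1$ or to a single molecule count, which is why all four propensities are affine in $z$.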

Note that since the propensity of each reaction depends only on the current state of the system, the process $Z(t)$ is Markovian. Let $p(z,t):=\mathbb{P}[Z(t)=z]$ be the probability that the realization of the process $Z$ at time $t$ is $z$ . Following the same procedure as in [8] one can derive a set of equations, known as the chemical master equation (CME), describing the evolution of $p(z,t)$ as a function of the external signal $u(t)$

[TABLE]

Since the previous set of equations depends on the external signal $u$ we refer to it as the controlled CME. Typical biochemical reaction networks involve many different species, whose counts can theoretically grow unbounded. Consequently, the controlled CME in (23) is a system of infinitely many coupled ordinary differential equations that cannot be solved, even for very simple systems. Several analytical and computational methods have been proposed in the literature to circumvent this difficulty, see [34, 35, 36] for a comprehensive review. In the following we limit our discussion to two methods: moment equations [37] and finite state projection (FSP) [13].

IV-A The moment equations

We start by considering the case when all the reactions follow the laws of mass action kinetics and are at most of order one. In this case for each reaction $r$ the propensity $h_{r}(z)$ is affine in the molecule counts vector $z$ and one can show that the moments equations are closed (i.e., the dynamics of moments up to any order $k$ do not depend on higher order moments), see for example [38]. Let $x_{\leq 2}(t)$ be a vector whose components are the moments of $Z(t)$ up to second order. From [38, Equations (6) and (7)] one gets

[TABLE]

Example 2.

Consider the gene expression model of Example 1. Assume that the reactions follow the mass action kinetics and that an external input signal influencing the first reaction, that is the mRNA production, is available (as in [1, 2, 3, 4, 5]), so that $\alpha_{1}(k_{r},z):=k_{r}\cdot u(t)$ . Set

[TABLE]

Then the moments evolution over time is expressed as

[TABLE]

where

[TABLE]

∎
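To give a feeling for the moment dynamics, the sketch below integrates only the first-order moments of the gene expression model, $\frac{d}{dt}\mathbb{E}[M]=k_{r}u-\gamma_{r}\mathbb{E}[M]$ and $\frac{d}{dt}\mathbb{E}[P]=k_{p}\mathbb{E}[M]-\gamma_{p}\mathbb{E}[P]$, which follow from the affine propensities. It is not the full second-order system of the example; the parameter values are those reported in Section VII-A, and the integrator, step size and input signals are illustrative assumptions.

```python
def mean_rhs(x, u, k_r, g_r, k_p, g_p):
    """Right-hand side of the mean equations for x = (E[M], E[P])."""
    m, p = x
    return (k_r * u - g_r * m, k_p * m - g_p * p)

def rk4_step(x, u, h, params):
    """One classical Runge-Kutta step with the input u held constant over the step."""
    def shifted(k, scale):
        return tuple(xi + scale * ki for xi, ki in zip(x, k))
    k1 = mean_rhs(x, u, *params)
    k2 = mean_rhs(shifted(k1, 0.5 * h), u, *params)
    k3 = mean_rhs(shifted(k2, 0.5 * h), u, *params)
    k4 = mean_rhs(shifted(k3, h), u, *params)
    return tuple(xi + h / 6.0 * (a + 2 * b + 2 * c + d)
                 for xi, a, b, c, d in zip(x, k1, k2, k3, k4))

def simulate_means(u_of_t, T, h, params, x0=(0.0, 0.0)):
    """Integrate the mean equations from x0 over [0, T] with step h."""
    x = x0
    for n in range(round(T / h)):
        x = rk4_step(x, u_of_t(n * h), h, params)
    return x

PARAMS = (0.0236, 0.0503, 0.18, 0.0121)   # k_r, gamma_r, k_p, gamma_p (1/min)
always_on = simulate_means(lambda t: 1.0, T=360.0, h=1.0, params=PARAMS)
pulsed = simulate_means(lambda t: 1.0 if t < 180.0 else 0.0,
                        T=360.0, h=1.0, params=PARAMS)
print(always_on, pulsed)
```

Because the input enters only through the constant term here, the mean dynamics are linear in $u$; the second-order moments can be appended to the state in the same way.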

Since the input $u(t)$ may appear in the entries of the $A$ matrix, the moment equations (24) are in general nonlinear. To overcome this issue we introduce the following assumption on the external signal $u(t)$ .

Assumption 3.

The external signal $u(t)$ can switch at most $K$ times within the set $\Sigma^{d}$ , as defined in Assumption 1, at preassigned switching instants $0=t_{0}<\ldots<t_{K+1}=T$ .

Assumption 3 imposes that the number of switchings and their timing during a given experiment are fixed a priori. This assumption can be motivated by the fact that changes in the external stimulus are costly and/or stressful for the cells. Moreover, it is trivially satisfied if the stimulus can only be changed simultaneously with some fixed events, such as culture dilution or measurements. The great advantage of Assumption 3 is that, as illustrated in the following remark, it allows us to rewrite the nonlinear moment equations (24) as a switched affine system so that the theoretical tools described in Section III-B can be applied.

Remark 2.

The set $\Sigma^{d}$ has finite cardinality $I:=\Pi_{r=1}^{m}q_{r}$ and we can enumerate its elements as $u^{i},i\in\mathbb{N}[1,I]$ . Consequently, for any fixed external signal $u(t)$ satisfying Assumption 3 we can construct a sequence of indices in $\mathbb{N}[1,I]$ such that, at any time $t$ , $\sigma(t)=i$ if and only if $u(t)=u^{i}$ . Such a switching sequence $\sigma$ satisfies Assumption 2. ∎
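The construction in Remark 2 amounts to indexing a Cartesian product. A minimal sketch follows; the input levels and the sampled signal are hypothetical, and the function names are ours.

```python
import itertools

def enumerate_modes(levels_per_input):
    """List the elements u^i of Sigma^d and map each one to its index i in N[1, I]."""
    modes = list(itertools.product(*levels_per_input))
    index_of = {u: i for i, u in enumerate(modes, start=1)}
    return modes, index_of

def switching_sequence(u_samples, index_of):
    """Translate the input value held on each interval into the mode index sigma."""
    return [index_of[tuple(u)] for u in u_samples]

# Hypothetical example: two inputs with q_1 = q_2 = 2 admissible levels each.
levels = [(0.0, 1.0), (0.5, 1.0)]
modes, index_of = enumerate_modes(levels)

# Input value held on each of three consecutive intervals.
u_samples = [(1.0, 0.5), (1.0, 0.5), (0.0, 1.0)]
sigma = switching_sequence(u_samples, index_of)
print(len(modes), sigma)
```

The number of modes is the product of the numbers of levels of the individual inputs, which is the cardinality $I$ used throughout.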

IV-B The finite state projection

Let us introduce a total ordering $\{z^{j}\}_{j=1}^{\infty}$ in the set of all possible state realizations $z\in\mathbb{N}^{S}$ . For the system in Example 1, we could for instance use the mapping

[TABLE]

where $(m,p)$ denotes the state with $m$ mRNA copies and $p$ proteins (see Fig. 2).

Following the same steps as in [13] and setting $P_{j}(t):=p(z^{j},t)$ (not to be confused with the symbol used to denote the amount of protein), the controlled CME in (23) can be rewritten as the nonlinear infinite dimensional system

[TABLE]

where $P(t)$ is an infinite dimensional vector with entries in $[0,1]$ . If the signal $u(t)$ satisfies Assumption 3, then (26) can be rewritten as an infinite dimensional linear switched system

[TABLE]

with switching signal $\sigma(t)$ constructed from $u(t)$ as detailed in Remark 2, $I=\Pi_{r=1}^{m}q_{r}$ modes and matrices $F_{i}:=F({u^{i}})$ . Note that system (27) can also be thought of as a Markov chain with countably many states $z^{j}\in\mathbb{N}^{S}$ and time-varying transition matrix $F_{\sigma(t)}$ .

As in the FSP method for the uncontrolled CME [13], one can try to approximate the behavior of the infinite Markov chain in (27) by constructing a reduced Markov chain that keeps track of the probability of visiting only the states indexed in a suitable set $J$ . To this end, let us define the reduced order system

[TABLE]

where $P_{J}(0)$ is the subvector of $P(0)$ corresponding to the indices in $J$ , and $[F]_{J}$ denotes the submatrix of $F$ obtained by selecting only the rows and columns with indices in $J$ . Note that while the full matrix $F_{\sigma(t)}$ is stochastic, the reduced matrix $\left[F_{\sigma(t)}\right]_{J}$ is substochastic. Consequently, the probability mass is in general not preserved in (28) (i.e. $\mathbbm{1}^{\top}{\bar{P}}_{J}(t)$ may decrease with time). From now on, we denote by $P(T;\sigma)$ and $\bar{P}_{J}(T;\sigma)$ the solutions at time $T$ of system (27) and system (28), respectively, when the switching signal $\sigma$ is applied. The dependence on the initial conditions $P(0)$ and $P_{J}(0)$ is omitted to keep the notation compact. As in the uncontrolled case, the truncated system (28) is a good approximation of the original system (27) if most of the probability mass lies in $J$ . However in the controlled case we need to guarantee that this happens for all possible switching signals. This intuition is formalized in the following assumption.

Assumption 4.

For a given finite set of state indices $J$ , an initial condition $P_{J}(0)$ , a given tolerance $\varepsilon>0$ and a finite instant $T>0$ ,

[TABLE]

Note that Assumption 4 holds if and only if

[TABLE]

This problem has the same structure as (11). Therefore, as illustrated in Section III-B, Assumption 4 can be checked by solving the MILP (17) for the switched affine system (28) by setting $c=\mathbbm{1}$ and ${\bf M}=\mathbbm{1}$ . Under Assumption 4, the following relation between the solutions of (27) and (28) holds.

Proposition 3 (FSP for controlled CME).

If Assumptions 2 and 4 hold, then for every switching signal $\sigma\in\mathcal{S}^{K}_{I}$ , it holds

[TABLE]

Proof.

This result has been proven in [13] for linear systems. We extend it here to the case of switched systems with $K$ switchings. Note that for any $i\in\mathbb{N}[1,I]$ , $F_{i}:=F({u^{i}})$ has non-negative off diagonal elements [13]. Hence, using the same argument as in [13, Theorem 2.1] it can be shown that for any index set $J$ , and any $\tau\geq 0$

[TABLE]

Consider an arbitrary switching signal $\sigma\in\mathcal{S}^{K}_{I}$ . We have

[TABLE]

Moreover, from $1=\sum_{j=1}^{\infty}P_{j}(T;\sigma)\geq\sum_{j\in J}P_{j}(T;\sigma)=\mathbbm{1}^{\top}P_{J}(T;\sigma)$ and Assumption 4, we get

[TABLE]

Combining (30) and (31) yields $0\leq\mathbbm{1}^{\top}P_{J}(T;\sigma)-\mathbbm{1}^{\top}\bar{P}_{J}(T;\sigma)\leq\varepsilon$ , thus $\|P_{J}(T;\sigma)-\bar{P}_{J}(T;\sigma)\|_{1}\leq\varepsilon$ . ∎
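For very short horizons, Assumption 4 can also be checked by brute force, which is useful as a sanity check on the MILP. The sketch below builds the truncated chain for the gene expression model with mass action propensities, integrates it for every switching sequence and returns the largest probability mass lost. The truncation bounds, the two-interval horizon, the Runge-Kutta integrator and all function names are illustrative assumptions, not the setup used in the case study.

```python
import itertools

K_R, G_R, K_P, G_P = 0.0236, 0.0503, 0.18, 0.0121   # Section VII-A values (1/min)
M_MAX, P_MAX = 6, 40                                # J: states with m < 6 and p < 40
STATES = [(m, p) for m in range(M_MAX) for p in range(P_MAX)]
INDEX = {z: j for j, z in enumerate(STATES)}

def transitions(u):
    """(source, destination or None, rate) triples of the truncated chain for input u."""
    out = []
    for (m, p), j in INDEX.items():
        moves = [((m + 1, p), K_R * u), ((m - 1, p), G_R * m),
                 ((m, p + 1), K_P * m), ((m, p - 1), G_P * p)]
        for z_next, rate in moves:
            if rate > 0.0:
                out.append((j, INDEX.get(z_next), rate))   # None: the jump leaves J
    return out

def rhs(P, trans):
    """dP/dt for the truncated system; mass flowing out of J is simply dropped."""
    dP = [0.0] * len(P)
    for src, dst, rate in trans:
        flow = rate * P[src]
        dP[src] -= flow
        if dst is not None:
            dP[dst] += flow
    return dP

def propagate(P, trans, duration, h=0.25):
    """Classical Runge-Kutta integration over one interval with a fixed mode."""
    for _ in range(round(duration / h)):
        k1 = rhs(P, trans)
        k2 = rhs([p + 0.5 * h * k for p, k in zip(P, k1)], trans)
        k3 = rhs([p + 0.5 * h * k for p, k in zip(P, k2)], trans)
        k4 = rhs([p + h * k for p, k in zip(P, k3)], trans)
        P = [p + h / 6.0 * (a + 2 * b + 2 * c + d)
             for p, a, b, c, d in zip(P, k1, k2, k3, k4)]
    return P

def worst_case_mass(inputs, n_intervals, interval):
    """Minimum over all switching sequences of the mass retained in J at the final time."""
    trans = [transitions(u) for u in inputs]
    worst = 1.0
    for seq in itertools.product(range(len(inputs)), repeat=n_intervals):
        P = [0.0] * len(STATES)
        P[INDEX[(0, 0)]] = 1.0                      # start from the zero state
        for i in seq:
            P = propagate(P, trans[i], interval)
        worst = min(worst, sum(P))
    return worst

eps = 1.0 - worst_case_mass(inputs=(0.0, 1.0), n_intervals=2, interval=15.0)
print(eps)
```

The enumeration over sequences is what the MILP reformulation avoids; the sketch is only practical when the number of intervals is small.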

V Analysis of the reachable set

We here show how the reachability tools of Sections II and III can be applied to the moment equation and FSP reformulations derived in Sections IV-A and IV-B, under different assumptions. Fig. 3 presents a conceptual scheme of this section.

V-A Reachable set of networks with affine propensities via moment equations

The methods developed in Sections II and III can be applied to the moments equations in (24) to approximate the desired projected reachable set. To illustrate the proposed procedure, we distinguish two cases depending on whether the external signal $u(t)$ influences reactions of order zero or one.

V-A1 Linear moments equations

We start by considering the case when all and only the reactions of order zero are controlled, so that $h_{r}(z)=1$ for $r\in\mathbb{N}[1,Q]$ and $h_{r}(z)={\nu_{r}^{\prime}}^{\top}z$ for $r\in\mathbb{N}[Q+1,R]$ . This is the simplest scenario since the system of moment equations given in (24) becomes linear

[TABLE]

see [38, Equations (6) and (7)]. Consequently, the theoretical results of Section III-A can be applied to (32) by setting $\sigma(t)\equiv u(t)$ . If the external signal $u\equiv\sigma$ satisfies Assumption 1, both inner and outer approximations of the reachable set can be computed by using Corollary 1.

V-A2 Switched affine moments equations

If reactions of order one are controlled then the external input $u(t)$ appears also in the entries of the $A$ matrix and system (24) is nonlinear. To overcome this issue we exploit Assumption 3. Specifically, let $\sigma(t)$ be the switching signal associated with $u(t)$ as described in Remark 2. Then (24) can be equivalently rewritten as the switched affine system

[TABLE]

with matrices $A_{i}:=A(u^{i}),b_{i}:=b(u^{i})$ , for all $i\in\mathbb{N}[1,I]$ . Consequently, the theoretical results of Section III-B can be applied to (33) and an outer approximation of the reachable set can be computed by using Corollary 2.

V-B Reachable set of networks with generic propensities via finite state projection

If the network contains reactions of order higher than one or if the reactions do not follow the laws of mass action kinetics, then $h_{r}(z)$ might be non-affine. In such cases, the arguments illustrated in the previous subsection cannot be applied. We here show how the FSP approximation of the CME derived in Section IV-B can be used to overcome this problem.

Firstly note that, from system (27), one can compute the evolution of the uncentered moments of $Z(t)$ as a linear function of $P(t)$ . (The reachable set for the centered moments can be immediately computed from the reachable set of the uncentered ones, since there is a bijective relation between the set of centered and uncentered moments up to any desired order.) For example, if we let $z_{s}^{j}$ be the amount of species $Z_{s}$ in the state $z^{j}$ , then the mean $\mathbb{E}[Z_{s}]$ of any species $s$ can be obtained as $l^{\top}P(t)$ , by setting $l:=\left[z_{s}^{1},\ z_{s}^{2},\ \ldots\right]^{\top}$ , and the second uncentered moment $\mathbb{E}[Z^{2}_{s}]$ can be obtained as $l^{\top}P(t)$ , by setting $l:=\left[(z_{s}^{1})^{2},\ (z_{s}^{2})^{2},\ \ldots\right]^{\top}.$ Consequently the desired projected reachable set coincides with the output reachable set of the infinite dimensional linear switched system (27) with linear output

[TABLE]

where $l^{1}$ and $l^{2}$ are the infinite vectors associated with any desired pair of moments. Note that $l^{1}$ and $l^{2}$ are non-negative.

Example 1 (cont.) With the ordering introduced at the beginning of the section, the uncentered protein moments up to order two can be computed as the output of (27) by setting

[TABLE]
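The linear read-out of the moments can be sketched on a finite list of states. The three states and their probabilities below are hypothetical and serve only to show how the vectors $l^{1}$ and $l^{2}$ are formed and applied.

```python
def moment_vectors(states, species):
    """Build l^1 (counts) and l^2 (squared counts) for the chosen species index."""
    l1 = [z[species] for z in states]
    l2 = [z[species] ** 2 for z in states]
    return l1, l2

def dot(l, P):
    """Inner product l^T P."""
    return sum(a * b for a, b in zip(l, P))

# Hypothetical distribution over three states (m, p).
states = [(0, 0), (0, 1), (1, 2)]
P = [0.5, 0.3, 0.2]

l1, l2 = moment_vectors(states, species=1)   # species index 1 is the protein
mean = dot(l1, P)                            # first uncentered moment
second = dot(l2, P)                          # second uncentered moment
variance = second - mean ** 2                # centered moment recovered afterwards
print(mean, second, variance)
```

Both vectors have non-negative entries, and the variance is recovered from the two uncentered moments by the usual identity.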

Let $l_{j}^{1}$ and $l_{j}^{2}$ be the $j$ -th components of the vectors $l^{1}$ and $l^{2}$ , respectively, as defined in (34). For a given species of interest $s$ and set $J$ , we denote by

[TABLE]

the moments associated with $l^{1}$ and $l^{2}$ conditioned on the fact that $Z(t)$ is in $J$ and the switching signal $\sigma$ is applied. For example if one is interested in the mean and second order moment of a specific species $Z_{s}(t)$ we get $y_{1}(t;\sigma)=\mathbb{E}\left[Z_{s}(t)\mid Z(t)\in J,\sigma(\cdot)\right]$ and $y_{2}(t;\sigma)=\mathbb{E}\left[Z^{2}_{s}(t)\mid Z(t)\in J,\sigma(\cdot)\right]$ . The aim of this section is to obtain an outer approximation of the output reachable set of the infinite system (27) with the nonlinear output (36), by using computations involving only the finite dimensional system (28). To this end, we define the two entries of the output of the finite dimensional system as

[TABLE]

Theorem 3.

Suppose Assumptions 3 and 4 hold. Let ${\mathcal{R}}_{T}^{y}(x_{0})$ be the output reachable set at time $T>0$ of system (27) with output (36). Choose $D$ values $\gamma^{d}\in\mathbb{R}$ and set $c^{d}:=(\bar{l}^{2})-\gamma^{d}(\bar{l}^{1})\in\mathbb{R}^{n}$ , with $\bar{l}^{1},\bar{l}^{2}$ as in (37). Set

[TABLE]

where $\bar{v}_{T}(c^{d})$ is the constant that makes the hyperplane $\textup{H}_{T}(c^{d})$ in (5) tangent to the reachable set of the finite system (28) (i.e. $\bar{v}_{T}(c^{d})$ can be computed as in (17)) and

[TABLE]

with $\varepsilon$ as in Assumption 4. Then the set $\mathcal{R}_{T}^{y,out}(x_{0}):=\cap_{d=1}^{D}\{\mathcal{H}_{T}^{y}(\gamma^{d})\}$ is an outer approximation of $\mathcal{R}_{T}^{y}(x_{0})$ . ∎

Proof.

Firstly note that if the external signal $u$ satisfies Assumption 3 then the corresponding switching signal $\sigma(t)$ (constructed as in Remark 2) satisfies Assumption 2. Let $\bar{\mathcal{R}}_{T}^{y}(x_{0})$ be the output reachable set of the finite dimensional system (28) with output (37). Proposition 2 guarantees that for any direction $c^{d}$ the constant $\bar{v}_{T}(c^{d})$ that makes

[TABLE]

tangent to $\bar{\mathcal{R}}_{T}^{y}(x_{0})$ can be computed by solving the MILP (17) for system (28). The main idea of the proof is to show that if we shift the halfspace $\bar{\mathcal{H}}^{y}_{T}(\gamma^{d})$ by a suitably defined constant $\delta(\gamma^{d})$ we can guarantee that the original reachable set ${\mathcal{R}}_{T}^{y}(x_{0})$ is a subset of the shifted halfspace ${\mathcal{H}}^{y}_{T}(\gamma^{d})$ defined in the statement. The result then follows since $\mathcal{R}_{T}^{y,out}(x_{0})$ is defined as the intersection of halfspaces containing $\mathcal{R}_{T}^{y}(x_{0})$ .

To derive the constant $\delta(\gamma^{d})$ we start by focusing on the first component of the output and for simplicity we will omit the dependence on $(T;\sigma)$ in $P_{j},\bar{P}_{j},y$ and $\bar{y}$ . Take any switching signal $\sigma\in\mathcal{S}^{K}_{I}$ . By taking into account the following conditions: (1) $l_{j}^{1}\geq 0$ for all $j\in J$ ; (2) $P_{j}\geq\bar{P}_{j}$ for all $j\in J$ , due to Proposition 3, and (3) $\sum_{j\in J}P_{j}\leq 1$ , we get $y_{1}\geq\bar{y}_{1}$ . Consequently, at time $t=T$ we have

[TABLE]

where we used $\sum_{j\in J}P_{j}\geq\sum_{j\in J}\bar{P}_{j}\geq 1-\varepsilon$ (due to Assumption 4), and $P_{j}\geq\bar{P}_{j},\|P_{J}-\bar{P}_{J}\|_{1}\leq\varepsilon$ (following from Proposition 3). To summarize, $\bar{y}_{1}\leq y_{1}\leq\bar{y}_{1}+\|\bar{l}^{1}\|_{\infty}\frac{2\varepsilon}{1-\varepsilon}.$ Similarly, it can be proven that $\bar{y}_{2}\leq y_{2}\leq\bar{y}_{2}+\|\bar{l}^{2}\|_{\infty}\frac{2\varepsilon}{1-\varepsilon}.$ Consider any pair $(y_{1},y_{2})\in{\mathcal{R}}_{T}^{y}(x_{0})$ and the associated pair $(\bar{y}_{1},\bar{y}_{2})\in\bar{\mathcal{R}}_{T}^{y}(x_{0})$ (i.e. the two output pairs obtained from (27) and (28) when the same $\sigma$ is applied). Note that $(\bar{y}_{1},\bar{y}_{2})\in\bar{\mathcal{R}}_{T}^{y}(x_{0})$ implies $(\bar{y}_{1},\bar{y}_{2})\in\bar{\mathcal{H}}^{y}_{T}(\gamma^{d})$ for any $\gamma^{d}.$ The previous relations then imply that if $\gamma^{d}\geq 0$ ,

[TABLE]

On the other hand, when $\gamma^{d}<0$

[TABLE]

Therefore for every signal $\sigma$ and every $\gamma^{d}$ it holds $y_{2}(T;\sigma)\leq\gamma^{d}y_{1}(T;\sigma)+\bar{v}_{T}(c^{d})+\delta(\gamma^{d})$ and consequently $[y_{1}(T;\sigma),y_{2}(T;\sigma)]^{\top}\in{\mathcal{H}}^{y}_{T}(\gamma^{d})$ . ∎

VI Analysis of single cell realizations

The previous analysis focused on characterising what combinations of moments of the stochastic biochemical reaction network are achievable by using the available external input. In this section, we change perspective and instead of looking at population properties we focus on single cell trajectories. Specifically, we are interested in characterising the probability that a single realization of the stochastic process will satisfy a specific property at the final time $T$ (e.g. the number of copies of a certain species is higher/lower than a certain threshold) when starting from an initial condition $P(0)$ . Note that we can start either deterministically from a given state $z^{i}$ (by setting $P(0)=e_{i}$ ) or stochastically from any state according to a generic vector of probabilities $P(0)$ . To define the problem let us call $\mathcal{T}$ the target set, that is, the set of all indices $i$ associated with a state $z^{i}$ in the Markov chain (26) that satisfies the desired property. Note that this set might be of infinite size. We restrict our analysis to external signals satisfying Assumption 3, so that we can map the external signal $u$ to the switching signal $\sigma$ , as detailed in Remark 2. For a fixed signal $\sigma$ the solution of (27) immediately allows one to compute the probability that the state at time $T$ belongs to $\mathcal{T}$ , and thus has the desired property, as $\mathcal{P}_{\mathcal{T}}(\sigma):=\mathbbm{1}_{\mathcal{T}}^{\top}P(T;\sigma)$ where $\mathbbm{1}_{\mathcal{T}}$ is an infinite vector that has the $i$ th component equal to $1$ if $i\in\mathcal{T}$ and $0$ otherwise. Our objective is to select the switching signal $\sigma(t)$ (and thus the external signal $u(t)$ ) that maximizes the probability $\mathcal{P}_{\mathcal{T}}(\sigma)$ . (Note that one can use the same tools to maximize the probability of avoiding a given set $\mathcal{D}$ by maximizing the probability of being in $\mathcal{T}=\mathcal{D}^{c}$ .) That is, we aim at solving

[TABLE]

where $I$ is the cardinality of $\Sigma^{d}$ as in Remark 2. Note that $\mathcal{P}_{\mathcal{T}}(\sigma)$ in (38) is computed according to $P(T;\sigma)$ , which is an infinite dimensional vector. In the next theorem we show how to overcome this issue and approximately solve (38) by using the FSP approach of Proposition 3 and the reformulation as MILP given in Proposition 2. To this end, let

[TABLE]

where $\bar{\mathcal{P}}_{\mathcal{T}}(\sigma):=\bar{\mathbbm{1}}_{\mathcal{T}}^{\top}\bar{P}_{J}(T;\sigma)$ is the probability that the final state of the reduced Markov chain (28) belongs to $\mathcal{T}\cap J$ at time $T$ given the switching signal $\sigma$ , and $\bar{\mathbbm{1}}_{\mathcal{T}}$ is a vector of size $|J|$ that has $1$ in the positions corresponding to states of $J$ that belong also to $\mathcal{T}$ , and $0$ otherwise.

Theorem 4.

Suppose that Assumptions 3 and 4 hold. Then

[TABLE]

Moreover (39) can be solved by solving the MILP in (17) for system (28) with $c=\bar{\mathbbm{1}}_{\mathcal{T}}$ and $\bf{M}=\mathbbm{1}$ . ∎

Proof.

Under Assumptions 3 and 4, for any set $\mathcal{T}$ and any signal $\sigma$ , we get

[TABLE]

and $\mathbbm{1}_{\mathcal{T}}^{\top}P=\sum_{i\in\mathcal{T}}P_{i}\geq\sum_{i\in\mathcal{T}\cap J}P_{i}\geq\sum_{i\in\mathcal{T}\cap J}\bar{P}_{i}=\bar{\mathbbm{1}}_{\mathcal{T}}^{\top}\bar{P},$ where we used Assumption 4 and Proposition 3 and we omitted $(T;\sigma)$ for simplicity. To sum up, for each $\sigma$ ,

[TABLE]

By imposing $\sigma=\sigma^{\star}$ we get $\mathcal{P}^{\star}_{\mathcal{T}}=\mathcal{P}_{\mathcal{T}}(\sigma^{\star})\leq\bar{\mathcal{P}}_{\mathcal{T}}(\sigma^{\star})+2\varepsilon\leq\bar{\mathcal{P}}_{\mathcal{T}}(\bar{\sigma}^{\star})+2\varepsilon.$ By imposing $\sigma=\bar{\sigma}^{\star}$ we get $\bar{\mathcal{P}}_{\mathcal{T}}(\bar{\sigma}^{\star})\leq\mathcal{P}_{\mathcal{T}}(\bar{\sigma}^{\star}).$ Combining the last two inequalities we get the desired bound. The last result can be proven as in Proposition 2. Note that $\bar{P}$ is a vector of probabilities, hence we can set $\bf{M}=\mathbbm{1}$ . ∎
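On a small finite chain, the optimization over switching signals can be illustrated by direct enumeration. The three-state chain, its two modes and the function names below are hypothetical, and enumeration replaces the MILP only because the example is tiny.

```python
import itertools
import math

def rhs(P, F):
    """dP/dt = F P, with F[i][j] the transition rate from state j to state i."""
    n = len(P)
    return [sum(F[i][j] * P[j] for j in range(n)) for i in range(n)]

def propagate(P, F, duration, steps=200):
    """Classical Runge-Kutta integration over one interval with a fixed mode."""
    h = duration / steps
    for _ in range(steps):
        k1 = rhs(P, F)
        k2 = rhs([p + 0.5 * h * k for p, k in zip(P, k1)], F)
        k3 = rhs([p + 0.5 * h * k for p, k in zip(P, k2)], F)
        k4 = rhs([p + h * k for p, k in zip(P, k3)], F)
        P = [p + h / 6.0 * (a + 2 * b + 2 * c + d)
             for p, a, b, c, d in zip(P, k1, k2, k3, k4)]
    return P

def best_switching_signal(generators, P0, target, n_intervals, interval):
    """Enumerate all mode sequences and keep the one maximizing P(state in target)."""
    best_seq, best_prob = None, -1.0
    for seq in itertools.product(range(len(generators)), repeat=n_intervals):
        P = list(P0)
        for i in seq:
            P = propagate(P, generators[i], interval)
        prob = sum(P[j] for j in target)
        if prob > best_prob:
            best_seq, best_prob = seq, prob
    return best_seq, best_prob

# Hypothetical chain: mode 0 only allows 0 -> 1, mode 1 only allows 1 -> 2.
F_A = [[-1.0, 0.0, 0.0],
       [1.0, 0.0, 0.0],
       [0.0, 0.0, 0.0]]
F_B = [[0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0],
       [0.0, 1.0, 0.0]]

seq, prob = best_switching_signal([F_A, F_B], [1.0, 0.0, 0.0],
                                  target={2}, n_intervals=2, interval=1.0)
analytic = (1.0 - math.exp(-1.0)) ** 2    # closed form for the sequence (0, 1)
print(seq, prob, analytic)
```

Only the sequence that first moves mass to the intermediate state and then to the target reaches the target at all, so the order of the modes matters and not just how often each is used.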

VII The gene expression network case study

To illustrate our method we consider again the gene expression model of Example 1 and determine what combinations of the protein mean and variance are achievable starting from the zero state, under different assumptions on the external signal.

VII-A Single input

Consider the gene expression model with one external signal and reactions following the mass action kinetics, as described in Example 2. In this case, the moments equations are linear and the protein mean and variance can be obtained by taking as output matrix for the linear system (25)

[TABLE]

Depending on the experimental setup, the external signal $u(t)$ may take values in the set $\Sigma^{d}:=\{0,1\}$ , if the input is of the ON-OFF type [1, 2, 3, 5], or in the interval $\Sigma^{c}:=\left[0,1\right]$ , if the input is continuous [4]. Corollary 1 guarantees the validity of the following results both for $\Sigma^{d}$ and $\Sigma^{c}$ . The problem of computing an outer approximation of the reachable set of this system was studied in [27] using ad hoc methods. In Fig. 4 we compare the outer approximation obtained therein (magenta dashed/dotted line) with the inner (solid red) and outer (dashed blue) approximations that we obtained using the methods for linear moment equations of Section V-A1. We used the parameters $k_{r}=0.0236,\gamma_{r}=0.0503,k_{p}=0.18,\gamma_{p}=0.0121$ (all in units of min$^{-1}$) and set $T=360$ min. Figure 4 shows that the outer approximation computed using the hyperplane method is more accurate than the one previously obtained in the literature. Moreover, since inner and outer approximations practically coincide, this method allows one to effectively recover the reachable set.

VII-B Single input and saturation

As a second case study we consider again Example 2, but we now assume that not all the reactions follow the laws of mass action kinetics. Specifically, we are interested in investigating how the reachable set changes if we assume that the number of ribosomes in the cell is limited and consequently we impose a saturation on the translation propensity. Following [39], we assume that the translation rate follows the Michaelis-Menten kinetics so that

[TABLE]

For the simulations we impose $\tilde{k}_{p}=0.7885,b=0.06,a=0.02$ , so that the maximum reachable protein mean is the same as in the case without saturation analysed in the previous subsection. The corresponding propensity function is illustrated in Fig. 5a). All the other propensities are assumed as in Section VII-A. Note that in this case the propensities are not affine. Consequently, we estimate the reachable set by using the FSP approach derived in Theorem 3. Specifically we consider as set $J$ the indices corresponding to states with fewer than $6$ mRNA copies and fewer than $40$ protein copies. By assuming $T=360$ min and that $u$ can switch every $30$ minutes in the set $\Sigma^{d}=\{0,1\}$ , we obtain an error $\varepsilon=2.84\cdot 10^{-4}$ . Fig. 5b) shows the comparison of the reachable sets obtained for the cases with and without saturation. From this plot it emerges that, for the chosen values of parameters, saturation leads to a decrease of variability in the population.
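To put the reported error in perspective, the gap $\|\bar{l}\|_{\infty}\frac{2\varepsilon}{1-\varepsilon}$ from the proof of Theorem 3 can be evaluated for this truncation. The sketch assumes that the species of interest is the protein and that the largest protein count in $J$ is $39$, so that $\|\bar{l}^{1}\|_{\infty}=39$ and $\|\bar{l}^{2}\|_{\infty}=39^{2}$; the function name is hypothetical.

```python
def fsp_moment_gap(l_inf_norm, eps):
    """Upper bound on y - y_bar from the proof of Theorem 3: ||l||_inf * 2*eps / (1 - eps)."""
    return l_inf_norm * 2.0 * eps / (1.0 - eps)

eps = 2.84e-4          # truncation error reported for the saturated model
p_max = 39             # assumed largest protein count retained in J

gap_mean = fsp_moment_gap(p_max, eps)           # bound on the first-moment gap
gap_second = fsp_moment_gap(p_max ** 2, eps)    # bound on the second-moment gap
print(gap_mean, gap_second)
```

Under these assumptions the first-moment gap is a few hundredths of a molecule, while the second-moment gap is larger because the squared counts enter the norm.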

VII-C Fluorescent protein and the two inputs case

Consider again Example 1, but now assume that:

1. mRNA production and degradation can both be controlled, so that the vector of propensities is $\alpha(z)=[k_{r}\cdot u_{1}(t),\ \gamma_{r}\cdot m\cdot u_{2}(t),\ k_{p}\cdot m,\ \gamma_{p}\cdot p]^{\top}$ and $u(t):=\left[\begin{smallmatrix}u_{1}(t)\\ u_{2}(t)\end{smallmatrix}\right]$ ;

2. the protein $P$ can mature into a fluorescent protein $F$ according to the additional maturation and degradation reactions

[TABLE]

where $\alpha_{5}(k_{f},z):=k_{f}\cdot p,\quad\alpha_{6}(\gamma_{p},z):=\gamma_{p}\cdot f$ and $k_{f}>0$ is the maturation rate. For simplicity, the degradation rate of $F$ is assumed to be the same as that of $P$ ;

3. the fluorescence intensity $I(t)$ of each cell can be measured and is proportional to the amount of fluorescent protein, that is, $I(t)=rF(t)$ for a fixed scaling parameter $r>0$ .

Since all the propensities are affine, the system describing the evolution of means and variances of the augmented network is

[TABLE]

where the state vector $x_{\leq 2}(t)$ and $A^{f}({u(t)}),b^{f}({u(t)})$ are

[TABLE]

System (40) depends on the parameter vector $\theta=[k_{r},\gamma_{r},k_{p},\gamma_{p},k_{f},r]$ (for more details see [40, Supplementary Information p. 16]). For the parameters we use the MAP estimates identified in [28] (all in min$^{-1}$)

[TABLE]

and we set

[TABLE]

to compute the mean and variance reachable set for the fluorescence intensity.

Our first aim is to compare the reachable set of such extended model with experimental data, when only one external signal (“1in”) is available. In the case of one input, (40) is a linear system and the methods of Section V-A1 can be applied. Fig. 6a) shows the estimated reachable set compared with the real data collected in [2].

Our second goal is to investigate how the reachable set changes when both mRNA production and degradation are controlled (“2in”), as studied in [41]. Note that in this case, system (40) is nonlinear. We therefore set $T=300$ min and assume that switchings can occur every $20$ min, so that Assumption 3 is satisfied with $K=15$ and use the hyperplane method as described in Section V-A2 with input sets

[TABLE]

respectively. Note that we set the minimum input for the mRNA degradation to $0.5>0$ to avoid unboundedness. With these input choices it is intuitive that the largest possible state is reached when the mRNA production is at its maximum and the mRNA degradation is at its minimum. Therefore, in the MILPs we can use the bounds ${\bf M}=x\left(T;0,u(t)=\left[\begin{smallmatrix}1\\ 0.5\end{smallmatrix}\right]\,\forall t\right)$ for the case of two inputs and ${\bf M}=x(T;0,u(t)=\left[\begin{smallmatrix}1\\ 1\end{smallmatrix}\right]\,\forall t),$ for the case of one input. Fig. 6b) shows the output reachable set for the case of two inputs. The simulation time for computing the outer approximation with the hyperplane method was $5.6$ hrs. Computing the exact reachable set by simulating all the possible switching signals, assuming that one simulation takes $10^{-4}$ sec and neglecting the time needed to enumerate all possible signals, would take $29.8$ hrs. The black crosses in Fig. 6b) are obtained by simulating the output of the system for $5000$ randomly constructed input signals. This simulation illustrates that random sampling may significantly underestimate the reachable set. Fig. 6c) shows a comparison of the reachable sets obtained in Fig. 6a) and b) when the input set is $\Sigma^{\textup{1in}}$ and $\Sigma^{\textup{2in}}$ , respectively.
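The enumeration time quoted above can be reproduced with a one-line count. The sketch assumes four modes (two admissible levels for each of the two inputs) and fifteen intervals of $20$ min over $T=300$ min; these counts are our reading of the setup and are stated here as assumptions.

```python
def enumeration_hours(n_modes, n_intervals, seconds_per_run):
    """Time needed to simulate every switching signal once, in hours."""
    n_signals = n_modes ** n_intervals
    return n_signals * seconds_per_run / 3600.0

# Assumed: 4 modes, 15 intervals, 1e-4 s per simulation.
hours = enumeration_hours(n_modes=4, n_intervals=15, seconds_per_run=1e-4)
print(round(hours, 1))
```

Each additional interval multiplies this figure by the number of modes, so the enumeration becomes impractical well before the horizon becomes long.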

VIII Conclusion

In this paper we have: i) proposed a method to approximate the projected reachable set of switched affine systems with fixed switching times, ii) extended the FSP approach to controlled networks, iii) illustrated how these new theoretical tools can be used to analyse generic networks both from a population and single cell perspective and iv) provided an extensive gene expression case study using both in silico and in vivo data. Even though our analysis is motivated by biochemical reaction networks, our results can actually be applied to study the moments of any Markov chain with transition rates that switch among $I$ possible configurations at $K$ fixed instants of time. Our results hold both in the case of finite and infinite state space. Moreover, while we have assumed here that cells are identical, we showed in [29] that also in the case of heterogeneous populations one can derive equations describing the moments evolution. The reachable set of such populations can be obtained, as described in this paper, by applying Corollary 1 or 2 to such a system.

Bibliography

[1] A. Milias-Argeitis, S. Summers, J. Stewart-Ornstein, I. Zuleta, D. Pincus, H. El-Samad, M. Khammash, and J. Lygeros, “In silico feedback for in vivo regulation of a gene expression circuit,” Nature Biotechnology, vol. 29, pp. 1114–1116, 2011.
[2] J. Ruess, F. Parise, A. Milias-Argeitis, M. Khammash, and J. Lygeros, “Iterative experiment design guides the characterization of a light-inducible gene expression circuit,” Proceedings of the National Academy of Sciences of the USA, vol. 112, no. 26, pp. 8148–8153, 2015.
[3] E. J. Olson, L. A. Hartsough, B. P. Landry, R. Shroff, and J. J. Tabor, “Characterizing bacterial gene circuit dynamics with optically programmed gene expression signals,” Nature Methods, vol. 11, pp. 449–455, 2014.
[4] J. Uhlendorf, A. Miermont, T. Delaveau, G. Charvin, F. Fages, S. Bottani, G. Batt, and P. Hersen, “Long-term model predictive control of gene expression at the population and single-cell levels,” Proceedings of the National Academy of Sciences of the USA, vol. 109, no. 35, pp. 14271–14276, 2012.
[5] F. Menolascina, M. Di Bernardo, and D. Di Bernardo, “Analysis, design and implementation of a novel scheme for in-vivo control of synthetic gene regulatory networks,” Automatica, pp. 1265–1270, 2011.
[6] G. Batt, H. De Jong, M. Page, and J. Geiselmann, “Symbolic reachability analysis of genetic regulatory networks using discrete abstractions,” Automatica, vol. 44, no. 4, pp. 982–989, 2008.
[7] N. Chabrier and F. Fages, “Symbolic model checking of biochemical networks,” in Computational Methods in Systems Biology, C. Priami, Ed. Springer, 2003, pp. 149–162.
[8] D. T. Gillespie, “A rigorous derivation of the chemical master equation,” Physica A: Statistical Mechanics and its Applications, pp. 404–425, 1992.
