A finite state projection algorithm for the stationary solution of the chemical master equation
Ankit Gupta, Jan Mikelson, Mustafa Khammash

TL;DR
This paper introduces the stationary Finite State Projection (sFSP) algorithm, enabling efficient approximation of stationary solutions of the chemical master equation in systems biology, even for very large state-spaces.
Contribution
The paper develops the sFSP method, extending FSP to accurately estimate stationary distributions of CMEs using finite linear systems and tensor train techniques.
Findings
sFSP provides accurate stationary distribution approximations
Error bounds can be minimized by expanding the truncated state-space
Efficiently solves problems with over 100 million states
Abstract
The chemical master equation (CME) is frequently used in systems biology to quantify the effects of stochastic fluctuations that arise due to biomolecular species with low copy numbers. The CME is a system of ordinary differential equations that describes the evolution of probability density for each population vector in the state-space of the stochastic reaction dynamics. For many examples of interest, this state-space is infinite, making it difficult to obtain exact solutions of the CME. To deal with this problem, the Finite State Projection (FSP) algorithm was developed by Munsky and Khammash (Jour. Chem. Phys. 2006), to provide approximate solutions to the CME by truncating the state-space. The FSP works well for finite time-periods but it cannot be used for estimating the stationary solutions of CMEs, which are often of interest in systems biology. The aim of this paper is to…
| Iteration | Cut-offs | State-space size | Convergence factor | CPU Time (seconds) | ||
|---|---|---|---|---|---|---|
| Constructing | Finding | |||||
| 13.6 | 18.7 | |||||
| 27.5 | 46.4 | |||||
| 40.4 | 75.6 | |||||
| 53.6 | 110.7 | |||||
| 68.3 | 159.6 | |||||
| Iteration | Cut-offs | State-space size | Convergence factor | CPU Time (seconds) | ||
|---|---|---|---|---|---|---|
| Constructing | Finding | |||||
| 7.6 | 13.1 | |||||
| 14.8 | 24.9 | |||||
| 23.3 | 37.2 | |||||
| 30.63 | 55.9 | |||||
| 37.66 | 77.1 | |||||
| Iteration | Cut-offs | State-space size | Convergence factor | CPU Time (seconds) | ||
|---|---|---|---|---|---|---|
| Constructing | Finding | |||||
| 0.00062 | 0.0344 | |||||
| 0.00117 | 0.0566 | |||||
| 0.001719 | 0.0517 | |||||
| 0.002906 | 0.0519 | |||||
| 0.003498 | 0.0464 | |||||
| 0.00255 | 0.0486 | |||||
| No. | Reaction | Propensity |
|---|---|---|
| 1 | ||
| 2 | ||
| 3 | ||
| 4 | ||
| 5 | ||
| 6 | ||
| 7 | ||
| 8 | ||
| 9 | ||
| 10 | ||
| 11 | ||
| 12 | ||
| 13 | ||
| 14 | ||
| 15 | ||
| 16 | ||
| 17 | ||
| 18 |
| Iteration | Upper bounds | State-space size | Convergence factor | CPU Time (minutes) | |
|---|---|---|---|---|---|
| mRNAs | Proteins | t | |||
| 3.66 | |||||
| 6.77 | |||||
| 27.67 | |||||
| 60.09 | |||||
| 69.66 | |||||
| No. of SSA samples | CPU Time | |
|---|---|---|
| 0.5969 | 12 minutes | |
| 0.2461 | 117 minutes | |
| 0.091 | 1076 minutes |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
A finite state projection algorithm for the stationary solution of the chemical master equation
Ankit Gupta, Jan Mikelson and Mustafa Khammash
Department of Biosystems Science and Engineering
ETH Zurich
Mattenstrasse 26
4058 Basel, Switzerland
Abstract
The chemical master equation (CME) is frequently used in systems biology to quantify the effects of stochastic fluctuations that arise due to biomolecular species with low copy numbers. The CME is a system of ordinary differential equations that describes the evolution of probability density for each population vector in the state-space of the stochastic reaction dynamics. For many examples of interest, this state-space is infinite, making it difficult to obtain exact solutions of the CME. To deal with this problem, the Finite State Projection (FSP) algorithm was developed by Munsky and Khammash (Jour. Chem. Phys. 2006), to provide approximate solutions to the CME by truncating the state-space. The FSP works well for finite time-periods but it cannot be used for estimating the stationary solutions of CMEs, which are often of interest in systems biology. The aim of this paper is to develop a version of FSP which we refer to as the stationary FSP (sFSP) that allows one to obtain accurate approximations of the stationary solutions of a CME by solving a finite linear-algebraic system that yields the stationary distribution of a continuous-time Markov chain over the truncated state-space. We derive bounds for the approximation error incurred by sFSP and we establish that under certain stability conditions, these errors can be made arbitrarily small by appropriately expanding the truncated state-space. We provide several examples to illustrate our sFSP method and demonstrate its efficiency in estimating the stationary distributions. In particular, we show that using a quantized tensor train (QTT) implementation of our sFSP method, problems admitting more than 100 million states can be efficiently solved.
Keywords: stochastic reaction networks; the Chemical Master Equation; Finite State Projection; stationary distribution; ergodicity; irreducibility; tensor trains
Mathematical Subject Classification (2010): 60J22; 60J27; 60H35; 65C40; 92E20
1 Introduction
Many intracellular reaction networks consist of biomolecular species that are typically present in low copy-numbers. The reactions involving these species fire intermittently at random times, rather than continuously. Hence deterministic descriptions of the reaction dynamics, based on Ordinary Differential Equations (ODEs), become highly inaccurate [1]. It is now well-known that macroscopic properties of the system can be heavily influenced by the intrinsic noise or randomness that arises due to the random timing of reactions [2]. Consequently stochastic formulations of the reaction dynamics, based on continuous-time Markov chains (CTMCs), has become a popular approach for studying the effects of intrinsic noise [3]. In this paper we provide a tool for the analysis of such models.
In the CTMC model of a reaction network, the state at any time is the vector of copy-number counts of all the species. When the number of network species is , the dynamics evolves on a discrete state-space which is a subset of the -dimensional nonnegative integer lattice and this subset must be large enough to include all the states that are accessible by the random dynamics. The effects of intrinsic noise on the reaction network are generally studied using the probability distribution of the random state-vector at time . It is known that the time-evolution of this probability distribution is given by a system of coupled ODEs, known as the Chemical Master Equation (CME) in the literature (see (2.7)). For each state in there is an ODE in the CME that captures the inflow and outflow of probability at that state. If this state-space is finite, then the CME is a finite system of linear ODEs which can in principle be solved to yield the probability distribution . However in many examples of biological interest, the state-space is infinite, making the CME impossible to solve. A common approach in such cases is to estimate the CME solution by computing the empirical distribution of the samples obtained by simulating the CTMC using Monte Carlo methods such as Gillespie’s Stochastic Simulation Algorithm (SSA) [4]. This simulation-based approach can be very time-consuming and the estimates suffer from statistical errors due to finitely many samples being used. In particular the low-probability events are sparsely sampled by Monte Carlo simulations, which can lead to incorrect representations of the CME solution. Such problems can be avoided by using the Finite State Projection (FSP) method developed by Munsky and Khammash [5], that directly solves the CME by truncating the state-space to a manageable size (see Section 2.2). The solution obtained is approximate but FSP provides an iterative way to ensure that the approximation error is within some pre-specified tolerance level.
The truncated state-space needed by FSP to solve the CME accurately is still exorbitantly large for many problems of interest. For example, consider a simple gene-expression network where ten protein species are interacting with each other. Typically each protein in a cell has copy-numbers of the order of several thousands. So even if we have a conservative upper-bound of 1000 on the copy-number of each protein, the size of the state-space required for FSP is of the order of , which is beyond the computational and storage capacity of modern day computers. This combinatorial explosion in the state-space size is often called the “curse of dimensionality” and it presents a major challenge in making the CME practically solvable. Several advanced numerical techniques have been developed that address this challenge by adapting the FSP. These techniques include Krylov Subspace approximations [6], Tensor-Train representations [7], and using sparse grids and aggregation methods [8]. Unlike these methods which attempt to solve the exact version of CME, there also exist a body of methods that aim to solve simplified versions of CME, which are derived by approximating the CTMC dynamics by a Stochastic Differential Equation (SDE) or a Piecewise-deterministic Markov Process (PDMP) (see [9, 10, 11]). Such dynamical approximations only hold for finite time-periods, and the assumptions on species copy-numbers and reaction propensities they require, are not always satisfied by networks encountered in systems biology.
For many biological applications, one is interested in the steady-state behavior which is captured by the stationary probability distribution to which the solution of the CME converges to as . For CTMCs whose state-space is finite and not too large, estimation of the stationary distribution is a simple linear-algebraic problem (see (1.1)). However in situations where the state-space is very large or infinite, this linear-algebraic problem cannot be practically solved, and we need to estimate by other means. The methods mentioned above for estimating only work over finite time-intervals and they would generally fail to provide an accurate estimate of the stationary distribution . The reason for this failure depends on the method being used. The dynamical approximations based on PDMPs or SDEs introduce an error that can become unbounded in the limit , and the Monte Carlo simulation based approach for estimating is highly undesirable due to statistical errors and the computational costs associated with these simulations over large time-intervals. The FSP algorithm also cannot be used for estimating because this method introduces an absorbing state to catch all the transitions that leave the truncated state-space (see Figure 1B). However in the limit , all the probability mass flows into this absorbing state, and so the obtained probability distributions are unable to capture the true stationary distribution . We revisit this point later in this section and also explain it in detail in Section 2.2.
The aim of this paper is to present a FSP-like method for accurately estimating the stationary distribution . This method also involves truncating the state-space but rather than solving a linear system of ODEs for probabilities over the truncated state-space (as in FSP), our method estimates the true stationary distribution by computing the stationary distribution of a suitably defined CTMC over the truncated state-space. As the latter step can be accomplished by solving a linear-algebraic system, rather than a system of ODEs, the computational complexity of our method is much lower than that of FSP. Consequently it can be successfully applied on a larger class of networks. We call our method the stationary Finite State Projection (or sFSP) algorithm and we provide theoretical arguments to establish its accuracy under certain stability conditions which are usually satisfied by networks in systems and synthetic biology. Even though sFSP can be applied on larger systems than FSP, the combinatorial explosion of state-space sizes still limits the range of applicability of sFSP severely. As was the case with FSP, this issue can be somewhat resolved by adapting sFSP to work with quantized tensor-train (QTT) representations [7], sparse grids and aggregation methods [8]. We illustrate this point with a computational example where sFSP is applied to the QTT representation of the CME (see Section 5). We remark here that QTT representations have already been successfully employed for obtaining approximations of the stationary distribution for reaction networks satisfying certain graph-theoretic criteria [12, 13]. However these criteria are highly-restrictive and it will become evident that the sFSP based approach is more versatile.
We now describe the problem of estimating stationary distributions in more detail. Henceforth let denote the size of any set , and let and denote the vector of all zeros and all ones respectively111The dimension of these vectors will be clear from the context.. The stochastic model of a reaction network (see Section 2.1) represents the dynamics as a CTMC over a discrete state-space . Such a CTMC can be described by its transition rate matrix (see [14]), whose diagonal entries are non-positive, off-diagonal entries are non-negative and all the rows sum to zero. The stationary distribution for this CTMC can be described by a non-negative vector222Throughout the paper we assume that vector and matrix indices start from [math] rather than . , which is in the left null-space of transition rate matrix , i.e.
[TABLE]
and its components sum to (i.e. ). Such a stationary distribution may not be unique and if then it may not even exist (see [14]). In our recent work, we have dealt extensively with the issue of computationally verifying the existence and uniqueness of the stationary distribution corresponding to the CTMC models for a large class of biomolecular reaction networks (see [15] and [16]). Assuming that the existence and uniqueness of the stationary distribution has been ascertained for the network, our aim here is to estimate numerically. We are primarily interested in situations where is infinite, and so the direct computation of using (1.1) is computationally impossible.
It is natural to try to estimate by solving a finite, truncated version of the linear-algebraic system (1.1). This truncated version can be obtained by first identifying a truncated state-space and then projecting the CTMC dynamics on this truncated state-space. Thereafter the stationary distribution for the projected CTMC, found by solving the corresponding linear-algebraic system of the form (1.1), serves as an estimate of the true stationary distribution . An important issue that arises here is how to handle the outgoing transitions from the truncated state-space, so that the obtained estimate of is accurate. In the FSP approach [5], these outgoing transitions are preserved but their target states are collapsed into a single absorbing state (see Figure 1B). This leads to the “probability leakage” problem which can be managed over finite time-intervals but not in the asymptotic regime. This problem manifests itself in the fact that the only stationary distribution for the projected CTMC would be the one that puts all the probability-mass at the absorbing state. Obviously this does not capture the true stationary distribution and hence modifications to the FSP approach are necessary to circumvent the probability-leakage problem. One such modification that has been tried is motivated by the use of “reflected” boundary conditions in the study of Fokker-Planck equations [12]. In this approach all the outgoing transitions from the truncated state-space are simply eliminated by setting their propensities to zero. It has been shown that this reflected version of FSP yields accurate estimates of the stationary distribution for some reaction network examples [17, 12]. However there is no theoretical guarantee that this approach will work well in general.
The method sFSP that we present in this paper modifies the FSP in another way. It preserves the outgoing transitions from the truncated state-space, but rather than channeling them to an absorbing state (as in FSP), it redirects them to a designated state within the truncated state-space (see Figure 1C). This modification is simple to implement and its appealing feature is that for a wide range of biomolecular reaction networks, we can theoretically guarantee that the stationary distribution of the projected CTMC converges to the actual stationary distribution as the truncated state-space expands to the full state-space . Moreover we derive bounds for the approximation error incurred by this approach, in terms of the outflow rate of all the outgoing transitions evaluated at the estimated stationary distribution (see Theorem 3.1). These results provide the theoretical basis for our method which expands the truncated state-space iteratively to recover a “good” approximation of . Note that our approach for estimating the stationary distribution is very different from the stochastic complementation approach proposed in [18]. This complementation approach is generally difficult to implement for infinite state-space CTMCs and it only yields the conditional stationary distribution which can then be used to derive upper and lower bounds for the true stationary probabilities. However such bounds are not guaranteed to be close to each other. In contrast, our method allows one to estimate the true stationary probabilities directly.
For our method sFSP to work we require that the original CTMC representing the reaction network satisfies a couple of stability conditions. The first condition is that the state-space needs to be irreducible i.e. all the states in must be accessible from each other via a sequence of positive-propensity reactions. The second condition is a Foster-Lyapunov criterion (see [19]) which ensures that the original CTMC is exponentially ergodic i.e. the solution of the CME converges to the stationary distribution exponentially fast. We elaborate these stability conditions in Section 3.1 and there we also explain how these conditions can be easily checked for a wide range of networks arising in systems and synthetic biology, using the computational procedures developed in our recent papers [15] and [16]. This makes the proposed sFSP method broadly applicable and of interest to the growing community of researchers working with stochastic models of biomolecular reaction networks.
This paper is organized as follows. In Section 2 we describe the stochastic model and the original FSP method [5]. In Section 3 we present and mathematically analyze our stationary Finite State Projection (or sFSP) algorithm. A simple implementation of sFSP is presented in Section 4 while its QTT implementation is presented in Section 5. These sections also include the computational examples that illustrate the respective implementations. Finally in Section 6 we conclude and discuss directions for future research.
2 Preliminaries
2.1 The stochastic model
We now formally describe the CTMC model of a reaction network. Suppose this network has species, called , and reactions of the form
[TABLE]
Here and are nonnegative integers denoting the number of molecules of species that are consumed and produced by the -th reaction. The state of the system at any time is the vector of molecular counts of all the species. When the -th reaction fires, the state is displaced by the stoichiometric vector whose -th component is . At any state , the rate of the -th reaction is , where is the propensity function for this reaction. Commonly mass action kinetics (see [3]) is assumed, where each is given by
[TABLE]
with the positive parameter being the associated rate constant. We model the reaction dynamics as a CTMC which jumps from state after a random waiting time which is exponentially distributed with rate , and this jump is in direction with probability . Formally this CTMC can be specified by its generator333The generator of a Markov process is an operator which specifies the rate of change of the probability distribution of the process (see Chapter 4 in [20] for details). defined as
[TABLE]
where is any bounded real-valued function on .
From now on we suppose that there is a nonempty state-space on which the CTMC evolves i.e.
[TABLE]
In other words, if at state , reaction has a positive probability of firing then the resulting state must also be in . As is at most countable, it can be enumerated. This means that we can find a one-to-one and onto map from to the set . Once such an enumeration is fixed, the set can be expressed as , where . Then the CTMC generator can be expressed as the transition rate matrix given by444Here we assume for convenience that all stoichiometry vectors (-s) are distinct.
[TABLE]
Let be the CTMC with this transition rate matrix and some initial state . For any state , let
[TABLE]
be the probability that the CTMC is in state at time . These probabilities evolve in time according to the Chemical Master Equation (CME) given by
[TABLE]
for each . Note that this system has as many ODEs as the number of elements in the state-space , which is generally infinite or very large.
Let be the probability distribution defined by
[TABLE]
for any . The vectorized form of w.r.t. enumeration is simply given by
[TABLE]
and using this form we can express the CME as
[TABLE]
If the number of states in is finite, then this first-order system can in principle be solved by exponentiating the matrix , i.e. the solution is given by
[TABLE]
where is the vectorized form of the probability distribution of the initial state . However this approach is infeasible for large state-spaces and in such cases, the Finite State Projection (FSP) method [5] can be used to approximately solve the CME (see Section 2.2).
In many biological applications, rather than the finite-time behavior, one is interested in the properties of the system after it has settled down, or in other words, the CTMC has reached a steady-state which is characterized by a stationary distribution satisfying (1.1), that is essentially a fixed-point for the CME (2.9). We say that the CTMC is ergodic if this fixed-point is unique and globally attracting in the sense that for any initial probability distribution , the solution of (2.9) satisfies
[TABLE]
where denotes the -distance555Generally ergodicity is defined using the total variation distance between probability distributions. However for a discrete state-space the total variation distance among probability distributions is exactly half of the distance computed using the norm. So we work with the norm in this paper. between probability measures and . Furthermore the CTMC is called exponentially ergodic if the convergence in (2.10) is exponentially fast, i.e. there exist positive constants and such that for any
[TABLE]
Here the constant may depend on the initial distribution but the constant does not (see [19] for example).
2.2 The Finite State Projection Algorithm
In the FSP method, approximate solutions of the CME (2.9) are obtained by restricting it to a truncated state-space. Suppose this truncated subset is given by a finite set of size . Using the same enumeration as in Section 2.1, we can express the set as . Letting to be the matrix formed by the rows and columns of matrix in the set , we approximate (2.9) by the -dimensional linear system
[TABLE]
The solution of this system is simply , where is the containing the components of vector in the set .
Let be the -dimensional vector of all ones. We assume that the initial state can only take values in and so . It is easy to check that all the rows of matrix have a non-positive sum, which implies that for any
[TABLE]
Results in [5] show that quantifies the “error” between the actual solution of CME and its approximation . For any fixed , this error decreases monotonically with increasing values of . Moreover as and the truncated state-space approaches the full state-space , we have . In the FSP algorithm of [5], the final time is fixed and the system (2.11) is solved in the time-interval with some truncated state-space . Thereafter the error is evaluated and if this value is below some pre-specified tolerance level , then the algorithm is terminated. Otherwise the truncated state-space is expanded to include more states and the same is process is repeated. After finitely many such iterations, the truncated state-space becomes large enough to ensure that the tolerance criterion is met.
Another way to formulate the FSP method is to consider the projected CTMC over the state-space , whose transitions among the states in are same as the original CTMC, but any outgoing transitions from the set are absorbed in the state (see Figure 1B), which serves as a proxy for all states in the set . Enumerating the elements of as , the transition rate matrix for this projected CTMC is given by
[TABLE]
where is the -dimensional column vector whose -th component is
[TABLE]
This choice of ensures that all the rows of matrix sum to [math] and hence is a valid transition rate matrix. Let be the solution of the CME (2.9) corresponding to rate matrix and with initial value . Then it can be shown that for any we can express as
[TABLE]
which proves that the FSP approximation error at time is exactly the amount of probability-mass that has been absorbed by the state in the time-interval .
One can show that typically for any fixed truncated state-space , we would have as , which says that eventually all the probability mass gets absorbed by the state . Therefore even if the original CTMC is ergodic and the solution of the CME (2.7) converges to as , the approximate solution obtained by solving the FSP system (2.11), will not be close to for large times and in fact converges to a vector of all zeros at . This is also evident from the stationary distribution that can be computed by finding a non-zero solution to the linear-algebraic system (1.1) corresponding to the matrix (see (2.14)). This would yield a stationary distribution of the form , which assigns all the mass to the absorbing state and hence cannot be close to . This shows that the FSP approach is not conducive for the estimation of stationary distribution .
3 The stationary FSP method
In this section we present our method sFSP for estimating the stationary distribution for the CTMC model of a reaction network. This is accomplished by constructing a projected CTMC over the truncated state-space and computing the stationary distribution of this new CTMC. Keeping the same notation as in Section 2.2, this projected CTMC over the truncated state-space is constructed by redirecting the transitions that leave to some designated state (see Figure 1C). Let the matrix and the vector be as in Section 2.2. Then the transition rate matrix for this CTMC is simply given by
[TABLE]
where corresponds to the address of the designated state (i.e. ) and is the vector whose -th component is and the rest are all zeros. Essentially, is formed by adding the non-negative vector to the -th column of matrix . All the rows of matrix sum to [math] and hence is a valid transition rate matrix and so our projected CTMC is well-defined. Our method sFSP estimates the stationary distribution by computing the finite-dimensional stationary distribution for the projected CTMC with transition rate matrix . Using , we can also compute the overall outflow rate at the estimated stationary distribution by
[TABLE]
This quantity will play a key role in bounding the sFSP approximation error whose direct computation is impossible.
3.1 Analysis of sFSP
The aim of this section is to demonstrate that under certain conditions, that are commonly satisfied by biological reaction networks, the sFSP approximation error can be made arbitrarily small by picking a truncated state-space , that is large enough. Moreover it is possible to check if is large enough by computing a convergence factor which is defined by suitably scaling the outflow rate . The main results of this section are collected in Theorem 3.1 and they provide the theoretical basis for our sFSP method.
Before we present our result we need to discuss some preliminary concepts. The state-space of the original CTMC is called irreducible if this CTMC has a positive probability of reaching any state in from any other state in , in a finite time. More formally, the state-space is irreducible, if for any we have for some . In our setting of reaction networks, this is equivalent to saying that between any two states there exists a sequence of positive-propensity reactions that takes the dynamics from to . For this to hold we must have and at each intermediate state the next reaction in the sequence () has a positive propensity of firing (). When only finitely many states are accessible by the reaction dynamics, irreducible state-spaces can be easily found by manipulating the transition rate matrix (see [21]). However when infinitely many states are accessible, finding irreducible state-spaces within the infinite lattice becomes a complicated task. In a recent work [15] we address this challenge and develop a computational procedure that can find all the irreducible state-spaces for a large class of biological reaction networks. In particular, for most networks of interest each irreducible state-space has the form666To obtain this form relabeling of species may be required.
[TABLE]
where is a finite set in , and are non-negative integers summing up to the total number of species . Here contains the dynamics of bounded species whose copy-numbers are required to satisfy a positive mass-conversation relation. A typical example is a gene-expression network where the gene of interest has many (say ) activity modes. To represent the dynamics we need to represent each such mode by a different network species, but all these species will be bounded and their copy-numbers will evolve in a finite set , because the gene of interest has a fixed copy-number (see the Pap-Switch example in Section 4.3 for instance). The species that are not bounded are free777Apart from free and bounded species, there may also exist another type of species, called restricted species, whose dynamics essentially mimics the dynamics of free species according to some affine map. However these restricted species can be easily eliminated to obtain a dynamically equivalent network and hence we ignore such species here (see [15] for more details). to have any copy-number and hence the state-space for their dynamics is taken to be the full non-negative integer orthant .
Note that the property of ergodicity (see Section 2.1) will obviously fail if there do not exist any stationary distributions or there exist more than one stationary distributions for the CTMC . If the state-space is finite, then its irreducibility is sufficient to guarantee that the stationary distribution exists uniquely and the CTMC is exponentially ergodic (see [21]). However when is infinite, its irreducibility can only guarantee the uniqueness of a stationary distribution but the existence of this distribution must be checked by other means, for example, using the results in [22] and [19]. In particular Theorem 7.1 in [19] guarantees the existence of a stationary distribution along with exponential ergodicity, if we can construct a function which is norm-like (i.e. as ) and for some , the following holds for all :
[TABLE]
where is the CTMC generator given by (2.4). This condition is called the Foster-Lyapunov criterion in the literature and it describes the tendency of the CTMC to experience a drift towards some finite set in the state-space with a force that is proportional to the distance from this finite set, measured according to . In [16] it is shown that for many biomolecular reaction networks, a linear Foster-Lyapunov function
[TABLE]
satisfying (3.19) can be constructed. Here is a positive vector which is chosen using simple Linear Programming and denotes the standard inner product in . Observe that for the linear function (3.20), the drift condition (3.19) is simply
[TABLE]
As demonstrated in [16], often for biological reaction networks the vector can be chosen in such a way that along with this drift condition, the following diffusivity condition is also satisfied - for some
[TABLE]
When (3.21) and (3.22) hold simultaneously, then in addition to exponential ergodicity, one can also guarantee other desirable properties like finiteness of all statistical moments of the stationary distribution and convergence of all the moments of the CTMC to their steady-state values as time approaches infinity (see Theorem 5 in [16]).
To study the sFSP approximation error we need to work with the norm prescribed by the Foster-Lyapunov function . For any signed measure on , this norm is given by
[TABLE]
Note that this norm is tighter than the norm because as . Let denote the boundary of the truncated state-space , which includes all those states in for which there exists a positive-propensity reaction that takes the dynamics outside , i.e.
[TABLE]
Based on the outflow rate given by (3.17), we define the convergence factor as
[TABLE]
where
[TABLE]
and is the designated state. Our next result will show that the convergence factor is a useful diagnostic tool to assess the approximation error of sFSP. Note that unlike the approximation error, can be explicitly computed from the sFSP output if the Foster-Lyapunov function is known. In situations where is unknown, the definition of can often be suitably modified to preserve its diagnostic purpose (see Remark 3.3).
We now come to the main result of our paper.
Theorem 3.1
Suppose that state-space is irreducible for the original CTMC with transition rate matrix , and there exists a Foster-Lyapunov function satisfying (3.19). Also assume that is a sequence of finite sets that is increasing (i.e. if ) and that covers the full state-space in the limit . Fix a designated state and let be the transition rate matrix of our projected CTMC with state-space , defined according to (3.16). Then we have the following:
- (A)
The stationary distribution for the projected CTMC exists uniquely.
- (B)
As , converges to the stationary distribution for the original CTMC, in the metric, i.e.
[TABLE]
- (C)
There exists a positive constant such that for any
[TABLE]
where is the convergence factor defined by (3.24).
- (D)
Suppose that the Foster-Lyapunov function has the linear form (3.20) and the positive vector is such that both (3.21) and (3.22) are satisfied. Furthermore assume that the sequence of sets grows uniformly w.r.t. function which means that for some constant we have
[TABLE]
where is the boundary of defined by (3.23). Then there exists a constant for which the converse of (3.27) also holds, i.e. for each
[TABLE]
Furthermore, the convergence factor converges to [math] as .
Remark 3.2
It will become evident from the proof that if the Foster-Lyapunov function and constants in (3.19) are known, then a constant satisfying part (C) can be explicitly computed using the results in Meyn and Tweedie [23]. Hence part (C) provides a computable upper-bound for the approximation error . Similarly the constant satisfying part (D) may be explicitly computed from constants in (3.21) and (3.22), and the constant that appears in (3.28). The tightness of the error bounds obtained from these explicitly computable constants remains to be investigated. Nevertheless parts (C) and (D) are useful in demonstrating that up to a constant, the magnitude of the uncomputable approximation error can be assessed by computing the convergence factor . In other words, if then , and similarly if then , where and are the optimal constants for which parts (C) and (D) hold.
Remark 3.3
Note that computation of the convergence factor (3.24) requires knowledge of the Foster-Lyapunov function which is undesirable from the point of view of applications. However it is possible to circumvent this problem, if one has information about the form of and the shape of finite sets . For this one needs to pick a sequence such that for some constants
[TABLE]
holds for each , with defined by (3.25). Then one can define the convergence factor as
[TABLE]
with the outflow rate given by (3.17), and parts (C) and (D) will hold with the substitutions, , and . For example, if has the linear form (3.20), then one can define in the same way as but with replaced by any norm on .
Proof. We start by proving part (A). The stationary distribution for the projected CTMC certainly exists because the transition rate matrix is finite (see [21]). This stationary distribution can be found by solving the linear-algebraic system (1.1) with transition-rate matrix . We now prove by contradiction the uniqueness of this stationary distribution. Suppose that this uniqueness does not hold. Then there would exist at least two disjoint non-empty irreducible state-spaces (say and ) for the projected CTMC within the state-space . This implies that if the projected CTMC starts in set then it remains in this set for all times, and there is a positive probability for this CTMC to reach any state in from any other state in in a finite time. The same holds true for set . Certainly one of these sets, say , will not contain the designated state but this leads to a contradiction due to the following reasons. Since the state-space is irreducible for the original CTMC, there exists a sequence of reactions that takes the original CTMC from any state to the designated state with a positive probability. If all the intermediate states that arise in this reaction path (recall -s from above) lie within the set , then the same sequence of reactions will also take the projected CTMC from state to state , which is a contradiction because is an irreducible state-space not containing . On the other hand if one of the intermediate states lies outside , then the last reaction, say , in the sequence that takes the dynamics outside will be redirected to the designated state in the projected CTMC and hence again we have a contradiction because is a positive-probability sequence of reactions that takes the projected CTMC from state to state . Therefore the stationary distribution for the projected CTMC is unique, and this completes the proof of part (A).
We now prove part (B). Clearly the assertion of part (B) is trivial when the full state-space is finite and so we assume that is infinite from now on. Let be a sequence of sets as stated in the proposition and let be an enumeration of satisfying
[TABLE]
Such an enumeration exists because is an increasing sequence of sets that cover the set in the limit . Note that as each is a finite set, condition (3.31) ensures that
[TABLE]
Now consider the -valued, one-dimensional process given by for each , where is the original CTMC with transition rate matrix and generator (see (2.4)). As is a one-to-one and onto map, the process is also a CTMC and its generator is given by
[TABLE]
where is a bounded real-valued function on and is the bounded real-valued function on defined by .
Irreducibility of state-space for implies the irreducibility of state-space for . Let be the norm-like Foster-Lyapunov function satisfying (3.19) and define the function by . Then using (3.32) and (3.19) we can deduce that is a norm-like function satisfying
[TABLE]
Therefore is a Foster-Lyapunov function for CTMC with generator and hence this CTMC is exponentially ergodic due to Theorem 7.1 in [19]. Let and be the probability distributions on and defined by
[TABLE]
Then is the stationary distribution for the CTMC and is the stationary distribution of this CTMC projected onto the finite state-space by redirecting all the outgoing transitions to the designated state . Theorem 3.3 in [24] proves
[TABLE]
using resolvent forms (see (3.40)). This limit is equivalent to (3.26) and this proves part (B).
We will now prove part (C). Without loss of generality we can assume that . Define an infinite vector
[TABLE]
whose first elements are , where denotes the northwest sub-matrix of . Recall that matrix is given by (3.16) and the outflow rate is defined by (3.17). As we can write as
[TABLE]
which shows that the vector has only one non-zero entry which is equal to and it is at the position corresponding to the designated state . Since we have which implies that
[TABLE]
One can check that all entries of the infinite vector are non-negative and only those entries are non-zero that correspond to the states in the boundary set (see (3.23)) of . Therefore, viewing as a signed measure over , we can express it as
[TABLE]
where and are probability measures on , supported on and respectively. With a slight abuse of notation, we will denote the vector-version of also as .
Define the sFSP approximation error in vector form as
[TABLE]
and since we get the following equation from (3.37)
[TABLE]
One can verify that is the unique solution of this linear system with the constraint . For any , let denote the -resolvent matrix corresponding to the transition rate matrix . It is defined by
[TABLE]
where denotes the identity matrix. It is known (see [24]) that is a positive matrix satisfying , and
[TABLE]
One can regard as the transition matrix of a discrete-time Markov chain over whose unique stationary distribution is .
Expressing the Foster-Lyapunov function as the vector we can write the drift condition (3.19) as
[TABLE]
This relation along with (3.41) and the positivity of implies
[TABLE]
Letting and we obtain
[TABLE]
Note that . Theorem 6.1 in [23] shows that we can explicitly compute constants and , such that for any probability distribution over we have
[TABLE]
where denotes the -th power of the matrix . Transposing (3.39), multiplying both sides by and using (3.41) and (3.38) we get
[TABLE]
One can write as
[TABLE]
where is the solution to
[TABLE]
for . This solution can be expressed as
[TABLE]
and using (3.42) we get
[TABLE]
Therefore
[TABLE]
where . As and are probability distributions supported on and , we have (see (3.25)). This proves part (C) of the theorem.
We now prove part (D). Here we assume that the Foster-Lyapunov function has the linear form (3.20) and both (3.21) and (3.22) are satisfied. Note that by rescaling the positive vector in (3.20) if necessary, we can assume that
[TABLE]
As from condition (3.21) we obtain
[TABLE]
for each . Transposing (3.39), multiplying both sides by vector and taking absolute values we get
[TABLE]
Using (3.44) we can upper-bound the l.h.s. as
[TABLE]
Since is given by (3.38), with and being probability distributions supported on and respectively, we can lower-bound the r.h.s. of (3.45) as
[TABLE]
The uniform growth condition (3.28), along with the fact that does not depend on , ensures that there exists a positive constant such that
[TABLE]
for each , and hence obtain the lower-bound
[TABLE]
This relation along with (3.45) and (3.46) yield
[TABLE]
which is sufficient to prove (3.29) as .
We now prove the second assertion of part (D), i.e. as . For this we first demonstrate that the square of the linear Foster-Lyapunov will also satisfy the drift condition (3.19). To see this note that for any
[TABLE]
Using (3.21) and (3.22) we obtain
[TABLE]
As is a semi-norm, the quadratic term will dominate the linear term for all outside some compact set and hence the drift condition (3.19) will be satisfied by function for some constants . This drift condition also ensures that (see [19]) there exists a constant such that
[TABLE]
for each . Now using Cauchy-Schwarz inequality we get
[TABLE]
As , part (B) shows that and hence as well. Now (3.29) proves that and this concludes the proof of the theorem.
3.2 The sFSP Algorithm
Theorem 3.1 proves that under certain conditions, the sFSP approximation error, measured in a certain norm, converges to [math] as and the truncated state-space expands to the fully state-space . Moreover for any the magnitude of the approximation error can be judged by computing the convergence factor defined according to (3.30) with the sequence chosen as in Remark 3.3. These results form the basis of our stationary Finite State Projection (sFSP) algorithm, that is presented as Algorithm 1. This algorithm takes as input a -species reaction network , specified as a set of reactions with propensity functions and stoichiometric vectors . It is required that the CTMC describing the reaction kinetics admits a Foster-Lyapunov function satisfying (3.19) and its state-space is irreducible. These conditions can be checked using the results in [15] and [16] as discussed before.
Algorithm 1 starts by picking an increasing sequence of finite state-space truncations as in Theorem 3.1, a sequence as in Remark 3.3, and a designated state . Thereafter for each iteration cycle , the transition rate matrix for the projected CTMC over the truncated state-space is constructed and its stationary distribution is found by solving the linear-algebraic system (1.1) for matrix . Next the outflow rate and the convergence factor are computed. If this convergence factor is below an acceptable threshold level (chosen in step 4 of Algorithm 1), then sFSP terminates after returning as the estimate of the true stationary distribution . Otherwise if , then the algorithm goes into the new iteration cycle with the expanded truncated state-space .
4 sFSP Algorithm: Simple Implementation
In this section we present the simple implementation of sFSP akin to to the classical FSP [5], where the multi-dimensional state-space is explicitly enumerated, and accordingly the transition rate matrix for the projected CTMC is constructed and its stationary distribution vector is computed. The performance of sFSP depends crucially on the choice of finite state-space truncations -s and their enumerating functions -s. We now discuss these choices for our implementation of sFSP.
4.1 State-space enumeration and truncation
The basic ingredient of our state-space enumeration strategy is the Cantor Pairing function (see [25]) which is the bijective map between to defined by
[TABLE]
Under this bijection, the elements in are mapped to by moving along the anti-diagonals, which are the straight lines given by (see Figure 2A). This map is easy to invert and for any , can be computed as and , where
[TABLE]
Henceforth we define as the identity map on . By composition, one can extend the Cantor function to obtain a bijection from to for any positive integer . Such a bijective map can be defined recursively as
[TABLE]
Similarly the inverse of this map can also be defined recursively as
[TABLE]
where .
Consider the situation where the irreducible state-space has the form (3.18) with and . In this case, is just the -dimensional non-negative integer orthant and we enumerate it using the Cantor function . An explicit formula for can be obtained (see [25]) as
[TABLE]
where denotes the binomial coefficient. This formula shows that for any with , the following set
[TABLE]
is non-empty, and we call this set a trapezoidal truncation of with left cut-off point and right cut-off point . For , we plot such a set in Figure 2B and it simply consists of all the states whose component-sum is between and . This may not be exactly true is higher-dimensions () but still one can think of a trapezoidal truncation as the set of states whose component-sum is within certain bounds. Note that in our setting of reaction networks, the component-sum of a state represents the total molecular count of all the species. In many biomolecular reaction networks this total molecular count is within certain tight bounds even though each species can individually have high copy-number variation. This is mainly because the species are often in competition with each other, through mechanisms such as mutual repression or interconversion, which ensures that the total molecular count is tightly regulated. This property makes trapezoidal truncations very appealing for our purpose of estimating stationary distributions. This point is nicely illustrated by the Toggle-Switch example considered in Section 4.3.2.
We now consider the situation where the irreducible state-space has the form (3.18) for some finite non-empty set . Let and we fix an enumeration of this set as . This enables us to define an enumeration over the full state-space by
[TABLE]
where and . One can easily see that this map is a bijection between and , and its inverse is given by
[TABLE]
where is the remainder in the division of by and is the corresponding quotient. For the state-space we define the trapezoidal truncation as
[TABLE]
where and are non-negative integers satisfying as before.
We now come to the definitions of finite state-space truncations -s and their enumerating functions -s. Let and be monotonic sequences of non-negative integers that satisfy for each along with the limits
[TABLE]
For each we define the finite state-space truncation as . Note that monotonicity of the left and right cut-off sequences and along with (4.52) ensures that is an increasing sequence of finite sets that covers the full state-space in the limit , as demanded by the sFSP Algorithm 1. Assuming that the Foster-Lyapunov function has the linear form (3.20), we can choose the sequence (see Remark 3.3) in step 2 of Algorithm 1 as .
In the case where the full state-space is the non-negative integer orthant , the size of the truncated state-space is
[TABLE]
and we enumerate the set using the map given by
[TABLE]
Based on this enumeration the transition rate matrix for the projected CTMC over the truncated state-space (see step 5 in Algorithm 1) can be constructed with Algorithm 2. In the other situation where the irreducible state-space has the form (3.18) for some finite non-empty set with elements, the size of the truncated state-space is
[TABLE]
and we enumerate the set using the map given by
[TABLE]
where is the map defined by (4.49). The transition rate matrix for the projected CTMC over the truncated state-space can be constructed using Algorithm 2 with some minor changes.
4.2 Implementation Details
We now provide some details on our computer implementation of sFSP Algorithms 1 and 2, and discuss the related issues. Note that the size of the truncated state-space can be very large, causing problems in storing the transition rate matrix , and also in solving the linear-algebraic system (1.1) to obtain . Note however that out of entries in matrix , at most entries can be non-zero, where is the number of reactions which is typically much smaller than . Hence is an extremely sparse matrix and this sparsity can be exploited for storing matrix and for finding the vector .
Another issue that commonly arises is that for states with large components, the propensity functions take very high values which causes the matrix to have very large entries. This creates numerical issues while solving the linear-algebraic system (1.1) for computing . A simple way to circumvent this problem is to scale the matrix by its diagonal entries and apply the same scaling to the solution of the linear-algebraic system to recover . In other words, matrix is constructed by modifying Algorithm 2 by setting to in step 7 and by replacing with in steps 12 and 14. Such a scaling is allowed because the state-space is irreducible for the original CTMC and hence any cannot be an absorbing state and so is nonzero. While constructing matrix we must also store the values for . These values help in recovering from the solution of the linear-algebraic system solved in step 6 of Algorithm 1
[TABLE]
Of course is then normalized (step 7 of Algorithm 1) to ensure that its component-sum is and it represents a valid stationary distribution.
In our setup we implement the main sFSP method (Algorithm 1) in Matlab but we delegate the construction of the transition rate matrix to a C++ program that implements Algorithm 2. Once constructed, this matrix is imported into the sFSP Matlab program as a sparse matrix. The linear-algebraic system (1.1) for this matrix is solved by computing the eigenvector corresponding to the smallest-magnitude eigenvalue (i.e. [math]) using the eigs function in Matlab. This function performs an Arnoldi iterative procedure [26] to efficiently compute a subset of eigenvalues and eigenvectors for large sparse matrices. It also allows us to pass a starting vector for the Arnoldi procedure. In our implementation we use the stationary distribution vector obtained in iteration as the starting vector in iteration 888In the first iteration , the starting vector is chosen to correspond to the uniform stationary distribution over the first state-space truncation . For the sFSP implementation considered in this section, we use the scaled version of matrix as described above.
4.3 Computational Examples
In this section we illustrate our simple implementation of sFSP using examples from systems biology. In all the considered examples, sFSP is applicable because with results in [15] and [16] we can verify that the theoretical conditions required by sFSP (see Theorem 3.1) are satisfied. Moreover for all the examples, we fix the acceptable threshold level (see step 4 of Algorithm 1) to be , and we specify the increasing family of trapezoidal state-space truncations via a pair of monotonic cut-off sequences and that satisfy for each along with the limits (4.52). The choice of these sequences can have a big influence on the overall performance of sFSP and especially the number of iterations it needs to terminate. Recall that and can be interpreted as bounds on the component-sum of states in the trapezoidal truncation (see Section 4.1). Therefore we can use crudely estimated values of the mean and standard deviation of the state component-sum at stationary, as a guidance for selecting these cut-off sequences. These crude estimates can be obtained with a few sample trajectories of the original CTMC generated with Gillespie’s SSA [4].
Since the two main steps of sFSP, viz. constructing the rate matrix and solving the linear-algebraic for , are performed on two separate computing platforms (C++ and Matlab), we will report the CPU times999All the computations for this simple implementation of sFSP were performed on an Apple machine with 2.9 GHz Intel Core i5 processor. for both these steps individually for each iteration . The total CPU time needed for an iteration is approximately the sum of these two times, and we will plot it along with the convergence factor , as a function of the iteration counter , to show how they change as the truncated state-space expands in size. For the computation of convergence factors we choose in step 2 of Algorithm 1 (see Section 4.1).
4.3.1 Gene-expression network
Our first example is the gene-expression network given in [27], where molecules of the messenger RNA or mRNA (denoted by ) are created by a gene, and these mRNA molecules catalytically produce molecules of some protein (denoted by ). Molecules of both these species can degrade spontaneously. This two-species network has the following four reactions:
[TABLE]
The propensity functions are given by mass-action kinetics (2.3) and -s denote the associated rate constants. We assume that the values of these rate constants are given by , , and .
For the CTMC model of this network, the state-space is irreducible and so it can be enumerated with the Cantor Pairing function (see Section 4.1). We apply sFSP to this network to obtain an estimate of the stationary probability distribution . The cut-off sequences and that define the trapezoidal truncation for iteration are chosen as
[TABLE]
where and , are crudely estimated values of the mean and standard deviation of the state component-sum at stationarity, and these are obtained with a few SSA-generated trajectories of the CTMC. The designated state we select for sFSP is , which corresponds to [math] mRNA molecules and protein molecules.
The performance of sFSP on the gene-expression network is summarized in Table 1, where for each iteration , the cut-off values ( and ), the truncated state-space size (), the convergence factor and the CPU times for the two main sFSP steps are provided. One can see that sFSP terminated in iterations and overall it required 615 seconds of CPU time. To assess the accuracy of sFSP, we also estimate using CTMC trajectories simulated with SSA in the time-interval . This SSA-based estimation was implemented in C++ and it needed 7246 seconds of CPU time which is much higher than the 615 seconds needed for sFSP.
Note that the size of the truncated state-space is increasing linearly with and hence the size of the rate matrix is increasing quadratically with . So we would expect the CPU time for constructing and solving the linear-algebraic system for , to also increase quadratically with . However this is not the case and the two CPU times only increase linearly (see Table 1). This is because matrix is extremely sparse with only non-zero entries, where is the number of reactions. This sparsity is exploited in our implementation of sFSP for both constructing the matrix and solving the linear-algebraic system.
The linear increase in the required CPU time can be seen from Figure 3A. Here the convergence factor is also plotted in log-scale and the almost linear decay shows that the convergence factor drops exponentially to zero as the truncated state-space expands iteratively. Such an exponential decay is perhaps due to the fact that the joint stationary distribution is unimodal, as indicated by the contour plot in Figure 3B. This unimodality is also visible from the marginal distribution plots in Figure 3C. These sFSP-estimated marginal distribution plots are compared with the SSA-estimated distributions in Figure 3C and they show a close match.
4.3.2 Toggle-Switch network
We now consider the example of the genetic toggle-switch network proposed by Gardner et. al. [28]. This network has two species and that are competing by repressing each other’s production. This repression is modeled through propensities given by nonlinear Hill functions [29]. The network has four simple reactions
[TABLE]
where the propensity functions -s are given by
[TABLE]
Here and denote the copy-numbers of and respectively. For our computations we set , , , , and .
For the CTMC model of this network, the state-space is irreducible, and we apply sFSP with trapezoidal truncations using the cut-off sequences and
[TABLE]
at iteration , where and , are crude SSA-based estimates of the mean and standard deviation of the state component-sum at stationarity. The designated state we select for sFSP is , which corresponds to [math] molecules of and molecules of .
The performance of sFSP on the Toggle-Switch network is summarized in Table 2, where for each iteration , the cut-off values, the truncated state-space size, the convergence factor and the CPU times for the two main sFSP steps are provided. In this example, sFSP terminated in iterations and overall it required 322 seconds of CPU time. In comparison, the SSA-based estimation of , implemented in C++, using CTMC trajectories simulated in the time-interval , needed 42192 seconds of CPU time.
As in the previous example, the required CPU time increases almost linearly with iteration and the convergence factor decreases slowly for the first four iterations and then plummets to nearly [math] in the fifth iteration (see Figure 4A). The contour plot for the joint stationary distribution estimated by sFSP is shown in Figure 4B and it indicates that this distribution is bimodal with each mode corresponding to one of the species being dominant. In Figure 4C we plot the sFSP-estimated marginal stationary distributions for the copy-numbers of the two species and compare them with the SSA-estimated marginal stationary distributions. One can clearly see that unlike sFSP, SSA fails to adequately capture the stationary distribution in the low-probability regions of the state-space even though a large sample of size million is used. These statistical errors and other numerical issues associated with computing very low probabilities, may explain the slight discrepancy in the sFSP and SSA estimated marginal distribution for species (see Figure 4C).
4.3.3 Pap-Switch network
We now consider the Pap epigenetic switch whose finite-time CME was solved in [5] with the FSP method. This stochastic switch is responsible for deciding whether or not E. coli will develop hairlike structures called pili. The Pap-switch network is illustrated in Figure 5A and it consists of a single pap operon that can exist in four states and determined by the binding sites occupied by the leucine-responsive regulatory protein (LRP) molecules. When the operon is in state , it can produce a local regulatory protein called PapI which represses the unbinding of the LRP molecules from the operon binding sites. This protein is allowed to degrade spontaneously at a certain rate. As in [5] we assume that the number of LRP molecules is fixed at . The dynamics of the copy-numbers of the five species and in the Pap-Switch network can be modeled with reactions described in Table 3.
For the CTMC model of this network, the state-space is irreducible, where
[TABLE]
is the finite set which contains the dynamics of the copy-numbers of the four operon states and . The copy-numbers of can take values in the whole set of non-negative integers . The state-space of the form can be enumerated using the function (see (4.49)) with . Similarly the trapezoidal truncations -s can be defined as (4.51). In our application of sFSP for this network we construct these truncations using the cut-off sequences and specified by
[TABLE]
at iteration , where and , are coarse SSA-based approximations of the mean and standard deviation of the copy-numbers. Note that due to the low copy-numbers involved we fix the left cut-off point to be zero for all the iterations. Also the designated state we select for sFSP is , which corresponds to the operon being in state and having [math] molecules.
The performance of sFSP on the Pap-Switch network is summarized in Table 4, where for each iteration , the cut-off values, the truncated state-space size, the convergence factor and the CPU times for the two main sFSP steps are provided. For this network, sFSP took iterations to terminate and overall it required only seconds of CPU time. By contrast, the SSA-based estimation of the stationary distribution, implemented in C++, with CTMC trajectories generated in the time-period , required 134 seconds of CPU time.
In this example, the sizes of the truncated state-spaces are very small and so sFSP executes very quickly, causing the CPU times to vary non-monotonically with iteration while the convergence factor decreases almost exponentially (see Figure 5B). In Figure 5C we plot the sFSP-estimated stationary distributions for copy-numbers at each operon state and . These are compared with the corresponding SSA-estimated stationary distributions and it can be seen from Figure 5C that the match is almost perfect.
4.3.4 Self-activated gene expression
We end this section with a simple but instructive example borrowed from [30]. Consider a gene whose protein output can activate its own expression through a nonlinear feedback loop. A simple reaction network model for this would be
[TABLE]
where the propensity function for the degradation reaction is linear while the propensity function for the production reaction is given by a Hill-type function
[TABLE]
Here denotes the copy-number of protein . For our computations we set , , , and .
As there is only one species, the trapezoidal truncation is simply the set . We choose and at iteration , and apply sFSP on this example with designated state [math]. The results are shown in Figure 6. Note that the stationary distribution is bimodal, with a small peak around and a larger peak around . Most of the stationary probabilities are concentrated in two disjoint regions and around the two peaks. Observe that the end-points and of these two regions are inflection or turning points for the behavior of the convergence factor with increasing iteration counter or expanding truncated state-space . The convergence factor decays exponentially before and after , but in the intermediate region it shows a gradual increase. Further computations reveal that for iterations corresponding to this intermediate region, the outflow rate remains approximately constant, and so the convergence factor increases slowly due to scaling by the cut-off value . This relationship between bimodality of the stationary distribution and non-monotonicity of the convergence factor is very interesting and should be investigated in a greater detail elsewhere.
5 sFSP Algorithm: QTT Implementation
The second implementation is motivated by the recently developed Quantized Tensor-Train (QTT) version of FSP [7], which works with QTT representations of the transition rate matrix and its stationary distribution vector. The use of such representations expands the range of applicability of sFSP and we demonstrate this by applying sFSP on a network which is much larger than the networks considered in Section 4.3.
5.1 The CME in QTT form
A tensor is essentially a multi-dimensional generalization of a two-dimensional matrix or a one-dimensional vector. A -dimensional tensor of size , represents a structured collection of real numbers given by
[TABLE]
Each dimension of this tensor is also called its mode, and denote the mode sizes. The tensor can also be viewed as a real-valued function over the -dimensional hyper-rectangle
[TABLE]
which is a subset of the non-negative integer orthant .
Tensors are particularly well suited to express the CME since the system already has a physical interpretation as tensors, where each species corresponds to one tensor mode and for any mode , its size serves as the strict upper-bound for the allowable copy-numbers for species . As in FSP [5], consider a CME over the truncated state-space (see (2.11) for example). The probability distribution of the random state-vector at time can be represented as a -dimensional tensor of size and the matrix that captures its rate of change can be represented as a -dimensional tensor of size .
The tensor train (TT) representation of a -dimensional tensor with size is given by
[TABLE]
where and for each , is a three-dimensional tensor with size . The tensors to are called the core tensors and are referred to as tensor ranks. The TT-representation can potentially provide a high compression of the tensor, especially if the ranks are low. Most basic matrix-vector operations (like matrix-vector product, dot product, outer product etc.) can be applied directly on the compressed TT format (for details see [31]). The complexity of these basic operations as well as the storage cost can be bound by where and . Any tensor can be decomposed into the TT format by the TT-SVD algorithm [31], which is based on the Singular Value Decomposition (SVD) for matrices. The TT format can be extended to the quantized tensor train (QTT) format which provides another layer of compression by dividing each mode of the tensor into several virtual modes that are then further compressed using tensor trains (see [32] and [33]).
In [7] the authors show how the matrix for the CME (2.11) over the truncated state-space (5.54) can be directly constructed in the QTT format and thereafter used for efficiently solving the FSP and obtaining the transient CME solution . The main observation underlying the QTT construction of is that one can think of this matrix in terms of the spatial shift operator , shifting a probability density tensor by the stoichiometry vector for reaction , and a multiplication operator , multiplying a probability density tensor by the propensity function for reaction , i.e.
[TABLE]
for any . Using these operators along with the identity operator , the matrix can be expressed as
[TABLE]
and this form can be exploited for efficiently constructing the QTT representation of . As explained in [7], for mass action kinetics, the operator can be constructed by taking the outer products of state-vectors in and the appropriate vector of ones , while the operator can be constructed as a matrix of zeros with a shifted diagonal of ones.
5.2 Implementation Details
In our QTT implementation of sFSP, we use a similar expression as (5.55) to construct the QTT representation of the transpose of the transition rate matrix (see (3.16)) for our projected CTMC over the truncated state-space , where all the outgoing transitions are redirected to the designated state of all zeros. Using the QTT representation of , the corresponding linear-algebraic system (1.1) is directly solved in QTT format to yield the stationary probability distribution in QTT format.
For solving the linear-algebraic system, we use the inverse iteration approach (see [34]) which is known to have very good convergence properties and work well with tensor algebra [35]. In this approach a linear system of the form , for a singular matrix , is solved by iteratively solving the linear systems
[TABLE]
starting with some initial guess . The solution is suitably normalized before commencing iteration . Generally this procedure requires very few iterations (like 2 or 3) to converge, and this convergence can be judged by checking that the distance between subsequent solutions is below some threshold level .
In our setup we implement the QTT version of sFSP method (Algorithm 1) in Matlab, using Version 2.2 of the qtt-toolbox developed by I. Oseledets, S. Dolgov, V. Kazeev, O. Lebedeva, and T. Mach [36]. In particular the linear systems that arise in the inverse iteration procedure are solved using the function dmrg_solve3.m from this toolbox. For each sFSP iteration , the initial guess for the inverse iteration procedure is chosen based on the estimate obtained in iteration , as mentioned in Section 4.2. For the computational example we consider next, we found that only two inverse iterations were always sufficient to yield a convergent solution of the linear-algebraic system (1.1) for the threshold level .
5.3 A Computational Example
We now illustrate our QTT implementation on a toy example with features similar to the Repressilator network given by Elowitz and Leibler [37], which has three gene-expression modules (say A, B and C) that interact by mutual inhibition of each other in a cyclic fashion i.e. represses , represses and represses (see Figure 7A). This inhibition is carried out by the corresponding proteins (, and ) and it is achieved by enhancing the rate at which the inhibited gene becomes inactive (OFF) from an active (ON) state. Each protein also activates its own production by increasing the rate at which its gene switches ON from the OFF state. The mRNAs (, and ) associated with the genes are only transcribed when the corresponding gene is in the ON state. Overall this network consists of species and reactions described in Table 5. These species include the indicators for the three genes being in the ON state (, and ), the three mRNAs (, and ) and finally the three proteins (, and ).
For the CTMC model of this network, the state-space is irreducible, where
[TABLE]
is the finite set which contains the dynamics of the copy-numbers of the three genes being in the ON state. The copy-numbers of all the mRNAs and proteins can take values in the whole set of non-negative integers . We apply sFSP on the 3-gene network with the finite truncated state-space for sFSP iteration chosen as (see (5.54)) with
[TABLE]
Here and denote the strict upper-bounds for the copy-numbers of all the mRNAs and proteins respectively. The convergence factor is computed for this example using in step 2 of Algorithm 1. Due to limitations posed by the qtt-toolbox and our computational hardware, we fix the acceptable threshold level (see step 4 of Algorithm 1) to be instead of used previously.
The performance of sFSP on this triple-repressor network is summarized in Table 6, where for each iteration , the upper-bounds ( and ), the truncated state-space size (), the convergence factor and the CPU times are provided. One can see that sFSP terminated in iterations and overall it required around 168 minutes of CPU time101010All the computations for this QTT implementation of sFSP were performed on a Lenovo T440 machine with 1.6 GHz Intel i5-4200U processor with 8GB of RAM. Note that the copy-numbers of all the species are relatively small in this example. However due to the large number of species, the size of the final truncated state-space is several times larger than the truncated state-spaces encountered in the examples considered before, for the simple implementation of sFSP. To assess the accuracy of sFSP, we also estimate using CTMC trajectories simulated with SSA in the time-interval . As in the previous examples, we plot the CPU times and the convergence factors at all the sFSP iterations in Figure 7B, the contour plots for the various joint stationary distributions estimated by sFSP in Figure 7C, and the estimated marginal stationary distributions for the all the species in Figure 8. These marginal stationary distributions are also compared with the corresponding SSA-estimated marginal stationary distributions and one can see that the match is quite good.
The SSA-based estimation with trajectories needed around 117 minutes of CPU time, based on a C++ implementation, which is slightly faster than sFSP (168 minutes). However we must note that even though this SSA-based estimation captures the marginal distributions very well (see Figure 8), it is unable to capture the full stationary distribution because the state-space is high-dimensional and the size of the final truncated state-space for sFSP suggests that the support of the true stationary distribution is much larger (130 million) than the number of SSA samples ( million) being used for the estimation. To illustrate this point, we compute the distance between the sFSP estimated stationary distribution and the stationary distribution estimated with and SSA samples. The results are shown in Table 7 along with the associated CPU times for generating the SSA samples. Notice that as the number of SSA samples increases, the distance decreases sharply, which strongly suggests that sFSP is an accurate approximation of the true stationary distribution . However this distance is significant when is estimated with million SSA samples, which implies that is quite inaccurate. If we use SSA samples to estimate then the accuracy improves but the total CPU time required is approximately 18 hours, that is times larger than the time needed for sFSP.
6 Conclusion
In this paper we presented a new method for estimating the stationary probability distributions of continuous-time Markov chain (CTMC) models of reaction networks based on suitable truncations of the CME. The method which we call the stationary Finite State Projection (sFSP) algorithm is similar to the Finite State Projection (FSP) algorithm[5], with the crucial difference being that instead of introducing an absorbing state, we redirect all the outgoing transitions from the truncated state-space to a designated state within the truncated state-space (see Figure 1C). This simple modification creates a projected CTMC over the truncated state-space, whose stationary distribution can be obtained by solving a finite linear-algebraic system. We provided theoretical arguments to establish that this stationary distribution estimated from the projected CTMC is unique, converges to the true stationary distribution as the truncated state-space expands to the full state-space and for any truncated state-space the error between the estimated stationary distribution and the true stationary distribution can be assessed by computing the overall rate of outgoing transitions at the estimated stationary distribution (see Theorem 3.1). These results form the basis of our sFSP method. We illustrated the efficiency and accuracy of this method using several examples. These examples indicated that sFSP can easily outperform the stochastic simulation-based approach for estimating the stationary distribution, both in terms of computational speed as well as accuracy. This is not unexpected, as stochastic simulations are expensive to perform over large time-intervals, and the stationary distribution they estimate suffers from statistical errors that can be significant in regions of the state-space where the probabilities are extremely low. These issues do not arise in sFSP and this makes it an appealing method for estimating stationary distributions of CTMCs representing reaction networks.
There are several ways to improve and extend sFSP. Like FSP, this method is iterative in nature and the number of iterations it requires to converge depends on the specifics of the implementation of sFSP. In this paper we discussed two such implementations. In the first implementation the state-space was explicitly enumerated with Cantor pairing functions and then truncated in trapezoidal shapes (see Section 4), while the second implementation was based on the recently developed quantized tensor train (QTT) version of CME where each state-space truncation is a hyper-rectangle (see Section 5). Both these implementations will benefit from better state-space truncation schemes that adapt to the problem at hand. One way to do this would be to use Lyapunov function theory or use stationary moment bounds to construct optimal state-space truncations (see [18], [38] and [16]). Observe that unlike FSP which solves a linear system of ODEs, sFSP only requires solving a linear-algebraic system which is computationally much easier. Hence sFSP can handle a wider range of networks in comparison to FSP. Indeed with the QTT implementation, finding stationary distributions for problems with state truncations exceeding 100 million states was shown to be feasible. A possible approach for enhancing the feasibility of sFSP to even larger problems would be to integrate it with sparse grids and aggregation methods [8]. Note that at the core of sFSP, is the problem of finding vectors in the one-dimensional null-spaces of large, but extremely sparse matrices (see Section 4.2). This sparsity and the structure of the matrices that arise, make this problem quite amenable to parallel-computing approaches [39].
Acknowledgments
The authors would like to thank Prof. Sean Meyn (University of Florida) and Prof. Brian Munsky (Colorado State University) for their helpful comments and suggestions.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] Harley H. Mc Adams and Adam Arkin. Stochastic mechanisms in gene expression. Proc. Natl. Acad. Sci., Biochemistry , 94:814–819, 1997.
- 2[2] Michael B. Elowitz, Arnold J. Levine, Eric D. Siggia, and Peter S. Swain. Stochastic gene expression in a single cell. Science , 297(5584):1183–1186, 2002.
- 3[3] D.A. Anderson and T.G. Kurtz. Continuous time Markov chain models for chemical reaction networks. In H. Koeppl, G. Setti, M. di Bernardo, and D. Densmore, editors, Design and Analysis of Biomolecular Circuits . Springer-Verlag, 2011.
- 4[4] Daniel T. Gillespie. Exact stochastic simulation of coupled chemical reactions. The Journal of Physical Chemistry , 81(25):2340–2361, 1977.
- 5[5] B. Munsky and M. Khammash. The finite state projection algorithm for the solution of the chemical master equation. Journal of Chemical Physics , 124(4), 2006.
- 6[6] Shev Mac Namara, Kevin Burrage, and Roger B Sidje. Multiscale modeling of chemical kinetics via the master equation. Multiscale Modeling & Simulation , 6(4):1146–1168, 2008.
- 7[7] Vladimir Kazeev, Mustafa Khammash, Michael Nip, and Christoph Schwab. Direct solution of the chemical master equation using quantized tensor trains. P Lo S Comput Biol , 10(3):e 1003359, 03 2014.
- 8[8] Markus Hegland, Andreas Hellander, and Per Lötstedt. Sparse grids and hybrid methods for the chemical master equation. BIT Numerical Mathematics , 48(2):265, 2008.
