Parallel replica dynamics method for bistable stochastic reaction networks: simulation and sensitivity analysis
Ting Wang, Petr Plech\'a\v{c}

TL;DR
This paper introduces a parallel replica method to efficiently sample the stationary distribution of bistable stochastic reaction networks, enabling better understanding of their long-term behavior and sensitivity analysis.
Contribution
The paper presents a novel application of the parallel replica method to stochastic reaction networks, improving sampling efficiency and integrating sensitivity analysis.
Findings
Efficient sampling of rare transitions in bistable networks.
Accurate sensitivity analysis using combined ParRep and path space bounds.
Validated method on Schl"{o}gl model and genetic switches network.
Abstract
Stochastic reaction networks that exhibit bistability are common in many fields such as systems biology and materials science. Sampling of the stationary distribution is crucial for understanding and characterizing the long term dynamics of bistable stochastic dynamical systems. However, this is normally hindered by the insufficient sampling of the rare transitions between the two metastable regions. In this paper, we apply the parallel replica (ParRep) method for continuous time Markov chain to accelerate the stationary distribution sampling of bistable stochastic reaction networks. The proposed method uses parallel computing to accelerate the sampling of rare transitions and it is very easy to implement. We combine ParRep with the path space information bounds for parametric sensitivity analysis. We demonstrate the efficiency and accuracy of the method by studying the Schl\"{o}gl…
| Reaction | Propensity Function | Stoich. Vec. |
|---|---|---|
| Matrix Element | Estimated pFIM | Half width C.I. |
|---|---|---|
| 8.75E+01 | 3.02E-01 | |
| 1.67E+03 | 5.66E+00 | |
| 2.00E+02 | 2.59E-06 | |
| 2.46E+01 | 7.88E-02 |
| Parameter | ||||
|---|---|---|---|---|
| Bounds | 7.16E+03 | 3.09E+04 | 1.08E+04 | 3.80+E03 |
| CME Approx. | 4.07E+02 | 9.10E+02 | 6.30E+02 | -2.65+E02 |
| Reaction | Propensity Function | Stoich. Vec. |
|---|---|---|
| Matrix Element | Estimated pFIM | Half width C.I. |
|---|---|---|
| 3.34E-03 | 2.35E-05 | |
| 1.69E+00 | 1.19E-02 | |
| 3.57E-01 | 2.52E-03 | |
| 4.34E-01 | 2.83E-03 | |
| 8.32E-04 | 5.88E-06 | |
| 8.33E-03 | 6.12E-05 | |
| 8.44E-04 | 5.23E-06 | |
| 2.22E-05 | 1.56E-07 |
| active DNA | inactive DNA | mRNA | Protein |
|---|---|---|---|
| 1.64E+02 | 1.64E+02 | 7.47E+02 | 9.45E+08 |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Parallel replica dynamics method for bistable stochastic reaction networks: simulation and sensitivity analysis
Ting Wang
Petr Plecháč
Department of Mathematical Sciences, University of Delaware, Delaware 19716 USA
Abstract
Stochastic reaction networks that exhibit bi-stable behavior are common in many fields such as systems biology and materials science. Sampling of the stationary distribution is crucial for understanding and characterizing the long term dynamics of bistable stochastic dynamical systems. However, this is normally hindered by the insufficient sampling of the rare transitions between the two metastable regions. In this paper, we apply the parallel replica (ParRep) method for continuous time Markov chain ParRep-CTMC to accelerate the stationary distribution sampling of bistable stochastic reaction networks. The proposed method uses parallel computing to accelerate the sampling of rare transitions and it is very easy to implement. We combine ParRep with the path space information bounds dupuis2016path for parametric sensitivity analysis. We demonstrate the efficiency and accuracy of the method by studying the Schlögl model and the genetic switch network.
I Introduction
Stochastic reaction networks have become increasingly important as a tool for modeling complex biological and chemical systems with random noises mcadams1999sa . Simulation of real-world reaction networks using the stochastic simulation algorithm (SSA) gillespie-SSA can be computationally intractable due to the multiscale feature of the systems. For instance, reaction networks in biological cells often involve vastly different numbers of molecules for different species and rate constants for different reaction channelskang2013separation . Therefore, the system is metastable in the sense that the SSA rarely samples the reactions involving small rate constants or low population species. Our paper addresses with another type of metastable issue associated with reaction networks. We consider metastablity that is caused by extremely rare transitions between two separate regions of the state space, i.e., the bistable reaction networks. It has been discovered recently that many biological and physical systems exhibit bistability and hence it is of great interests to understand the bistable phenomenon. mehta2008exponential ; gardner2000construction ; umulis2006robust ; angeli2004detection
We study two important aspects regarding bistable reaction networks: accelerated stationary distribution sampling and parametric uncertainty quantification, dupuis2016path ; pantazis2013relative ; arampatzis2015accelerated using the parallel replica dynamics (ParRep) method. voter1998parallel ; ParRep-SDE ; ParRep-Chain ; ParRep-CTMC We know that SSA based sampling for bistable reaction networks can be extremely expensive because of the rare sampling of transitions between two metastable regions. As a remedy for this issue, the ParRep uses multiple parallel replicas to explore the transition path between the two metastable regions with a controllable error. The method was originally designed for sampling rare events in molecular dynamics simulation such as Langevin dynamics. voter1998parallel The mathematical framework of ParRep was recently developed for the discrete time Markov chains (DTMC).ParRep-SDE ; binder2015generalized ; ParRep-Chain ; aristoff2015parallel In this paper we apply the version of ParRep algorithm that we developed for continuous time Markov chains (CTMC) ParRep-CTMC to accelerate the simulation of bistable reaction networks. Furthermore, the algorithm allows us to efficiently sample the stationary distribution starting from the transient regime. We also investigate the parametric sensitivity problem of bistable reaction networks. Basically, we study the change of bistable system outputs to perturbations in system parameters. This enables us to quantify the parametric uncertainty and system robustness. We point out that the proposed version of ParRep can be easily combined with the path space information bounds dupuis2016path to provide useful information and reductions for parametric sensitivity analysis in high dimensions.
I.1 Stochastic reaction network model
We consider a well-mixed chemical system with species interacting through reaction channels with system size . Under the well-mixed assumption, the molecular population is modeled as an dimensional CTMC . The numbers of molecule of the th species consumed and produced in the th reaction are denoted by and , respectively. We call the net change caused by the th reaction the stoichiometric vector, which is independent of the system size . Each reaction channel is associated with a propensity function such that given , the probability of the th reaction occurs at the infinitesimal time interval is , where is the vector of rate constants in . In this paper, we will suppress when we write the propensity functions unless we study the parametric sensitivity with respect to . From the propensity functions, we can construct the transition rate matrix (or the infinitesimal generator) of the Markov chain such that
[TABLE]
Moreover, it is well known that the time evolution of is characterized by the random time change representationethier2009markov
[TABLE]
where are independent unit rate Poisson processes.
For a fixed system of the size , the probability distribution of the population process is completely governed by the chemical master equation (CME)
[TABLE]
where . In principle, the CME enables the computation of the distribution of for any . However, the CME is normally an infinite dimensional system which cannot be solved explicitly in general. Therefore, Monte Carlo methods such as Gillespie’s SSA are commonly used to obtain the numerical solution to the CME.
We denote by the corresponding concentration process for a system with size . When is large, the randomness of the reaction network can be neglected and can be approximated by the solution of the reaction rate equation (RRE) kurtz1970solutions
[TABLE]
in the sense that
[TABLE]
almost surely for any , where we assume for all . Throughout this paper, we assume such form for propensity functions and hence the random time change representation for the concentration process is
[TABLE]
We focus on reaction networks that are modeled by an ergodic CTMC such that the stationary distribution exists and the ergodic limit
[TABLE]
holds for suitable observables . Here the stationary distribution depends on since the process depends on . The gradient of with respect to the parameter , i.e., , serves as an indicator for the system’s parametric uncertainty or robustness. We call the estimation of
[TABLE]
the stationary sensitivity analysis problem, where the superscript “tr” means transpose.
I.2 Reaction networks with bistability
In this paper, we are mainly interested in accelerating simulation and sensitivity analysis for bistable reaction networks, i.e., reaction networks whose RRE has a pair of asymptotically stable fixed points and separated by a saddle point . We denote the neighborhood of by and the neighborhood of by . If we neglect the randomness of the network, any initial point that is placed in (resp. ) approaches to (resp. ) eventually. However, due to the random noise (since is finite), the system is subject to rare, large fluctuations which make the concentration process to be far away from one stable fixed point and enter into the neighborhood of the other stable point. The theoretical tool to study this type of large fluctuations is the large deviation principle (LDP) LDP-Dembo-Zeitouni ; LDP-Shwartz-Weiss ; LDP-Dupuis-Ellis . The key ingredient in the LDP of is the rate function (or action) which characterizes the exponentially small probability for remaining in a small neighborhood of a path , i.e.,
[TABLE]
for all small when is large. By minimizing the rate function over the path space one can find the so called most probable path dykman1994large . In a bistable reaction network, sojourns in (resp. ) for long time until there is an exponentially small probability for it to leave (resp. ) along the most probable path. In this sense, we call and metastable sets for since the sojourn times in both sets are exponentially long. The metastability issue normally leads to insufficient sampling of transition events between and and consequently makes it computationally prohibitive to sample the stationary distribution for . In this work we aim to speed up the sampling of by accelerating the exit from metastable sets using parallel computing.
II methodology
II.1 Parallel replica dynamics
The idea of ParRep was first introduced for simulating rare events voter1998parallel and was recently formalized in several papersParRep-SDE ; ParRep-Chain ; ParRep-CTMC . Our goal in this section is to introduce the ParRep method ParRep-CTMC to accelerate the simulation of bistable stochastic reaction networks and estimate the stationary distribution. Since we are considering fixed volume in this section, we will suppress the superscript to simplify the notations.
The theoretical justification for ParRep relies on the notion of the quasi-stationary distribution (QSD). Given a set and a DTMC , a distribution is called the quasi-stationary distribution of in if
[TABLE]
for all and any measurable set , where is the first exit time of from . The definition roughly says that the QSD is a distribution supported on such that if the initial distribution is , then the DTMC remains distributed with before it exits . The existence and uniqueness of the QSD in this setting can be shown rigorously ParRep-CTMC . The consequence of assuming that starts at the QSD in is that the first exit time follows a geometric distribution with some parameter , i.e., for all . Moreover, the first exit time and the exit state are independentQSD .
Now suppose we have independent and identically distributed replicas of , each with initial distribution QSD . Denote the first exit time of the th replica by and define the smallest first exit time among all replicas by
[TABLE]
Note that there could be more than one replicas which realize the (exit after the same number of steps), we denote by the smallest index among the exited replicas, i.e.,
[TABLE]
Assuming each of the replicas of is initially distributed with the QSD , the following two results are crucial for the design of the ParRep algorithmParRep-Chain .
is independent of ; 2. 2.
has the same distribution as .
The first result states that the first exit state from over replicas is independent with the total sojourn time over replicas. Furthermore, the second result guarantees that joint distribution of the first exit time and the first exit state is independent of the number of replicas. These facts suggest that we can use multiple replicas to explore a metastable region in order to accelerate the sampling of exit events but without changing the exit distribution. That is, we can achieve acceleration by using parallel computing. However, the gain of efficiency in this procedure is under the assumption that all replicas start with the QSD of , which is not the case in general. In order to sample the QSD for launching the parallel step, some preparation steps are needed to make the process to be well into the quasi-stationary state. Therefore, a complete cycle of ParRep can be roughly divided into three steps,
- S1
Decorrelation: simulate until the QSD of the current metastable set is sampled. Proceed to the dephasing step;
- S2
Dephasing: prepare a sequence of iid initial state from . Proceed to the parallel step;
- S3
Parallel: launch replicas of at to explore the exit path from . Return to the decorrelation step.
We can adapt the above ParRep procedure for DTMC to the simulation of CTMC through simulating its embedded chain. More significantly, the algorithm can be modified to effectively sample the stationary distribution of a CTMC without the detailed balance assumption. We present the ParRep algorithm for CTMC in Algorithm 1. The setup of notations in the ParRep algorithm is as follows.
- : ParRep process we simulate throughout the ParRep algorithm;
- : time clock throughout the ParRep algorithm;
- : accumulated contribution to the time integral throughout the ParRep algorithm;
- : count of transitions in each decorrelation step;
- : decorrelation threshold;
- : count of transitions in each dephasing step;
- : dephasing threshold;
- : holding time for the next reaction;
- : index of the next reaction;
Before we start the ParRep algorithm, we choose fixed decorrelation threshold and dephasing threshold and initialize and .
The procedure of the decorrelation step can be summarized as follows. If is not a metastable set, then the process would leave rapidly and hence there is no need to launch the following dephasing and parallel steps. However, if is metastable then the process would remain in for at least transitions and the algorithm proceed to the dephasing step. Since we assume is large enough for the process to reach the QSD of , the state we obtain after transitions is asymptotically distributed according to the QSD. The dynamics in the decorrelation step is exact and hence there is no loss of accuracy and no acceleration either during this step. In the dephasing step, we apply the Fleming-Viot particle technique binder2015generalized to sample a sequence of iid initial states that can be used in the subsequent parallel step. Similar to the decorrelation step, we specify the dephasing threshold and let all replicas to evolve for transitions (jumps). During this procedure, if a replica leaves then we force it to restart from the current state of another replica (chosen uniformly). Similar to , is large enough so that we sample a sequence of QSD distributed states . Note that the dephasing step does not contribute anything to the , and , its only purpose is to prepare the initial states for the subsequent parallel step.
The acceleration of ParRep comes from the parallel step. We launch parallel replicas from to explore the exit event from , that is, sample , and the first exit state . Since has the same distribution as , sampling of exit events with replicas (i.e., sample and ) in the parallel step is approximately times faster than that with serial simulation (i.e., sample ). Moreover, all the generated data from each replica in the parallel step are collected in order to sample the stationary distribution . This is through the update of the clock time and the time integral . Note that sampling by reusing these generated data from ParRep is statistically correct (asymptotically when and are large) comparing to that from the serial simulation. In fact, we have shown that the averaged contribution to or over each ParRep cycle is independent with the replica number , provided that are independent and distributed according to the QSD ParRep-CTMC ; aristoff2015parallel .
II.2 Accuracy and efficiency
The accuracy of ParRep method relies on the choice of the decorrelation step and the dephasing step since these parameters determine how “good” we sample the QSD before the parallel step. In practice, we would never have exact sampling of the QSD at each ParRep cycle and hence there is an error associated with the inexact sampling of QSD. However, for large and we can expect that the error is sufficiently small. In fact, this can be justified by the following resultParRep-CTMC . For fixed , we define the distribution as
[TABLE]
for any measurable set , i.e., the distribution of conditioned on no exit event occurred after transition steps. If we assume that the dephasing step is exact (i.e., are independent and distributed as the QSD), then the averaged error for sampling over each ParRep cycle can be bounded by a constant times the total variation . Furthermore, the total variation converges geometrically fast in terms of . This justifies that the dynamics of transition from one metastable set to another metastable set (i.e., one ParRep cycle) is asymptotically correct. The analysis of the global error from all ParRep cycles is hard to analyze. However, our numerical experiments in Sec. IV suggest that ParRep is a rather accurate algorithm for long time simulation.
We briefly discuss the efficiency of ParRep for CTMC. In this paper, we define the speedup as the ratio between the total computational time of serial simulation and that of ParRep simulation. In the idealized scenario, the speedup factor of ParRep could be up to the number of replica used in the simulation as suggested by the properties. However, in practice the preparation for a sequence of QSD initial states offsets this linear acceleration. Heuristically, the efficiency of ParRep relies on the metastability of the set. If the set is strongly metastable, then the time spent in the decorrelation and dephasing steps is negligible comparing to the acceleration achieved in the parallel step. However, if the set is not truly metastable then the parallel step would not be activated and hence the ParRep is equivalent to SSA. In fact, this argument can be formalized and it turns out that the efficiency of ParRep is determined by the ratio , where and (with ) are the two largest eigenvalues of the transition rate matrix (see (1) for definition) restricted to the metastable set . We do not pursue this aspect rigorously in this paper. Interested reader could refer to the related literature binder2015generalized .
III Path space information bound
In this section, we combine the ParRep method with the path-space information bounds dupuis2016path to accelerate the parametric sensitivity analysis of stochastic reaction networks. The bounds are derived using several concepts in information theory. For the readability of the paper, we briefly review these concepts and their connections in Appendix A.
Recall that we define the sensitivity analysis problem at the end of Section I.1. There exist several types of sensitivity analysis methods such as the finite difference rathinam2010efficient ; anderson2012efficient , likelihood ration plyasunov2007efficient and infinitesimal perturbation analysis or pathwise derivative method sheppard2012pathwise . We refer them as the direct methods since they aim to estimate the sensitivity directly. However, direct estimation of the sensitivity can be extremely expensive due to their large variances wang2016efficiency and complexity when applied to large reactions networks. Alternatively, we aim to compute a gradient-free upper bound of the sensitivity. The computed sensitivity bounds can be used for screening out those insensitive parameters (with small bounds) and then direct methods can be applied for the remaining of parameters arampatzis2015accelerated .
In general, given a probability distribution which depends on a vector of parameters , we define the sensitivity index of an observable (along the direction ) as
[TABLE]
Note that in the case that and (the -th basis vector), the sensitivity index is simply the -th component of the gradient . When we are interested in the sensitivity analysis of the stochastic process with stationary distribution , it is often convenient to interpret the distribution as the path space distribution , i.e., the probability distribution of paths of on the time interval . It can be shown that in the transient regime (i.e., the initial distribution of is not ) the sensitivity index can be bounded by
[TABLE]
where is the path space Fisher information matrix (FIM) of the relative entropy (see Appendix B for a formal derivation). In the stationary regime, a similar sensitivity bound can be derived. That is, the stationary sensitivity index can be bounded by
[TABLE]
where is the integrated auto-correlation function (IAF) and is the path space FIM of the relative entropy rate . In fact, can be roughly interpreted as . See Appendix B for precise definitions and a formal derivation of the bounds (8) and (9).
We focus on bounding the stationary sensitivity in the context of stochastic reaction networks, i.e., is a continuous time jump Markov process. To make use of the bounds (9), we need reliable estimators for the IAF and the path space FIM . For the IAF, we have shown in the Appendix B that
[TABLE]
Hence, when is large, an approximate estimator for the IAF is
[TABLE]
where is the sample size, is the -th sample and is the sample average. Note that (9) assumes the dynamics starts at the stationary regime, hence a burn-in period is necessary for the dynamics to relax to the stationary state before we start sampling the IAF. Now for the path space FIM, it can be written as the stationary expectation of a special observable in terms of the propensity functions (see Appendix A), i.e.,
[TABLE]
Since the expectation is taken under the stationary distribution, the path space FIM can be approximated as the ergodic average of the observable. That is,
[TABLE]
Hence, an estimator for the path space FIM is simply
[TABLE]
where is the -th realization of the ergodic average. Note that the FIM is of great interests by itself since it reflects the identifiability of parameters by Cramér-Rao’s inequality. We will use the path space information bounds to estimate the stationary sensitivity bounds for numerical experiments in the next section.
IV Numerical examples
In this section, we consider two bistable examples arising in chemistry and systems biology. We demonstrate that the ParRep algorithm can efficiently sample rare transitions between two stable equilibrium points and outperforms the standard SSA by a significant speedup factor.
IV.1 Bistable Schlögl model
IV.1.1 Model
The Schlögl model is one of the simplest example of stochastic reaction networks that exhibit bistability. It is an auto-catalytic network involving three species whose population can change according to the reaction network in Table 1. Following our notational convention, we denote by the concentration of the species and the population of . The concentration of and (denoted by and , respectively) are fixed due to an exchange of chemicals between two material baths vellela2009stochastic and hence and are considered as parameters of the network. Therefore, it is equivalent to the Schlöglmodel as a one species network
[TABLE]
with and absorbed in the rate constants and . In this paper, we follow the chemical convention to write the reactions of Schlögl network as in Table 1.
In the large volume limit, the concentration process has a deterministic limit satisfying the RRE (4)
[TABLE]
We choose , , , , and cao2013adaptively , in which case the RRE has two stable equilibrium points and separated by an unstable equilibrium point . Therefore, Schlögl model exhibits two time scales: the fast time scale corresponds to the relaxation to one of the stable equilibrium points and the slow time scale corresponds to the rare transitions between the two stable equilibrium points. The two-time scale feature is illustrated in Figure 1, where the standard SSA is performed with .
Due to the bistable nature, long time simulation is needed to sample enough transition events so that the system relaxes to stationary distribution. We apply the ParRep algorithm to accelerate the sampling of very long trajectories in order to estimate the stationary distribution . We decompose the state space into two metastable sets separated by the unstable equilibrium state (we multiply the concentration by so that all the comparisons are in terms of the population instead of the concentration). That is,
[TABLE]
where is the state space of . Note that this decomposition will be optimal for ParRep in terms of efficiency since both and will be strongly metastable. This can be seen by contradiction. In fact, if the decomposition is defined in terms of a point which is left to (), then every time exits from (with first exit state in the interval ) will be quickly pulled back to the left stable point with a dominating probability, by the large deviation principle. Hence, the subinterval in is not metastable and the ParRep will be inefficient since the parallel step is not activated when the process is in this interval. Therefore, the optimal choices for separatrix is the point which guarantees that both of the decomposed sets are truly metastable.
IV.1.2 Results and discussion
Figure 2 shows the estimates of the stationary average of (ergodic average at ) with SSA (blue dashed line) and with the ParRep algorithm (red dot with confidence interval) for different choice of decorrelation and dephasing steps. The number of replicas for ParRep is . We also plot the numerical approximation of CME as a benchmark (green solid line) for accuracy in Figure 2. It can be seen that the ParRep simulation approximates the stationary average very well (relative error with respect to the CME solution is for ) when the decorrelation and dephasing steps are large, this is consistent with our expectation that the QSD of each metastable set is well approximated for large and . All simulation results are obtained based on sample trajectories. The CPU time of standard SSA simulation is about hours for samples. We demonstrate the corresponding speedup factor with to (smaller values are ignored as the corresponding estimates are not accurate enough). We can see that with replicas, our ParRep outperforms the standard SSA by a significant speedup factor.
We also study the efficiency of ParRep in terms of the number of replicas. In Figure 3 we show the estimation of the stationary average of and the corresponding speedup factor. The decorreation and dephasing steps are fixed at . We observe that the speedup factor increases from to when the number of replicas changes from to . However, the accuracy of ParRep is independent of the number of replicas. In Figure 4 we demonstrate the application of ParRep to estimate the probability distribution of with . The estimated probability distribution (blue bar) is compared with the probability distribution obtained from CME approximation. The plot suggests that ParRep is a rather accurate method when suitable and are chosen.
Finally, we apply the path space information bound (9) to obtain a bound for the sensitivity index . Here we only consider the stationary sensitivity of the observable with respect to each parameter , . Note that the stochastic reaction networks we simulate start at the transient regime, i.e., the initial distribution of is not necessarily . However, the path space information bounds (9) assume that starts in the stationary regime. Therefore, a burn-in period is needed for the process to be well into the stationary regime before we can start sampling the IAF and the path space FIM . We choose the terminal time and use the first half as the burn-in period to prepare the stationary distribution and the second half to sample the IAF and path space FIM. The computed path space FIM and the confidence intervals are shown in Table 2. Note that the pFIM is not only useful for obtaining the final sensitivity bounds, but also implies the identifiability of parameters by the Cramér-Rao bound. The computed IAF is +. The resulting sensitivity bounds are shown in Table 3. To see whether the obtained sensitivity bounds are tight enough, we compare them with the approximated sensitivities. The approximation is obtained by differentiating the CME (3) at steady state and truncating the state space to . The resulting equation is a linear system that can be solved numerically. Comparing the sensitivity bounds with the approximated sensitivities, we observe that the bounds are not tight enough in this example. In fact, it has been observed in several examples that the path space information bounds are not always tight when applied to multi-scale problems. Nevertheless, the bounds are quite useful for screening insensitive parameters in large scale stochastic dynamical systems. We will demonstrate this application of the bounds in the next example.
IV.2 Genetic switch with positive feedback
IV.2.1 Model
Another example we study in this paper is the genetic switch network which is the fundamental mechanism for cells to shift between alternate gene-expression states. See Figure 5 for the diagram of the network. In the genetic switch network, there is an on-off switch for DNA to be in the active or inactive state. Hence the total population of active DNA and inactive DNA is . The transition rates (inactive to active) and (active to inactive) between these two states depend on the number of proteins through a positive feedback. Following Assaf, Roberts and Luthey-Schulten (2012)assaf2011determining , we explicitly take the mRNA noise into account since it has been shown that the presence of mRNA has a significant impact on the dynamics of the network mehta2008exponential . We list the propensity function and stochiometric vector of each reaction channel in Table 4.
We fix large volume throughout this example. The two-dimensional process denotes the number of mRNA and protein at time . We denote the number of active DNA by the process and hence the number of inactive DNA is . The transition rates are taken to be of the Hill-type functions for the inactive to active transition and for the reverse transition, where
[TABLE]
Throughout this example, we follow Assaf et al. assaf2011determining to set the parameters as follows: , , , , and .
We point out that the genetic switch model does not fall into the standard framework of stochastic reaction networks we describe in Sec I.1. In fact, the random time change representation of can be written as
[TABLE]
See Table 4 for the four reactions involved in this representation. Note that the propensity functions are functions of both the switching variable and the population process of mRNA and protein . Since does not depend on volume , the process does not satisfy the large volume limit (5). However, the mean numbers of mRNA and protein still satisfy the following rescaled RRE (i.e., the ODE governing ) lv2014constructing ; li2015large
[TABLE]
where the factor gives the probability that the DNA is in an active state. With our choice of parameters, (12) has two stable equilibrium points and separated by a saddle point . Therefore, the genetic switching network is bistable. When is finite, there are noise induced rare transitions between and .
To find the optimal decomposition of the state space into two metastable sets, we need to analyze the phase space of (12) to determine the separatrix of the two metastable regions induced by and . Unlike the Schlögl model , it is nontrivial to find the separatrix in this example since it is in . Instead, the way we detect rare transitions is ad-hoc. We simply choose the horizontal line that passes through the saddle point as the boundary to define the two metastable sets. We provide a heuristic explanation for the choice. From the large deviation perspective, there exists a most probable transition path from to li2015large ; lv2014constructing ; dykman1994large such that if a transition occurs, then with a dominating probability, the transition would move along this path. We know that the true separatrix passes through the saddle point and that the most probable transition path crosses the true separatrix along a path that is “sufficiently close” to the saddle point (lv2014constructing ). Since the points , , and are the only possible states that are sufficiently close to the saddle point , the most probable transition path can only cross the separatrix (from to ) by moving from to or to depending on where the true separatrix lies. Accordingly, this suggests that we can use either (if the most probable path moves from to ) or (if the most probable path moves from to ) as the boundary to decompose the state space into two metastable sets. It is readily seen that we should choose the horizontal one since the process is much more metastable in the direction than that in the direction. Therefore, we decompose the state space into two sets
[TABLE]
and
[TABLE]
Our simulation results confirm that this is a good choice of decomposition. Note that though the choice of decomposition may not be optimal since we do not know the true separatrix a priori, it only affects the efficiency but not the accuracy of ParRep as we discuss in Section II.2. A rigorous approach to defining the optimal decomposition into metastable sets is the subject of ongoing work.
IV.2.2 Results and discussion
Throughout the simulation of the genetic switch network, all simulation results are obtained based on sample trajectories. The initial population is molecule for inactive DNA and [math] molecule for all the remaining species. The terminal time is taken to be which is sufficiently large for sampling the ergodic average. We first study the accuracy of ParRep in terms of the decorrelation step and dephasing step with replicas. Figure 6 demonstrates the simulation results regarding the stationary means of mRNA and protein when increase. Simulation results with SSA (blue dashed line) are used for accuracy comparison. The corresponding speedup factor is shown in the same plot (lower panel). The plot suggests that with or above, the ParRep is as accurate as the standard SSA. Figure 7 shows the speedup of ParRep when we vary the number of replicas with . We do not gain as much speedup as in the Schlögl model since the genetic switch network requires much larger and to converge to the QSD at each metastable set (as we observed in Figure 6). Nevertheless, we can see that with replicas, the speedup factor of ParRep is about when compared to SSA.
We also study the parametric sensitivity of the genetic switch model by quantifying the stationary path space sensitivity bounds (9). The observables in considerations are the number of active DNA (), inactive DNA (), mRNA () and protein (). The parameters are arranged in the order . We aim to apply ParRep to estimate the stationary sensitivity bounds of each observable with respect to each of the parameters. In order to obtain the bounds, we simulate the process up to final time . The time interval corresponds to the transient regime and corresponds to the stationary regime. The estimated (diagonal) path space FIM along with the confidence interval are shown in Table 5. The estimated IAF for each observable is shown in Table 6. Finally, we combine the path space FIM and IAF to obtain the stationary sensitivity bounds. To illustrate the observation that most of the sensitivity indices are small, we visualize the sensitivity bounds in Figure 8. We see that the the active DNA, inactive DNA and mRNA are insensitive to parametric perturbation, whereas the protein tends to be sensitive. If we are interested in quantifying the parametric uncertainty for the genetic switch model, these sensitivity bounds suggest that we can screen out those insensitive combinations and apply direct methods to estimate the remaining sensitivity such as the number of protein with respect to , and . Note that without this bounds, we have to estimate sensitivities even when we do not take other observables into consideration. However, with the sensitivity bounds for screening, we only need to estimate much fewer sensitivities depending on the controlled confidence level we use. Therefore, this two-step strategy significantly reduces the computational cost especially when applied to large scale networks.
Appendix A Basics for information theory
We review some basic information theory concepts for the completeness of the paper. In particular, we briefly reproduce the formula for the relative entropy and the path space FIM in the context of stochastic reaction networks (i.e., continuous time jump Markov processes) pantazis2013relative .
Given the path space probability distribution and its perturbation on the path space , their pseudo-distance can be measured by the relative entropy
[TABLE]
In particular, the Radon-Nikodym derivative follows from the following Girsanov formula bremaud1981point
[TABLE]
where is the count of the -th reaction up to time . Assuming the dynamics starts from the stationary regime (i.e., ) and using the facts that is a martingale under , the relative entropy can be simplify as
[TABLE]
where
[TABLE]
is the path space relative entropy rate (RER). Note that when , the Taylor expansion of gives
[TABLE]
where
[TABLE]
is the path space Fisher information matrix (FIM) of RER .
Appendix B Path space information bounds: a formal derivation
For completeness of the paper, we give a formal derivation of the path space information bounds on both the transient regime and the stationary regime, see the reference dupuis2016path for rigorous derivation. We consider a continuous time Markov process with stationary distribution . For the path space measure , we assume that it is absolutely continuous with respect to a reference measure such that for any . Then by the definition of sensitivity indices and the Cauchy-Schwarz inequality,
[TABLE]
where is the FIM of the relative entropy . This gives the path space information bounds on the transient regime.
In the stationary regime, we focus on the ergodic average type observable . Since the stationary distribution is also the initial distribution of the stochastic process , it holds that . Hence by the path space information bounds for ,
[TABLE]
Taking ,
[TABLE]
where we assumed that converges to the path space FIM . Note that
[TABLE]
gives the integrated auto-correlation function (IAF) for observable , where
[TABLE]
is the covariance between and . We denote that IAF by
[TABLE]
We remark that the IAF only differs with the integrated auto-correlation time (IAT) by a multiplying factor , i.e., . Finally, we have the stationary path space information bounds
[TABLE]
Acknowledgements.
The work of P.P. has been partially supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Applied Mathematics program under the contract number DE-SC0010549. The work of T.W. was partially supported by the DARPA project W911NF-15-2-0122. We thank Professor Tiejun Li for discussions about the genetic switches example.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] D. F. Anderson. An efficient finite difference method for parameter sensitivities of continuous time markov chains. SIAM Journal on Numerical Analysis , 50(5):2237–2258, 2012.
- 2[2] D. Angeli, J. E. Ferrell, and E. D. Sontag. Detection of multistability, bifurcations, and hysteresis in a large class of biological positive-feedback systems. Proceedings of the National Academy of Sciences , 101(7):1822–1827, 2004.
- 3[3] G. Arampatzis, M. A. Katsoulakis, and Y. Pantazis. Accelerated sensitivity analysis in high-dimensional stochastic reaction networks. Plo S one , 10(7):e 0130825, 2015.
- 4[4] D. Aristoff. The parallel replica method for computing equilibrium averages of markov chains. Monte Carlo Methods and Applications , 21(4):255–273, 2015.
- 5[5] D. Aristoff, T. Lelièvre, and G. Simpson. The parallel replica method for simulating long trajectories of markov chains. Appl. Math. Res. Express. , 2014(2):332–352, 2014.
- 6[6] M. Assaf, E. Roberts, and Z. Luthey-Schulten. Determining the stability of genetic switches: explicitly accounting for mrna noise. Phys. Rev. Lett. , 106(24):248102, 2011.
- 7[7] A. Binder, T. Lelièvre, and G. Simpson. A generalized parallel replica dynamics. J. Comput. Phys. , 284:595–616, 2015.
- 8[8] P. Brémaud. Point processes and queues: martingale dynamics. 1981.
