Estimation of parameter sensitivities for stochastic reaction networks using tau-leap simulations
Ankit Gupta, Muruhan Rathinam, Mustafa Khammash

TL;DR
This paper introduces a new, efficient method for estimating parameter sensitivities in stochastic reaction networks using tau-leap simulations, reducing computational cost while maintaining accuracy.
Contribution
The authors develop a novel integral representation for sensitivity estimation that can be approximated by any tau-leap method, improving efficiency over existing techniques.
Findings
The method is easy to implement and compatible with any tau-leap scheme.
It achieves similar accuracy to the underlying tau-leap method.
Numerical examples demonstrate significant efficiency gains.
Abstract
We consider the important problem of estimating parameter sensitivities for stochastic models of reaction networks that describe the dynamics as a continuous-time Markov process over a discrete lattice. These sensitivity values are useful for understanding network properties, validating their design and identifying the pivotal model parameters. Many methods for sensitivity estimation have been developed, but their computational feasibility suffers from the critical bottleneck of requiring time-consuming Monte Carlo simulations of the exact reaction dynamics. To circumvent this problem one needs to devise methods that speed up the computations while suffering acceptable and quantifiable loss of accuracy. We develop such a method by first deriving a novel integral representation of parameter sensitivity and then demonstrating that this integral may be approximated by any convergent…
| Type | Method | Trade-off | Trade-off | Preserved |
| quantities | parameter | quantity | ||
| Biased | CRP | |||
| CFD | ||||
| Unbiased | APA | |||
| PPA |
| No. | Reaction | Propensity |
|---|---|---|
| 1 | ||
| 2 | ||
| 3 | ||
| 4 | ||
| 5 | ||
| 6 | ||
| 7 | ||
| 8 | ||
| 9 | ||
| 10 | ||
| 11 | ||
| 12 |
| eIPA | IPA | |||||||
|---|---|---|---|---|---|---|---|---|
| Mean | Std Dev | RE | RSDCC | Mean | Std Dev | RE | RSDCC | |
| 5 | -90.079 | 0.093 | 0.139 | 0.379E-5 | -90.938 | 0.078 | 0.813 | 0.121E-5 |
| 10 | -264.5 | 0.309 | 0.099 | 0.97E-5 | -266.34 | 0.243 | 0.793 | 0.247E-5 |
| eCFD | CFD | |||||||
| Mean | Std Dev | RE | RSDCC | Mean | Std Dev | RE | RSDCC | |
| 5 | -90.632 | 0.088 | 0.4746 | 0.078E-5 | -86.456 | 0.089 | 4.155 | 0.033E-5 |
| 10 | -268.77 | 0.142 | 1.716 | 0.054E-5 | -268.214 | 0.146 | 1.503 | 0.021E-5 |
| eCRP | CRP | |||||||
| Mean | Std Dev | RE | RSDCC | Mean | Std Dev | RE | RSDCC | |
| 5 | -90.749 | 0.097 | 0.604 | 0.343E-5 | -86.481 | 0.098 | 4.128 | 0.343E-5 |
| 10 | -268.82 | 0.169 | 1.734 | 0.152E-5 | -267.92 | 0.173 | 1.393 | 0.131E-5 |
| eIPA | IPA | |||||||
|---|---|---|---|---|---|---|---|---|
| Mean | Std Dev | RE | RSDCC | Mean | Std Dev | RE | RSDCC | |
| 1.202 | 0.0107 | 0.625 | 0.0046 | 1.185 | 0.0131 | 0.822 | 0.0023 | |
| -2.133 | 0.0132 | 0.663 | 0.0021 | -2.3968 | 0.0148 | 13.087 | 0.0008 | |
| -5.924 | 0.0419 | 1.144 | 0.0020 | -8.5372 | 0.0562 | 42.456 | 0.0008 | |
| 54.372 | 0.1679 | 0.367 | 0.0009 | 60.156 | 0.191 | 10.232 | 0.0003 | |
| eCFD | CFD | |||||||
| Mean | Std Dev | RE | RSDCC | Mean | Std Dev | RE | RSDCC | |
| 1.053 | 0.11 | 11.883 | 0.1925 | 1.183 | 0.0491 | 1.021 | 0.0088 | |
| -2.007 | 0.267 | 5.305 | 0.3219 | -2.734 | 0.0991 | 29.011 | 0.0066 | |
| -5.865 | 0.4535 | 2.1339 | 0.1053 | -8.787 | 0.1813 | 46.617 | 0.0021 | |
| 54.67 | 1.1589 | 0.1794 | 0.0080 | 59.431 | 0.3907 | 8.9044 | 0.0002 | |
| eCRP | CRP | |||||||
| Mean | Std Dev | RE | RSDCC | Mean | Std Dev | RE | RSDCC | |
| 1.158 | 0.0793 | 3.13 | 0.0919 | 1.129 | 0.0781 | 5.4895 | 0.0562 | |
| -1.999 | 0.1306 | 5.701 | 0.0823 | -2.415 | 0.1109 | 13.9646 | 0.0254 | |
| -6.21 | 0.1777 | 3.625 | 0.0161 | -8.853 | 0.2198 | 47.7203 | 0.0074 | |
| 54.546 | 0.4756 | 0.0469 | 0.0015 | 59.807 | 0.4267 | 9.5925 | 0.0006 | |
| eIPA | IPA | |||||||
|---|---|---|---|---|---|---|---|---|
| Mean | Std Dev | RE | RSDCC | Mean | Std Dev | RE | RSDCC | |
| -67.73 | 1.17 | 1.31 | 1.6801 | -65.2 | 0.8 | 5 | 0.2886 | |
| -2982.2 | 10.6 | 0.078 | 0.0193 | -2821.8 | 7.66 | 5.3 | 0.0053 | |
| 145.36 | 1 | 0.22 | 0.2880 | 131.04 | 0.73 | 9.66 | 0.0623 | |
| 259.45 | 8.86 | 0.92 | 2.0139 | 250.4 | 8.2 | 2.6 | 0.7723 | |
| -119.38 | 1.01 | 0.13 | 0.4097 | -90.78 | 0.74 | 24.1 | 0.1251 | |
| -30.38 | 7.82 | 8.98 | 104.45 | -23.45 | 2.97 | 15.75 | 11.484 | |
| eCFD | CFD | |||||||
| Mean | Std Dev | RE | RSDCC | Mean | Std Dev | RE | RSDCC | |
| -633.79 | 6.21 | 823.5 | 0.0334 | -621.15 | 2.1 | 805.1 | 0.0017 | |
| -2987.1 | 10.01 | 0.24 | 0.0039 | -2891.5 | 7.16 | 2.97 | 0.0009 | |
| 356.95 | 22.3 | 146.1 | 1.3379 | 206.2 | 5.3 | 42.19 | 0.0972 | |
| 265.69 | 4.59 | 3.34 | 0.1019 | 250.5 | 1.43 | 2.5 | 0.0048 | |
| -51.61 | 10.7 | 56.8 | 14.764 | -22.8 | 1.16 | 80.9 | 0.3845 | |
| -31.74 | 4.98 | 13.85 | 8.407 | -24.61 | 1.42 | 11.72 | 0.4871 | |
| eCRP | CRP | |||||||
| Mean | Std Dev | RE | RSDCC | Mean | Std Dev | RE | RSDCC | |
| -648.1 | 2.38 | 844.3 | 0.0039 | -620.9 | 2.1 | 804.7 | 0.0028 | |
| -3076.6 | 10.5 | 3.2 | 0.0033 | -2897.2 | 7.5 | 2.8 | 0.0016 | |
| 349.55 | 4.18 | 141 | 0.041 | 216.6 | 5.09 | 49.4 | 0.1315 | |
| 260.01 | 1.23 | 1.14 | 0.0064 | 251.7 | 1.41 | 2.1 | 0.0075 | |
| -41.29 | 0.6 | 65.5 | 0.0602 | -21.91 | 1.16 | 81.7 | 0.6639 | |
| -33.98 | 0.52 | 21.88 | 0.0666 | -23.78 | 0.91 | 14.7 | 0.3494 | |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
**Estimation of parameter sensitivities for stochastic reaction networks using tau-leap simulations **
Ankit Gupta [email protected] Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, 4058 Basel, Switzerland.
Muruhan Rathinam [email protected] Department of Mathematics and Statistics, University of Maryland Baltimore County, 1000 Hilltop Circle, Baltimore, MD 21250, U.S.A.
Mustafa Khammash [email protected] Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, 4058 Basel, Switzerland.
Abstract
We consider the important problem of estimating parameter sensitivities for stochastic models of reaction networks that describe the dynamics as a continuous-time Markov process over a discrete lattice. These sensitivity values are useful for understanding network properties, validating their design and identifying the pivotal model parameters. Many methods for sensitivity estimation have been developed, but their computational feasibility suffers from the critical bottleneck of requiring time-consuming Monte Carlo simulations of the exact reaction dynamics. To circumvent this problem one needs to devise methods that speed up the computations while suffering acceptable and quantifiable loss of accuracy. We develop such a method by first deriving a novel integral representation of parameter sensitivity and then demonstrating that this integral may be approximated by any convergent tau-leap method. Our method is easy to implement, works with any tau-leap simulation scheme and its accuracy is proved to be similar to that of the underlying tau-leap scheme. We demonstrate the efficiency of our methods through numerical examples. We also compare our method with the tau-leap versions of certain finite-difference schemes that are commonly used for sensitivity estimations.
Keywords: parameter sensitivity; reaction networks; Markov process; tau-leap simulations
Mathematical Subject Classification (2010): 60J22; 60J27; 60H35; 65C05.
1 Introduction
The study of chemical reaction networks is an essential component of the emerging fields of Systems and Synthetic Biology [1, 44, 17]. Traditionally chemical reaction networks were modeled in the deterministic setting, where the dynamics is represented by a set of ordinary differential equations (ODEs) or partial differential equations (PDEs). In the study of intracellular chemical reactions, some chemical species are present in low copy numbers. Since the behavior of individual molecules is best described by a stochastic process, in the low molecular copy number regime, the copy numbers of the molecular species itself is better modeled by a stochastic process than by ODEs [19]. Only in the limit of large molecular copy numbers, one expects the deterministic models to be accurate [3]. While our work in this paper is focused on biochemical reaction networks as primary examples, we emphasize that the mathematical framework of reaction networks can also be used to describe a wide range of other phenomena in fields such as Epidemiology [28] and Ecology [8].
Suppose is a parameter (like ambient temperature, cell-volume, ATP concentration etc.) that influences the rate of firing of reactions. Let be the -dependent Markov process representing the reaction dynamics, and suppose that for some real-valued function and observation time , our output of interest is . This output is a random variable and we are interested in determining the sensitivity of its expectation w.r.t. infinitesimal changes in the parameter . We define this sensitivity value, denoted by , as the partial derivative
[TABLE]
Determining these parametric-sensitivity values are useful in many applications, such as, understanding network design and its robustness properties [42], identifying critical reaction components, inferring model parameters [16] and fine-tuning a system’s behavior [15].
Generally the sensitivities of the form (1.1) cannot be directly evaluated, but instead, they need to be estimated with Monte Carlo simulations of the dynamics . Many methods have been developed for this task [23, 33, 40, 41, 2, 25, 26], but they all rely on exact simulations of that can be performed using schemes such as Gillespie’s stochastic simulation algorithm (SSA) [19]. This severely constrains the computational feasibility of these sensitivity estimation methods because these exact simulations become highly impractical if the rate of occurrence of reactions is high [21], which is typically the case. The main difficulty is that that exact simulation schemes keep track of each reaction event which is very time-consuming. To avoid this problem, tau-leaping methods have been developed that proceed by combining many reaction-firings over small time intervals [20]. Tau-leap methods have been shown to produce good approximations of the reaction dynamics, at a small fraction of the computational cost of exact simulations [20, 11, 36, 43, 5, 39, 47, 48, 29, 32]. Their accuracy and stability has also been investigated theoretically in many papers [38, 31, 6, 35, 37].
Our goal in this paper is to develop a method that takes advantage of the computational efficiency of tau-leap methods for the purpose of estimating sensitivity values of the form (1.1). Since tau-leap methods introduce a bias in the estimation, it is highly desirable to start with an unbiased method for computing sensitivities (instead of biased methods such as the Finite Difference (FD)) and then replace exact SSA simulations by a suitable tau-leap method. Having only one form of bias, modulated by the tau-leap step size, allows one to control the bias more effectively and also facilitates the design of multilevel strategies that eliminate or reduce the estimator bias and enhance its computational efficiency [7, 30, 32]. Among the existing methods in the literature, only the Girsanov Transformation (GT) method [22, 33], the Auxiliary Path Algorithm(APA)[25] and the Poisson Path Algorithm (PPA) [26] are unbiased. Since the GT method in general suffers from large variance [26, 2, 25, 40, 41, 45] and the APA/PPA methods are not directly amenable to tau-leap approximation, we develop a variant of the PPA method in which exact SSA simulations are replaced by tau-leap simulations. Our method, called Tau Integral Path Algorithm (IPA), works with any underlying tau-leap simulation scheme and it is based on a novel integral representation of parameter sensitivity that we derive in this paper. We provide computational examples that show that using IPA we can often trade-off a small amount of bias for large savings in the overall computational costs for sensitivity estimation. We prove that the bias incurred by IPA depends on the step-size in the same way as the bias of the tau-leap scheme chosen for simulations. Moreover if we substitute the tau-leap simulations in IPA with the exact SSA generated simulations, then we obtain a new unbiased method for sensitivity estimation which we call the ‘exact’ IPA or eIPA that is similar to the PPA method in [26]. Two main reasons for the high variance of the GT method that have been identified in the existing literature are: 1) low magnitude of the sensitivity parameter (see [26, 25]) and 2) large system-size or volume under the classical volume scaling of the reaction network [45]. The second issue is somewhat resolved by the centered Girsanov Transformation (CGT) method [45, 52] and our numerical results indicate that the volume scaling behavior of eIPA is similar to CGT (see Section 4.1). However eIPA does not suffer from high variance when the sensitivity parameter is small. In addition, when , GT or CGT methods are not even applicable while eIPA does not suffer from this restriction. These observations make eIPA more appealing than CGT for unbiased estimation of parameter sensitivity.
For the sake of comparison, we use tau-leap versions of certain commonly used finite-difference estimators (see [2, 40, 51]) that approximate the infinitesimal derivative in (1.1) by a finite-difference (see (2.11)). Such estimators are computationally faster than IPA (in simulation time per trajectory) but they suffer from two sources of bias (finite-differencing and tau-leap approximations) unlike IPA which only incurs bias from the latter source. We note that while in some examples the biases nearly cancel each other fortuitously, as a general principle one has no logical reason to expect such cancellation.
This paper is organized as follows. In Section 2 we describe the stochastic model for reaction dynamics and the sensitivity estimation problem. We also discuss the existing sensitivity estimation methods, the tau-leap simulation schemes and explain the rationale for using such simulations in sensitivity estimation. Section 3 contains the main results of this paper which include a novel integral representation of the exact sensitivity in Section 3.2, a result on error bounds for the sensitivity estimates of IPA in Section 3.3 and the novel tau-leap sensitivity estimation method IPA in Section 3.4. In Section 4 we provide computational examples to compare our method with other methods and finally in Section 5 we conclude and provide directions for future research.
2 Preliminaries
Consider a reaction network with species and reactions. We describe its kinetics by a continuous time Markov process whose state at any time is a vector in the non-negative integer orthant comprising of the molecular counts of all the species. The state evolves due to transitions caused by the firing of reactions. We suppose that when the state is , the rate of firing of the -th reaction is given by the propensity function and the corresponding state-displacement is denoted by the stoichiometric vector . There are several ways to represent the Markov process that describes the reaction kinetics under these assumptions. We can specify the generator (see Chapter 4 in [14]) of this process by the operator
[TABLE]
where is any bounded real-valued function on . Alternatively we can express the Markov process directly by its random time-change representation (see Chapter 7 in [14])
[TABLE]
where is a family of independent unit rate Poisson processes. Since the process is Markovian, it can be equivalently specified by writing the Kolmogorov forward equation for the evolution of its probability distribution at each state :
[TABLE]
This set of coupled ordinary differential equations (ODEs) is termed as the Chemical Master Equation (CME) in the biological literature [3]. As the number of ODEs in this set is typically infinite, the CME is nearly impossible to solve directly, except in very restrictive cases. A common strategy is to estimate its solution with pathwise simulations of the process using Monte Carlo schemes such as Gillespie’s SSA [19], the next reaction method [18], the modified next reaction method [4], and so on. While these schemes are easy to implement, they become computationally infeasible for even moderately large networks, because they account for each and every reaction event. To resolve this issue, tau-leaping methods have been developed which will be described in greater detail in Section 3.1.
We now assume that each propensity function depends on a real-valued system parameter . To emphasize this dependence we write the rate of firing of the -th reaction at state as instead of . Let be the Markov process representing the reaction dynamics with these parameter-dependent propensity functions. As stated in the introduction, for a function and an observation time , our goal is to determine the sensitivity value defined by (1.1). This value cannot be computed directly for most examples of interest and so we need to find ways of estimating it using simulations of the process . Such simulation-based sensitivity estimation methods work by specifying the construction of a random variable whose expected value is “close” to the true sensitivity value , i.e.
[TABLE]
Once such a construction is available, a large number (say ) of independent realizations of this random variable are obtained and the sensitivity is estimated by computing their empirical mean as
[TABLE]
This estimator is a random variable with mean and variance
[TABLE]
respectively, where . For a large sample size , the distribution of is approximately Gaussian with mean and variance , due to the Central Limit Theorem. The standard deviation measures the statistical spread of the estimator , that is inversely proportional to its statistical precision. The sample size must be large enough to ensure that is small relative to , i.e. for some small parameter , we should have
[TABLE]
where is the relative standard deviation of the random variable . If such a condition holds, then is a reliable estimator for the true sensitivity value because it is very likely to assume a value close to its mean which in turn is close to (see (2.5)). In practice both and are unknown, but we can estimate them as and where
[TABLE]
is the estimated standard deviation of the estimator.
The performance of any sensitivity estimation method (say ) depends on the following three key metrics that are based on the properties of random variable :
The bias , which is the error incurred by the approximation (2.5). 2. 2.
The variance of random variable . 3. 3.
The computational cost of generating one sample of .
The bias can be positive or negative, and its absolute value can be seen as the upper-bound on the statistical accuracy that can be achieved with method by increasing the sample size [9]. As mentioned before, the standard deviation measures the statistical precision of the method and its magnitude relative to the mean determines the number of samples that are needed to produce a reliable estimate. In particular, to satisfy condition (2.8) for the relative standard deviation , the number of samples needed would be around . Hence the total cost of the estimation procedure is
[TABLE]
where is the CPU time required for constructing one realization of . The goal of a good estimation method is to simultaneously minimize the three quantities , and . This creates various conflicts and trade-offs among the existing sensitivity estimation methods as we now discuss.
2.1 Biased methods
A sensitivity estimation method is called biased if . The most commonly used biased methods are the finite-difference schemes which approximate the infinitesimal derivative in the definition of parameter sensitivity (see (1.1)) by a finite-difference of the form
[TABLE]
for a small perturbation . The processes and represent the Markovian reaction dynamics with values of the sensitive parameter set to and respectively. These two processes can be simulated independently [23] but it is generally better to couple them in order to reduce the variance of the associated estimator. The two commonly used coupling strategies are called Common Reaction Paths (CRP) [40] and Coupled Finite Differences (CFD) [2] and they are based on the random time-change representation (2.3).
The finite-difference approximation (2.11) for the true sensitivity value can be expressed as the expectation of the following random variable
[TABLE]
The three metrics (bias, variance and computational cost) based on this random variable define the performance of CRP and CFD. Since both these methods estimate the same quantity , they have the same bias (i.e. ). However in many cases it is found that the CFD coupling is tighter than the CRP coupling, resulting in a lower variance of (i.e. ) (see [2]). For each realization of , both CRP and CFD require simulation of a coupled trajectory in the time interval . The computational costs of such a simulation is roughly , where is the cost of exactly simulating the process using Gillespie’s SSA [19] or a similar method.111In fact the cost of generating a realization of is usually smaller for CFD in comparison to CRP (i.e. ), because the CFD coupling is such that if for some , then this equality will hold for the remaining time-interval , allowing us to directly set without completing the simulation in the interval .
Finite-difference schemes introduce a bias in the estimate whose size is proportional to the perturbation value (i.e. ), but the constant of proportionality can be quite large in many cases, leading to significant errors even for small values of [26]. Unfortunately we cannot circumvent this problem by choosing a very small because the variance is proportional to (i.e. ). Therefore if a very small is selected, the variance will be enormous and the sample-size required to produce a statistically precise estimate will be very large, imposing a heavy computational burden on the estimation procedure [26]. This trade-off between bias and variance is the main drawback of finite-difference schemes and there does not exist a strategy for selecting that optimally balances these two quantities. Note that unlike bias and variance, the computational cost of generating a sample (i.e. or ) does not change significantly with , thereby ensuring that regardless of , the total computational burden varies linearly with the required number of samples . Apart from finite-difference schemes, there exists another biased method, called the regularized pathwise-derivative method [41] for estimating the sensitivity value (1.1), but we do not discuss this approach in this paper.
2.2 Unbiased methods
A sensitivity estimation method is called unbiased if . The main advantage of unbiased methods is that the estimation can in principle be made as accurate as possible by increasing the sample size . The first unbiased method for sensitivity estimation is called the Girsanov Transformation (GT) method [22, 33], which works by estimating the -derivative of the probability distribution of . The GT method is easy to implement and the computation cost of generating each sample is roughly – the cost of exact simulation of the process . The main issue with the GT method is that generally the variance of its associated random variable is very large and so the number of samples needed to obtain a statistically precise estimate is very high [2, 40]. So far two reasons have been identified for this behavior. Firstly, it has been shown that for mass-action models (see [3]) this variance can become unbounded when the magnitude of the sensitive reaction rate-constant approaches zero [26, 25]. This is a serious issue because biological networks often consist of slow reactions which are characterized by low values of the associated rate-constants. Furthermore the GT method does not allow one to estimate the sensitivity w.r.t. a rate-constant set to zero. Such sensitivity values are useful for understanding network design as it allows one to probe the effect of presence or absence of reactions. Another reason for the high variance of GT estimator was provided in [45] where it was theoretically established that this variance can grow boundlessly as the system expands in size, i.e. the system volume tends to infinity. This issue is somewhat ameliorated by the centered Girsanov Transformation (CGT) method [52] but the problem with small reaction rate-constants persists.
We now discuss a couple of unbiased methods that have been recently proposed. These methods are called the Auxiliary Path Algorithm(APA )[25] and the Poisson Path Algorithm (PPA) [26], and they are based on exact representations of the form (2.5) for the parameter sensitivity (1.1). For both the methods, sampling the random variable requires simulation of a fixed number of additional paths of the process . It was shown in [25] that in comparison to the GT method, the computational cost of generating each sample for APA is much higher (i.e. ) but this is often compensated by the fact that its variance is much lower (i.e. ), resulting in a smaller overall cost of estimation (2.10). The reason for the higher sampling cost for APA is that it needs estimates of certain unknown quantities at each jump-time of the process in the time interval , which can be very large in number even for small networks. In PPA, this problem is resolved by randomly selecting a small number of these unknown quantities for estimation in such a way that the estimator remains unbiased. Due to this extra randomness, the sample variance for PPA is generally greater than APA (i.e. ) but the computational cost for realizing each sample is much lower (i.e. ). Moreover in comparison to APA, PPA is far easier to implement and has lower memory requirements, making it an attractive unbiased method for sensitivity estimation. In [26] it is shown using many examples that for a given level of statistical accuracy, PPA can be more efficient than GT and also the finite-difference schemes CFD and CRP. The computational cost of generating each sample in PPA is roughly , where is a small number that upper-bounds the expected number of unknown quantities that will be estimated using additional paths. For both APA and PPA, the parameter serves as a trade-off factor between the computational cost and the variance - as increases, the cost also increases but the variance decreases. However both these methods remain unbiased for any choice of .
The foregoing trade-off relationships for the existing sensitivity estimation methods are summarized in Table 1.
2.3 Rationale for using tau-leap schemes for sensitivity estimation
All the existing sensitivity estimation methods suffer from a critical bottleneck – they are all based on exact simulations of the process . The computational cost of generating each trajectory of can be exorbitant even for moderately large networks when those networks have some molecular species in moderately large copy numbers and/or reactions firing at multiple timescales (stiff systems). One way to counter this problem is to develop methods that can accurately estimate parameter sensitivities with approximate computationally inexpensive simulations of the process obtained with tau-leap methods. The use of tau-leap simulations provides a natural way to trade-off a small amount of error with a potentially large reduction in the computational costs.
The explicit tau-leap method with Poisson random numbers proposed by Gillespie [20] generally works well in non-stiff situations and when molecular copy numbers are modestly large. The major drawback is that it becomes inefficient for stiff systems where vastly different time scales are present. The implicit tau-leap was proposed to remedy this weakness [36]. Many other tau-leap methods and step size selection strategies have been proposed to address stiffness and other issues [11, 43, 5, 39, 48, 47, 32].
In the context of stiff systems, tau-leap methods have not been as successful in maintaining accuracy while reducing computational cost in comparison with the success of stiff solvers for deterministic differential equations. This is because stiffness manifests in a more complex manner in stochastic systems where stability is not the only issue, but accurately capturing the asymptotic distribution of the fast variables is also important [36, 37, 49, 50]. We shall limit our attention to non-stiff or modestly stiff systems in this paper.
Our goal in this paper is to develop a method that can estimate parameter sensitivity of the form (1.1) using only tau-leap simulations of the process . This can be done by specifying a random variable which can be constructed with these tau-leap simulations and whose expected value is “close” to the true sensitivity value , i.e.
[TABLE]
We propose such a random variable in this paper and provide a simple algorithm for generating the realizations of . We theoretically show that under certain reasonable conditions, the associated estimator is tau-convergent, which means that the bias incurred due to the approximation in (2.12) converges to [math], as the maximum step-size or the coarseness of the time-discretization mesh goes to [math]. Hence by making this mesh finer and finer, we can make the estimator as accurate as we desire, provided that we are willing to bear the increasing computational costs. In the context of estimating expected values , the property of tau-convergence along with the rate of convergence, has already been established for many tau-leap schemes [38, 31, 6, 35]. We use these pre-existing results and obtain a similar tau-convergence result for our sensitivity estimation method. An important feature of our approach is that it is completely flexible, as far as the choice of the tau-leap simulation method is concerned. Furthermore the order of accuracy of our sensitivity estimation method is the same as the order of accuracy of the underlying tau-leap method.
We end this section with observing that incorporating tau-leap schemes in sensitivity estimation opens up a new dimension in attacking this challenging problem. In the trade-off relationships for existing sensitivity estimation methods (see Table 1) parameters like and only allow us to explore one trade-off curve between the variance and some other metric like the bias (for = CRP, CFD) or the computational cost (for = APA, PPA). The main advantage of employing tau-leap schemes is that they provide a mechanism for exploring another trade-off curve between the bias and the computational cost , for the purpose of optimizing the performance of a sensitivity estimation method. In Section 4, we provide numerical examples to show that with tau-leap simulations we can indeed trade-off a small amount of bias with large savings in the computational effort required for estimating parameter sensitivity. Moreover this trade-off relationship appears to be independent of existing trade-off relationships mentioned in Table 1 because replacing exact simulations in a sensitivity estimation method, with approximate tau-leap simulations, usually does not alter the variance significantly at least when the tau step size is sufficiently small (see Section 4). Of course, the computational advantage of tau-leap schemes can only be appropriated if we can incorporate them into existing sensitivity estimation methods. The main contribution of this paper is to develop a method, similar to PPA, that works well with tau-leap schemes (see Section 3). For the sake of comparison, we also provide tau-leap versions of the finite-difference schemes (CRP and CFD) in Section 4.
3 Sensitivity estimation with tau-leap simulations
In this section we present our approach for accurately estimating parameter sensitivities of the form (1.1) with only approximate tau-leap simulations of the dynamics. This approach is based on an exact integral representation for parameter sensitivity given in Section 3.2. With this representation at hand, we construct a tau-leap estimator for parameter sensitivity and examine its convergence properties as the time-discretization mesh gets finer and finer (see Sections 3.3 and 3.4). Thereafter in Section 3.5 we present an algorithm that computes the tau-leap estimator for sensitivity estimation. We start with the description of a generic tau-leap method that approximately simulates the stochastic reaction paths defined by the Markov process with generator (see (2.2)) and initial state .
3.1 A generic tau-leap method
For each reaction , let be the number of firings of reaction until time . Due to (2.3) we can express each as
[TABLE]
where is a family of independent unit rate Poisson processes. From now on we refer to as the reaction count vector. For any two time values (with ), the states at these times satisfy At any given time and the computed (approximate) state at time , a tau-leap method entails taking either a predetermined step of size or choosing step-size as a function of the current state and time, i.e. step-size selection is adapted to the information sigma-algebra generated by the tau-leap process. Next an approximating distribution for the state at time is generated. This distribution is generally found by approximating the difference in the reaction count vector by a random variable whose probability distribution is easy to sample from. The most straightforward choice is given by the simple (explicit) Euler method [20], which assumes that the propensities are approximately constant in the time interval and conditioned on the information at time , each is an independent Poisson random variable with rate . Other distributions for have also been used in the literature to obtain better approximations and particularly to prevent the state-components from becoming negative [43]. The selection method for step size also varies, with the simplest being steps based on a deterministic mesh over the observation time interval . To obtain better accuracy several strategies have been proposed that randomly select based on some criteria such as avoidance of negative state-components or constancy of conditional propensities [11, 5, 32].
To represent a generic tau-leap method we shall use a pair of abstract labels and , where denotes a method, i.e. a choice of distribution for , and denotes a step size selection strategy. We will use as a (deterministic) parameter which quantifies the coarseness of the time-discretization scheme . For instance may stand for the explicit Euler tau-leap method [20] and may stand for a deterministic mesh , and in this case the coarseness parameter is . Typically, tau-leap methods produce approximations of the underlying process at certain leap times that are separated by the step-size and one can interpolate these approximate state values at other time points. The most obvious interpolation is the “sample and hold” method, where the tau-leap process is held constant between the consecutive leap times. In circumstances, such as the explicit Euler tau-leap method with Poisson updates, it is more natural to use interpolation strategies based on the random time-change representation (2.3) – for example see the “Poisson bridge” approach in [29]. In the following discussion, we suppose that the interpolation strategy is also determined by the label . We shall use to denote the tau-leap process, that approximates the exact dynamics , and that results from the application of a tau-leap method with step size selection strategy . This process is defined by the prescription and
[TABLE]
where is the (possibly) random number of time points, are the (possibly) random leap times, and for and are random variables whose distribution when conditioned on is determined by the method and step size strategy .
Remark 3.1
Note that this generic tau-leap method reduces to Gillespie’s SSA [19], if at state , the next step size is an exponentially distributed random variable with rate and each is chosen as if and [math] otherwise, where is a discrete random variable which assumes the value with probability .
Later we shall establish tau-convergence of our sensitivity estimator by showing that for a fixed tau-leap method , the bias incurred by our estimator converges to [math] as the coarseness of the time-discretization scheme goes to [math]. For this we shall require (weak) convergence of all moments of the tau-leap process to those of the exact process. We now state this requirement more precisely and present a simple lemma that will be needed later. For , we say that a function is of class if there exists a positive constant such that
[TABLE]
We shall require that a tau-leap method satisfies an order convergent error bound. This is stated formally by Assumption 1 and it can be verified using the results in [35].
Assumption 1 Given a tau-leap method , there exist , and a mapping such that, for every and every final time , there exists a constant satisfying
[TABLE]
for any initial state provided that . Note that here the second inequality is our assumption while the first inequality always holds. In above, we have assumed that there is a common probability space carrying the exact process and the tau-leap process .
Remark 3.2
We observe that Assumption 1 essentially assumes order convergence in the so-called -th moment variation norm (see [35]) of the probability law of (on ) to the probability law of (on ) and it is not as restrictive as it might seem at first glance. The -th moment variation norm of a signed finite measure on which possesses a finite -th moment is defined by
[TABLE]
and the space defined by
[TABLE]
is isometrically isormorphic to , the space of absolutely summable sequences and moreover, is the dual space of (see [35]). We note that by the Schur property, weak convergence implies norm convergence in . In Assumption 1, if we merely assumed weak convergence of order in , due to the Schur property, we obtain convergence in -th moment variation norm of order for any . Moreover, we note that convergence of tau-leap methods in the moment variation norms have been derived in [35] and apply to a large class of situations including (but not limited to) systems that remain in a bounded subset of the integer state space. We also remark that to our best knowledge, all convergence results on tau-leaping have been limited to considering determinstic time steps. However, in the applied literature, adaptive time step selection methods have been explored numerically, and it is reasonable to expect convergence results to be established in the future for a reasonable class of adaptive step size selection schemes. In this paper, our numerical simulations are restricted to deterministic time steps.
Additionally we will require Assumptions 2 and 3 on moment growth bounds of the exact process as well as the tau-leap process. These assumptions can be verified using the results in [34, 24, 35].
Assumptions 2 and 3 Given a tau-leap method , there exists such that for each and there exist constants and satisfying
[TABLE]
for all , provided .
We emphasize that constants and in Assumptions 1 and 3, do not depend on the step-size selection strategy , and all the three constants in these assumptions may be assumed to be monotonic in without any loss of generality. The following lemma follows readily from the above assumptions.
Lemma 3.3
Consider a function and suppose that there exists a constant such that for all . Then under Assumptions 1,2 and 3, we have
[TABLE]
provided .
3.2 An integral formula for parameter sensitivity
Let be the Markov process representing reaction dynamics with initial state and let be defined by
[TABLE]
for any state and time . For any and any function , let denote the difference operator given by
[TABLE]
The following theorem expresses the sensitivity value as the expectation of a random variable which can be computed from the paths of the process in the time interval . The proof of this theorem is provided in the Appendix 5.1.
Theorem 3.4
Suppose is the Markov process with generator and initial state . Then the sensitivity value is given by
[TABLE]
Remark 3.5
This formula has the following simple interpretation. Due to an infinitesimal perturbation of parameter , the probability that the process has an “extra” jump at time in the direction is proportional to
[TABLE]
Moreover the change in the expectation of at time due to this “extra” jump at time is just
[TABLE]
The above result shows that the overall sensitivity of the expectation of is just the product of these two terms, integrated over the whole time interval .
The rest of this section is devoted to the development of a tau-leap estimator for parameter sensitivity using this formula. To simplify our notations, we suppress the dependence on parameter , and hence denote by , by , by , by and the process by . Due to Theorem 3.4 the sensitivity value can be expressed as
[TABLE]
3.3 Sensitivity approximation with tau-leap simulations
In order to construct a tau-leap estimator for parameter sensitivity using formula (3.18), we need to replace both and with approximations derived with tau-leap simulations. Recall from Section 3.1 that a generic tau-leap scheme can be described by a pair of abstract labels and , specifying the method and the step-size selection strategy respectively. Assuming such a tau-leap scheme is chosen, let the corresponding tau-leap process (see (3.13)) be an approximation for the exact dynamics starting at state .
Suppose that we use the tau-leap method with the step-size selection strategy to approximate and possibly a different tau-leap method with a time-dependent step-size selection strategy to compute an approximation of . This time-dependence in step-size selection is needed because the latter quantity requires simulation of auxiliary tau-leap paths in the interval which varies with . We discuss this in greater detail in the next section. In the following discussion, we will assume that both the tau-leap schemes and satisfy Assumptions 1,2 and 3, with common and with replaced by the supremum step-size
[TABLE]
which is less than . We define the tau-leap approximation of (see (3.17)) by
[TABLE]
and make the assumption that the step size selection strategy depends on in such a way that is a measurable function of . Motivated by formula (3.18), we shall approximate the true sensitivity value by
[TABLE]
where is the starting state of the process . The next theorem, proved in the Appendix 5.1, shows that the bias of this sensitivity approximation is similar to the bias of the underlying tau-leap scheme. In particular if the tau-leap method satisfies order convergent error bound, then the same is true for the error incurred by the sensitivity approximation. Before we state the theorem, recall that for any , a function is in class if it satisfies (3.14) for some constant .
Theorem 3.6
Let as well as for each be of class for some . Suppose that a tau-leap approximation of the exact sensitivity is computed by (3.21), where a tau-leap method with step size strategy is used to approximate the underlying process and possibly a different tau-leap method with time-dependent step size strategy is used to compute approximations of at each . If both the tau-leap methods satisfy Assumptions 1,2 and 3, with common and , then there exists a constant such that
[TABLE]
where is given by (3.19) and it is less than .
We remark that there are two forms of error analyses in the literature for tau-leap methods. The first type is more conventional where the analysis is carried out for a given system in an interval as . See [38, 31, 35]. An alternative analysis considers a family of systems parametrized by “system size” , where step size is chosen in relation to as (where ), and the limit considered as [6]. As pointed out in [35] both analyses are useful. The first type of analysis with fixed system size is important in that if convergence or more importantly zero-stability (see [35]) does not hold in this conventional sense, then the computed solution can be very erroneous not only when the step size is too large, but also when it is too small! On the other hand, the system size scaling analysis helps explains why tau-leap remains efficient while leaping over several reaction events. In the interest of space, we limit ourselves to the first type in this paper.
3.4 A tau-leap estimator for parameter sensitivity
We now come to the problem of estimating the sensitivity approximation using tau-leap simulations. Expression (3.21) shows that is the expectation of the random variable defined by
[TABLE]
If we can generate samples of this random variable, then the estimation of would be quite straightforward using (2.6). However this is not the case as the random variable is nearly impossible to generate. This is mainly because it requires computing quantities of the form
[TABLE]
at infinitely many time points . These quantities generally do not have an explicit formula and hence they need to be estimated via auxiliary Monte Carlo simulations, which severely restricts the number of such quantities that can be feasibly estimated. We tackle these problems by constructing another random variable whose expected value equals , and whose samples can be easily generated using a simple procedure called IPA (Tau Integral Path Algorithm) that is described in Section 3.5. This random variable is constructed by adding randomness to the random variable in such a way that only a small finite number of unknown quantities of the form (3.23) require estimation. We now present this construction.
Construction of the random variable : Recall from Section 3.1 the description of the tau-leap process which approximates the exact dyamics . Let be the (possibly random) mesh corresponding to step size selection strategy . We denote the -algebra generated by the process and the random mesh over the interval by . Let and let be the positive integer given by
[TABLE]
where is a positive constant and denotes the smallest integer greater than or equal to . The choice of and its role will be explained later in the section. Define for each , where each is an independent random variable with distribution . Thus given and , the distribution of each is . Moreover taking expectation over the distribution of -s we get
[TABLE]
In deriving the last equality we have used the substitution . This relation along with (3.21) yields
[TABLE]
using linearity of the expectation operator. To obtain the states for all the -s, we need to interpolate the tau-leap dynamics between the times and .
To proceed further we define a “conditional estimator” of the quantity (3.23) at by
[TABLE]
where , and and are instances of tau-leap approximations of the exact dynamics starting at initial states and respectively. Both these tau-leap processes use the same method and the same step-size selection strategy . Moreover conditioned on and , the processes and the step-size selection strategy are independent of the process and the step-size selection strategy . Therefore it is immediate that
[TABLE]
and hence from (3.25) we obtain the following representation for
[TABLE]
An estimator for based on this formula can require several computations of . Since each evaluation of is computationally expensive, we would like to control the total number of these evaluations by randomizing the decision of whether should be evaluated at time or not. Moreover this randomization must be performed without introducing a bias in the estimator. We now describe this process.
Define and by
[TABLE]
and let be an independent -valued random variable whose distribution is Bernoulli with parameter . Since we have that
[TABLE]
where we define to be [math] when . This formula suggests that can be estimated, without any bias, using realizations of the random variable
[TABLE]
In generating each realization of , the computation of is only needed if the Bernoulli random variable is . Therefore, if we can effectively control the number of such -s then we can efficiently generate realizations of . This can be achieved using the positive parameter (see (3.24) and (3.29)) as we soon explain. Based on the construction outlined above, we provide a method in Section 3.5 for obtaining realizations of the random variable . We call this method, the Tau Integral Path Algorithm (IPA), to emphasize the fact that is essentially an approximation of the integral (3.22). Using IPA we can efficiently generate realizations of and approximately estimate the parameter sensitivity with the estimator (2.6).
Minimizing the variance of : To improve the efficiency of IPA, we must minimize the additional variance due to the extra randomness that has been added to the random variable (3.22) to obtain . Since , this additional variance is equal to , and in order to reduce this quantity we focus on reducing the conditional variance . Recall that is given by (3.26) and for convenience we abbreviate by for . The reduction in this conditional variance can be accomplished by tightly coupling the pair of processes . For this purpose we use the split-coupling (see [2]) specified by
[TABLE]
where is an independent family of unit rate Poisson processes. Here for , and is the sequence of leap-times of the pair of processes jointly simulated with the tau-leap scheme . Note that process is adapted to the filtration generated by processes . Hence a solution to (3.32)-(3.33) can be found by explicit construction. The uniqueness of the solution , until the first time its norm exceeds some constant , is guaranteed by the local boundedness of the associated generator (see Theorem 4.1 in Chapter 4 of [14]). Using Assumption 3 one can show that as we have a.s. and from this, the uniqueness of the solution in the whole time-interval can be established. See Lemma A.1 in [27] for more details on this argument.
Controlling the number of nonzero -s: We now discuss how the positive parameter can be selected to control the total number of -s that assume the value in (3.31), which is . This is the number of -s that are required to obtain a realization of . It is immediate that given the sigma field , is a -valued random variable whose expectation is given by:
[TABLE]
Using and
[TABLE]
we obtain
[TABLE]
We choose a positive integer and set
[TABLE]
where the expectation can be approximately estimated using tau-leap simulations of the dynamics in the time interval . Such a choice ensures that is bounded above by on average. In most cases we can expect that to be close to and so the choice of automatically ensures that . Hence inequality (3.34) is almost exact and with chosen as (3.35) we have . Therefore can be interpreted as the expected number of coupled auxiliary paths (3.32)-(3.33) needed to obtain a realization of . This parameter is in the hands of the user and it plays the same role as in PPA (see Section 2.2), namely, it allows one to select the trade-off between the computational cost and the variance . A higher value of reduces the variance while simultaneously increasing the computational cost. Hence it is difficult to ascertain the effect of on the overall estimation cost which depends on the product (see (2.10)). Numerical examples suggest that for low values of , the overall estimation cost decreases gradually with increase in , but this trend reverses for higher values of (see Section 4). More work is needed to examine if this pattern persists more generally and how one can select the optimal value of . Note however that IPA will provide an unbiased estimator for (3.21) regardless of the choice of . Hence the accuracy of IPA does not vary much with , which is also seen in the numerical examples.
3.5 The Tau Integral Path Algorithm (IPA)
We now provide a detailed description of the method IPA which produces realizations of the random variable defined by (3.31). Computing the empirical mean (2.6) of these realizations estimates the approximate parameter sensitivity . Throughout this section we assume that the function returns independent samples from the distribution .
The method IPA can be adapted to work with any tau-leap scheme, but for concreteness, we assume that an explicit tau-leap scheme is used for all the simulations. This means that the current state and time , are sufficient to determine the distributions of the next time-step and the vector of reaction firings in the time interval . We suppose that a sample from these two distributions can be obtained using the methods 222We allow the step-size selection to depend on both the current time and the final time . This is especially important for simulating the auxiliary paths that are required to compute the -s in (3.31) (see Sections 3.3 and 3.4). and respectively. If we use the simplest tau-leap scheme given in [20], then reaction firings can be generated as
[TABLE]
for , where the function generates an independent Poisson random variable with mean . Once we have the reaction firings , the state at time is given by and for any intermediate time-point the state can be obtained using the “Poisson bridge” interpolation (see [29]). However this interpolation approach is equivalent to setting and , where and are reaction firing vectors generated according to (3.36) with replaced by and respectively. This idea can be easily generalized to obtain the interpolated states at intermediate times sorted in ascending order, i.e. .
Let denote the tau-leap process approximating the reaction dynamics with initial state . Our first task is to select the normalization parameter according to (3.35), by estimating the expectation in the formula using simulations of the process . This is done using the function
(see Algorithm 2 in Appendix 5.2) where is the expected number of auxiliary paths (3.32)-(3.33) that need to be simulated (see Section 3.4). Once is chosen, a single realization of can be computed using (Algorithm 1). This method simulates the tau-leap process and at each leap-time , the following happens:
The next leap size () is chosen and the positive integer () is computed. 2. 2.
The intermediate time-points -s are generated for and sorted in ascending order. 3. 3.
For each , the vector of reaction firings for the time-interval is computed and the interpolated state at time is evaluated. Then for each reaction the following happens:
- •
The variables (), () and () are generated. The function generates an independent Bernoulli random variable with expectation .
- •
If then (see (3.26)) is evaluated using
(see Algorithm 3 in
Appendix 5.2) and the sample value is updated according to (3.31). This method independently simulates the pair of processes specified by the split-coupling (3.32)-(3.33) in order to compute . For simplicity we assume that these simulations are carried out by the same tau-leap scheme which generates reaction firings according to (3.36). 4. 4.
Finally, time is updated to , reaction firings for the time-interval are computed and the state is updated accordingly.
Note that in the computation of reaction firings the propensities are evaluated at rather than any of the interpolated states .
4 Numerical Examples
In this section we computationally compare six sensitivity estimation methods on many examples. The methods we consider are the following:
Tau Integral Path Algorithm or IPA: This is the method described in Section 3.5. The tau-leap scheme we use is the simple Euler method [20] with Poisson reaction firings (3.36) and uniform step-size . To avoid the possibility of leaping-over the final time at which the sensitivity is to be estimated, we set
[TABLE]
The value of will depend on the example being considered and the default value of parameter is . 2. 2.
Exact Integral Path Algorithm or eIPA: This is the method we obtain by replacing the tau-leap simulations in IPA with the exact simulations performed with Gillespie’s SSA [19]. This replacement can be easily made by choosing the step-size and the reaction firings according to Remark 3.1. Moreover we need to change the method EvaluateCoupledDifference to the version given in [26]. Note that eIPA is a new unbiased method for estimating parameter sensitivity, like the methods in Section 2.2. This method is conceptually similar to PPA [26], but unlike PPA, the formula (3.18) underlying IPA does not involve summation over the jumps of the process, which makes it more amenable for incorporating tau-leap schemes. 3. 3.
Exact Coupled Finite Difference or eCFD: This is same as the CFD method in [2]. 4. 4.
Exact Common Reaction Paths or eCRP: This is same as the CRP method in [40]. 5. 5.
Tau Coupled Finite Difference or CFD: This method is the tau-leap version of CFD which has been proposed in [51]. Let be the pair of tau-leap processes that approximate the processes , and suppose that at leap time their state is . If the next step-size is , then for every reaction , we set the number of firings for this pair of processes as and , where . Such a selection of reaction firings emulates the CFD coupling. To facilitate comparison, we choose the tau-leap simulation method to be the same as for IPA. 6. 6.
Tau Common Reaction Paths or CRP: This method can be viewed as the tau-leap version of CRP where the CRP coupling is emulated by coupling the Poisson random variables that generate the reaction firings. Using the same notation as before, if and the next step-size is , then we set the number of firings as and for every reaction . Here we assume that there are parallel streams of independent random variables (see [40]), and the method uses the uniform random variable from the -th stream for generating the Poisson random variable with mean . As for CFD, the tau-leap simulation method is the same as for IPA.
In all the finite-difference schemes, we use perturbation-size and we center the parameter perturbations to obtain better accuracy. This centering can be easily achieved by substituting with and with in the expression (2.11) and also in the definition of the coupled processes. Since we use Poisson random variables to generate the reaction firings for tau-leap simulations, it is possible that some state-components become negative during the simulation run. In this paper we deal with this problem rather crudely by setting the negative state-components to [math]. We have checked that this does not cause a significant loss of accuracy because the state-components become negative very rarely.
Note that among the methods considered here, eIPA is the only unbiased sensitivity estimation method. All the other methods are biased either due to a finite-difference approximation of the derivative (eCFD and eCRP) or due to tau-leap approximation of the sample paths (IPA) or due to both these reasons (CFD and CRP). In the examples, we apply each sensitivity estimation method with a sample-size of , and compute the estimator mean (2.6), the standard deviation (2.9), the relative standard deviation and the computational cost per sample (see Section 2). Assume that the exact sensitivity value is which is known. We compare the different estimation methods using the following two quantities - the percentage relative error (RE) defined by
[TABLE]
and the RSD adjusted computational cost (RSDCC) defined by
[TABLE]
The first quantity RE measures the accuracy of a method, while the second quantity RSDCC determines the overall computational time that will be required by the method to yield an estimate with the desired statistical precision (see (2.10)).
Our numerical results will show that the exact schemes (eIPA, eCFD and eCRP) usually have a higher RSDCC than their tau-leap counterparts (IPA, CFD and CRP), but expectedly their RE is lower. Generally the RE for eIPA is smaller than both eCFD and eCRP because of its unbiasedness and this advantage in accuracy often persists when we compare IPA with CFD and CRP. It can be seen that in most of the cases, the sample variance or the estimator standard deviation (2.9), remain of similar magnitude, when we switch from an exact scheme to its tau-leap version (see Appendix 5.2). This supports our claim in Section 2.3, that substituting exact paths with tau-leap trajectories allows one to trade-off bias with computational costs, and this trade-off relationship is somewhat “orthogonal” to other trade-off relationships shown in Table 1.
In all the examples below, the propensity functions -s for all the reactions have the mass-action form [3] unless stated otherwise. Also always denotes the partial-derivative w.r.t. the designated sensitive parameter .
4.1 Single-species birth-death model
Our first example is a simple birth-death model in which a single species is created and destroyed according to the following two reactions:
[TABLE]
Let , and assume that the sensitive parameter is . Let be the Markov process representing the reaction dynamics. Assume that . For we wish to estimate
[TABLE]
for and . For this example, we set . For each we estimate the sensitivity using all the six methods and the results are displayed in Table 3 in Appendix 5.2. For this network we can compute the sensitivity exactly as the propensity functions are affine. These exact values are stated in the caption of Table 3, and they allow us to compute the RE of a method according to (4.37). We also compute the RSDCC333All the computations in this paper were performed using C++ programs on an Apple machine with the 2.9 GHz Intel Core i5 processor. for each method using (4.38), and we compare these RE and RSDCC values for all the methods in Figure 1A. From these comparisons we can make the following observations: 1) The exact methods are typically more accurate than the tau-leap methods but they are usually more computationally demanding. 2) For , eCFD/eCRP are far more accurate than CFD/CRP suggesting that the two sources of bias (finite-difference and tau-leap approximations) are additive in nature. However the same is not true for . 3) For both the cases and , IPA outperforms CFD/CRP in terms of accuracy even though it is slightly more computationally expensive. Same is true when we compare eIPA with eCFD/eCRP.
In Figure 1B we numerically analyze the performance of IPA w.r.t. its two key parameters - the expected number of auxiliary paths and the maximum tau-leap step-size . We see that RE is fairly insensitive to variations in while RSDCC first decreases with up to a certain point, and then it starts increasing with . As we are using a first-order explicit tau-leap scheme, it is unsurprising that RE increases almost linearly with . However, importantly, RSDCC decreases exponentially with , which makes it possible to use tau-leap simulations to trade-off a small amount of accuracy for a large gain in computational efficiency with IPA.
Observe that if we scale the production rate by the system-size or volume parameter , then the concentration process, derived by dividing the copy-number counts by , converges to a deterministic ODE limit as (see Chapter 11 in [14]). Often it is of interest to determine how the performance of various sensitivity estimation methods scales with the volume parameter . We investigate this issue for the exact schemes (eIPA, eCFD and eCRP) in Figure 2, by numerically examining the dependence of their RSD, RSDCC and RE on . Here we set the expected number of auxiliary paths for eIPA to be equal to . Note that RSD for finite-difference schemes (eCFD/eCRP) scales like as was proved in [45] and consequently their RSDCC is of order , because the computational time per sample, which is proportional to the number of reaction events per unit time-interval, is of order . Similar to these finite-difference schemes the RSD for eIPA also scales like , but its RSDCC is of order as its computational time per sample is of order because to generate each sample for eIPA, auxiliary paths need to be simulated in addition to the main sample path. This computational disadvantage of eIPA is compensated by the fact that accuracy of eIPA improves with volume (i.e. RE decreases with volume), while for the finite-difference schemes it is almost a constant. These numerical results suggest that the computational efficiency of eIPA scales with volume in the same way as it does for the CGT method (see Section 2.2) whose RSD has been shown to be of order w.r.t. volume (see [45]). Despite this similarity in volume scaling, eIPA is still a preferable unbiased method when compared to the CGT method, as its estimator variance does not become unbounded as the magnitude of the sensitive parameter approaches zero (see Section 2.2). The volume-scaling analysis presented here can also be performed for the tau-leap schemes by parameterizing the step-size by volume as discussed in Section 3.3. We expect the results to be qualitatively similar to the exact schemes, because, as mentioned previously, it is observed that the sample variance remains similar when we switch from an exact scheme to its tau-leap version (see Appendix 5.2). However this needs to be investigated in detail in a future work.
4.2 Repressilator Network
Our second example considers the Repressilator network given in [13], which consists of three mutually repressing gene-expression modules (say 1,2 and 3). Repression occurs at the level of transcription, i.e. production of the three mRNAs , and , and it is carried out by the corresponding protein molecules , and in a cyclic pattern. In other words, protein represses the transcription of mRNA , where we identify with . The repression mechanism is modeled with a nonlinear Hill function. The repressilator network consists of biomolecular species and reactions described in Table 2.
We set the Hill coefficient for the transcription of each mRNA to be (see reactions 1-3 in Table 2) and the degradation rate constant for each protein to be (see reactions 10-12 in Table 2). Let be the -valued Markov process representing the reaction dynamics, under the species ordering described in the caption of Table 2. We assume that and define by . At , our goal is to estimate
[TABLE]
for . These values measure the sensitivity of the mean of protein population at time with respect to the Hill coefficients -s and the protein degradation rates -s. For this example, we set .
For each we estimate the sensitivity using all the six methods and the results are displayed in Table 5 in Appendix 5.2. Unlike the previous example, we cannot compute the sensitivity values exactly because of nonlinearity of some of the propensity functions. So we obtain accurate approximations of these values using the unbiased estimator (eIPA) with a large sample size () and they are provided in the caption of Table 5. With these values we can compute the REs (4.37), which are then compared along with RSDCCs for all the methods in Figure 3. The results vary with the choice of the sensitive parameter , but one can clearly see that IPA can be several times more accurate than CFD /CRP even though its RSDCC is of a similar magnitude. This is especially observable for cases and . Most notably for the case , the RE for finite-difference schemes is around , while it is for eIPA and for IPA.
4.3 Genetic toggle switch
As our last example we look at a simple network with nonlinear propensity functions. Consider the network of a genetic toggle switch proposed by Gardner et. al. [17]. This network has two species and that interact through the following four reactions
[TABLE]
where the propensity functions -s are given by
[TABLE]
In the above expressions, and denote the number of molecules of and respectively. We set , , and . Let be the -valued Markov process representing the reaction dynamics with initial state . For and , our goal is to estimate
[TABLE]
for and . In other words, we would like to measure the sensitivity of the mean of the number of molecules at time , with respect to all the model parameters. For this example, we set . We estimate these sensitivities with all the six methods and the results are presented in Table 4 in Appendix 5.2, and in Figure 4A.
As in the previous example, we estimate the true sensitivity values using the unbiased estimator (eIPA) with a large sample size (). These approximate values are given in the caption of Table 4 and they were used in computing the relative errors (4.37) for Figure 4. Here we find that eIPA outperforms eCFD/eCRP both in terms of accuracy and computational efficiency for all the parameters. Similarly IPA is computationally more efficient than CFD/CRP for all the parameters, but except for the case , its accuracy is similar to CFD/CRP. In Figure 4B we numerically examine how the performance of IPA is affected by the parameter , for a couple of cases. As in Section 4.1, we find this effect to be quite small for RE but RSDCC first decreases with and then increases.
5 Conclusions and future work
Estimation of parameter sensitivities for stochastic reaction networks in an important and difficult problem. The main source of difficulty is that all the estimation methods rely on exact simulations of the reaction dynamics performed using Gillespie’s SSA [19] or its variants [18, 4]. It is well-known that these simulation algorithms are computationally very demanding as they track each and every reaction event which can be very cumbersome. This issue represents the main bottleneck in the use of sensitivity analysis for systems modeled as stochastic reaction networks. The aim of this paper is to develop a method, called Tau Integral Path Algorithm (IPA), that feasibly deals with this issue by requiring only approximate tau-leap simulations of the reaction dynamics, and still providing provably accurate estimates for the sensitivity values. This method is based on an explicit integral representation for parameter sensitivity that was derived from the formula given in [25]. Furthermore, by replacing the tau-leap simulation scheme in IPA with an exact simulation scheme like SSA, we obtain a new unbiased method (called eIPA) for sensitivity estimation, that can serve as the natural limit of IPA when the step-size gets smaller and smaller.
Using computational examples we compare IPA with tau-leap versions of the finite-difference schemes [2, 40, 51] that are commonly employed for sensitivity estimation. We find that in many cases, IPA outperforms these tau-leap finite-difference schemes in terms of both accuracy and computational efficiency. This makes IPA an appealing method for sensitivity analysis of stochastic reaction networks, where the exact dynamical simulations are computationally infeasible and tau-leap approximations become necessary.
As we argue in Section 2.3, tau-leap simulations provide a natural way to trade-off estimator bias with gains in computational speed. Therefore it would be of fundamental importance to extend the ideas in this paper and try to maximize the computational gains from tau-leap simulations while sacrificing the minimum amount of accuracy. In this context, we now mention two possible directions for future research. The method we proposed here, IPA, can work with any underlying tau-leap simulation scheme, but for simplicity we examined it with the most basic tau-leap scheme i.e. an explicit Euler method with a constant (deterministic) step-size and Poissonian reaction firings [20]. As this tau-leap scheme has several drawbacks (see [21]), it is very likely that IPA can yield much better results if a more sophisticated tau-leap scheme is employed, possibly with random step-sizes [11, 5, 32], or with Binomial leaps [43] or using implicit step-size selection [36]. We shall explore these issues in a future paper. Note that IPA essentially converts the problem of estimating parameter sensitivities to the problem of estimating a collection of expected values of the process with tau-leap simulations. The latter problem can be efficiently handled using multilevel strategies, where estimators are constructed for a range of -values, and are suitably coupled to simultaneously reduce the estimator’s bias and variance [7, 30, 32]. A promising approach would be to integrate these multilevel estimators with IPA to improve its accuracy and computational efficiency.
Appendix
5.1 Proofs of the main results
Proof.[Proof of Theorem 3.4] Let be the filtration generated by the process
and let be its -th jump time for . We define for convenience. Since the process is constant between consecutive jump times we can write
[TABLE]
where and the last equality holds due to linearity of the expectation operator and the fact that if . Given and , the distribution of the random variable has the cumulative density function given by
[TABLE]
This shows that for any continuous function we have
[TABLE]
where the last relation holds because by applying integration by parts we get
[TABLE]
Taking gives us and therefore
[TABLE]
Substituting this in (5.1) we obtain
[TABLE]
Theorem 2.3 in [25] shows that the sensitivity value can be expressed as
[TABLE]
where
[TABLE]
Using this fact along with (5.1) we obtain
[TABLE]
where
[TABLE]
However relation (5.41) with implies that given and , we have
[TABLE]
Substituting this in the last expression for and using the fact that for all we get
[TABLE]
This completes the proof of this result.
Proof.[Proof of Theorem 3.6] For each define by
and . Without loss of generality, we can assume that there exists a such that
[TABLE]
Then due to Lemma 3.3 we obtain
[TABLE]
where is a constant that depends only on as well as . Lemma 3.3 also shows that
[TABLE]
and , where is again a constant that depends only on and .
From (5.44) and Lemma 3.3 it follows that
[TABLE]
Moreover from (5.43), we get
[TABLE]
and hence using Assumption 2, we obtain
[TABLE]
Note that
[TABLE]
Using (5.45) and (5.46) we obtain the bound
[TABLE]
which proves the theorem.
5.2 Supplementary Tables and Algorithms
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] U. Alon , An introduction to systems biology : design principles of biological circuits , Chapman & Hall/CRC mathematical and computational biology series, Chapman & Hall/CRC, 2007.
- 2[2] D. Anderson , An efficient finite difference method for parameter sensitivities of continuous time markov chains , SIAM: Journal on Numerical Analysis, 50 (2012).
- 3[3] D. Anderson and T. Kurtz , Continuous time Markov chain models for chemical reaction networks , in Design and Analysis of Biomolecular Circuits, H. Koeppl, G. Setti, M. di Bernardo, and D. Densmore, eds., Springer-Verlag, 2011.
- 4[4] D. F. Anderson , A modified next reaction method for simulating chemical systems with time dependent propensities and delays , The Journal of Chemical Physics, 127 (2007), 214107.
- 5[5] D. F. Anderson , Incorporating postleap checks in tau-leaping , The Journal of Chemical Physics, 128 (2008), 054103.
- 6[6] D. F. Anderson, A. Ganguly, and T. G. Kurtz , Error analysis of tau-leap simulation methods , Ann. Appl. Probab., 21 (2011), pp. 2226–2262.
- 7[7] D. F. Anderson and D. J. Higham , Multi-level monte carlo for continuous time markov chains, with applications to biochemical kinetics , SIAM Multiscale Modeling and Simulation, 10 (2012), pp. 146–179.
- 8[8] J. Bascompte , Structure and dynamics of ecological networks , Science, 329 (2010), pp. 765–766.
