Reduction for stochastic biochemical reaction networks with multiscale conservations
Jae Kyoung Kim, Grzegorz A. Rempala, Hye-Won Kang

TL;DR
This paper addresses the challenge of accurately approximating stochastic biochemical reaction networks with multiple timescales and conservation laws by proposing a modified multiscale approximation method.
Contribution
It introduces a novel modification to the existing multiscale approximation approach to improve accuracy in networks with conservation laws and virtual slow species.
Findings
Improved approximation accuracy in complex biochemical networks
Effective handling of conservation laws in stochastic simulations
Enhanced applicability of multiscale methods
Abstract
Biochemical reaction networks frequently consist of species evolving on multiple timescales. Stochastic simulations of such networks are often computationally challenging and therefore various methods have been developed to obtain sensible stochastic approximations on the timescale of interest. One of the rigorous and popular approaches is the multiscale approximation method for continuous time Markov processes. In this approach, by scaling species abundances and reaction rates, a family of processes parameterized by a scaling parameter is defined. The limiting process of this family is then used to approximate the original process. However, we find that such approximations become inaccurate when combinations of species with disparate abundances either constitute conservation laws or form virtual slow auxiliary species. To obtain more accurate approximation in such cases, we propose…
| Reactions | Propensity functions |
|---|---|
| Name | Description | Values & Normalized rates () |
|---|---|---|
| Binding rate constant for to | ||
| Unbinding rate constant for | ||
| Production rate constant for | ||
| Conversion rate constant for to |
| Reaction | Counting processes |
|---|---|
| Reactions | Original & normalized propensity functions |
|---|---|
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsGene Regulatory Network Analysis · Bioinformatics and Genomic Networks · Microbial Metabolic Engineering and Bioproduction
Reduction for stochastic biochemical reaction networks with multiscale conservations
This pre-print has been accepted for publication in SIAM Multiscale Modeling & Simulation. The final copyedited version of this paper will be available at https://www.siam.org/journals/mms.php.
Jae Kyoung Kim Department of Mathematical Sciences, Korea Advanced Institute of Science and Technology ([email protected])
Grzegorz A. Rempala Division of Biostatistics and Mathematical Biosciences Institute, The Ohio State University ([email protected])
Hye-Won Kang Department of Mathematics and Statistics, University of Maryland, Baltimore County ([email protected])
Abstract
Biochemical reaction networks frequently consist of species evolving on multiple timescales. Stochastic simulations of such networks are often computationally challenging and therefore various methods have been developed to obtain sensible stochastic approximations on the timescale of interest. One of the rigorous and popular approaches is the multiscale approximation method for continuous time Markov processes. In this approach, by scaling species abundances and reaction rates, a family of processes parameterized by a scaling parameter is defined. The limiting process of this family is then used to approximate the original process. However, we find that such approximations become inaccurate when combinations of species with disparate abundances either constitute conservation laws or form virtual slow auxiliary species. To obtain more accurate approximation in such cases, we propose here an appropriate modification of the original method.
1 Introduction
Biochemical reaction networks frequently evolve with disparate timescales. The simulations of the stochastic system describing such multi-scale biochemical reaction networks are extremely slow because the computation is predominantly spent on simulating fast reactions [10, 21, 50, 12]. One approach to resolve this problem is using disparate timescales among species [58, 51, 14]. Fast species regulated by fast reactions will quickly equilibrate to a quasi-steady-state (QSS) while other species (slow species) will continue to evolve slowly on a different timescale (slow timescale). Thus, on the slow timescale, the fast species are assumed in QSS, which is determined by the evolution of slow species. By replacing the fast species with their QSS, we can derive the reduced stochastic system depending solely on the slow species. Such reduced system accurately approximates the slow timescale dynamics of the original full stochastic system with a much lower computational cost.
However, in most systems with nonlinear reactions, deriving the exact QSS is difficult, and thus various approximations for QSS have been proposed [8, 53, 59, 25, 11, 28, 54, 48, 6, 49, 50, 13]. Since typically the accuracy of such approximations has been investigated numerically due to the lack of analytical tools, their validity is difficult to fully establish. Indeed, recent studies have shown the potential inaccuracy of a popular approach based on a deterministically derived QSS (e.g. Michaelis-Menten function) [9, 55, 56, 1, 40, 41]. These results indicate the need for justification of the QSS approximation using theoretical analysis [47, 23, 32].
One method allowing for a rigorous analysis is the multiscale approximation method, which was first introduced in [5] and further developed and systemized in [34]. The method is based on the idea of scaling species abundances, reaction rate constants, and time with a common scaling parameter to define a family of processes indexed by the scaling parameter. The limit of the family is then used to approximate the original process on the timescale of interest. This multiscale approximation method has provided accurate approximate reduced models for various multiscale stochastic biochemical reaction networks, including the complex model of the heat shock response in E. coli [33, 34, 35]. The multiscale approximation method allows for a rigorous analysis of the accuracy of the reduced model using theorems in stochastic analysis such as the law of large numbers and the martingale central limiting theorem [35]. Recently, this method was extended to study the chemical reaction-diffusion networks [52]. The scaling method developed for the multiscale approximation has also been used to derive various tools to study chemical reaction networks having multiscale nature, such as hybrid approximation and its simulation algorithms [19, 20, 29], parameter sensitivity analysis [26, 27], and the error analysis for stochastic numerical schemes [4, 3].
The current paper proposes the modified multiscale approximation method, which leads to accurate approximations for a broader class of multiscale stochastic biochemical reaction networks than the original method. Even though we concentrate, for the sake of simplicity, on two specific examples of networks, our proposed approach is seen to apply more broadly. The paper is organized as follows. In Section 2, we briefly review the procedure of the original stochastic multiscale approximation using an example of the Michales-Menten enzyme kinetics. We also point out that the resulting reduced model does not accurately approximate the original model if the system has conservation laws involving species whose abundances are on disparate scales. To improve the accuracy, we propose a modification for the multiscale approximation method in Section 3. In Section 4, using an example of the genetic oscillatory system, we show that the stochastic multiscale approximation leads to an inaccurate approximation if the approximation uses a slow auxiliary variable, the combination of fast species whose abundances are on disparate scales. On the other hand, for such system, our modified multiscale approximation method leads to an accurate approximation. In Section 5, we summarize our results and discuss future work. The details of our analysis described in the main text are provided in the appendix.
2 Stochastic multiscale approximation method
In this section, we review the multiscale approximation method [5, 33, 34] and describe its limitations under conservation laws involving species with disparate molecular abundances. Consider a Michaelis-Menten enzyme kinetics with a product converting back to substrate [1, 40]. This system consists of four reactions as described in 1(a) and Table 1: a free enzyme () reversibly binds with a substrate () to form a complex () and then the complex irreversibly dissociates into a product () and a free enzyme. The product is assumed to be converted back to the substrate so that the substrate concentration is non-zero at the steady state. Propensity functions corresponding to these four reactions are derived based on the mass action kinetics by defining be the abundance of the species at time (Table 1).
Let be a counting process for the number of occurrences of the reaction up to time defined as
[TABLE]
where are independent unit Poisson processes, and are the propensity functions of the reaction given in Table 1. With these counting processes, we can derive the system of stochastic equations describing the state of :
[TABLE]
In this system, the total numbers of molecules of the substrate () and the enzyme () are conserved over time:
[TABLE]
In the following subsections, we briefly describe how to derive the reduced system approximating the slow-scale dynamics of (2) with the multiscale approximation method [5, 33, 34].
2.1 Deriving the normalized system
The first step of the multiscale approximation method is scaling reaction rate constants, species abundances, and time via a common scaling parameter () to identify the timescale of each species. Here, we choose the value of the scaling parameter as to transform the original reaction rate constants () to the normalized constants () with . The scaling exponents () are chosen so that the normalized reaction rate constants () are of order 1 as presented in Table 2.
Similarly, the scaling exponents () are chosen so that becomes of order 1. Since we are interested in the slow-scale dynamics of the system, we determine based on the steady state values of the ordinary differential equations, which are the large volume limit (i.e. thermodynamic limit) of the stochastic system [43, 22] (1(b)):
[TABLE]
Using these scaling exponents, we define the normalized species abundance on the times of order as
[TABLE]
since we are interested in the dynamics at the timescale of order (1(b)). Then, we derive the counting processes in terms of the normalized rate constants () and the normalized variables () on the timescale of order . For instance, the counting process for the first reaction becomes
[TABLE]
where is the vector whose component is . Here in the second equality, we apply the change of variable , and in the third equality, we define a normalized propensity function as . In a similar way, we derive the counting processes for other reactions in terms of normalized propensity functions (see Table 3). Since is of order 1, we can easily recognize the order of the counting processes in Table 3. The higher order indicates the faster counting process.
By substituting the counting processes in Table 3 into the original stochastic system (2), we obtain the normalized stochastic system for . In this normalized system, we replace now the fixed scaling parameter value with a varying parameter to derive a family of vector-valued processes depending on the parameter :
[TABLE]
The initial conditions for the family of precesses are defined so that as :
[TABLE]
The floor function () is used so that the initial conditions of unnormalized species have integer values (see [34] for details). In the following, we will find the limit of this family of processes as and use it to approximate the slow-scale dynamics of the stochastic system given in (2). Note that this approach is analogous to a singular perturbation approach based on Tikhonov’s theorem [57, 37, 24], which reduces the multiscale deterministic systems by setting a small scaling parameter as [math] in the limit.
2.2 Balance equations
In the family of processes given in (7), the order of the maximum production rates for species is due to the term since is of order 1. The order of the maximum consumption rate is also due to . That is, both maximum production and consumption rates of species have the same scaling exponents as . If the maximum exponent of the production rates is larger than that of the consumption rates, the normalized abundance of the species asymptotically goes to infinity as . In the opposite case, it asymptotically goes to zero in the limit. Thus, when the maximum exponents of production and consumption rates are equal, which is known as the “balance equation”, the limit of normalized species can be nondegenerate [33]. In case when there is a subset of species which do not satisfy the balance equations, their limit will be nondengenerate only for a certain time period, which gives the restriction on the choice of the timescale (see [33, 34] for further details). In our example in (7), all species and their linear combinations satisfy the balance equations. We also show that a nondegenerate limit of exists (see Appendix 2 for details).
2.3 Deriving the average of fast variables and limiting model
For the species in (7), the maximum scaling exponent of the reaction rates and the scaling exponent of species abundance (i.e. ) are all . This indicates that the number of molecules of and its change by reactions are of the same order on the current timescale, and therefore the current slow timescale is a natural timescale for . In other words, is a slow-species in terms of the singular perturbation theory [37]. For other species, is less than the maximum scaling exponents of their reaction rates. Hence, the abundance of these species would fluctuate rapidly by reactions on the current slow timescale, indicating that they are fast species. Due to the rapid fluctuation, these fast species do not have a functional limit. Instead, they are averaged out in the limit as [46, 5, 34]. We now describe how to derive the average values of fast species in the limit.
Using two conservation constraints of the systems (7):
[TABLE]
we can simplify (7) as
[TABLE]
(11)-(12) are closed since and are determined by and from the conservations in (9-10) as follows:
[TABLE]
Because the maximum order of the reaction rate () in (11) is greater than , species is rapidly fluctuating and thus its behavior in (12-13) is averaged out as . To derive the averaged value, we use the law of large numbers for the Poisson process:
[TABLE]
where and is a unit Poisson process. From (15), it follows that
[TABLE]
has the same limit as the following integral:
[TABLE]
Applying this result after dividing (11) by , we get
[TABLE]
as since and go to zero. As in the limit, we get
[TABLE]
Setting the integrand of (16) to zero in the limit and defining , we can derive the averaged value of the fast species () in terms of the slow species () in the limit (see Appendix 1 for the detailed derivation):
[TABLE]
where
[TABLE]
Since as , the averaged value of another fast species () in the limit is also derived from (13) as
[TABLE]
Using this averaged value in the limit and the law of large numbers given in (15), we get the limiting equation of (12):
[TABLE]
Note that this reduced system solely depends on since is determined by from (20). Following the original multiscale approximation method [5, 34], we used of the limiting model to approximate after unnormalizing the species abundance and rescaling back the time as
[TABLE]
The advantage of this approximation is that its error can be estimated using the law of large numbers and the martingale central limiting theorem [44, 45, 18, 35]. In our case, we get since it has been known that [35]. Note that for some means that as where (stochastically bounded). Here, indicates convergence in distribution (i.e. weak convergence).
However, the approximation (22) obtained from the deterministic limiting model (21) cannot capture the fluctuation of . One natural way to resolve this issue is to replace the deterministic reaction terms in (21) by random jump processes with the corresponding propensity functions, which leads to the following stochastic process:
[TABLE]
where
[TABLE]
Note that this stochastic equation is the same as the original one for in (12) except for , which now solely depends on the slow variable as does in (20). Similarly to (22), we can use in (23) to approximate , as .
In Appendix 3, we show that
[TABLE]
where is a standard Brownian motion. Importantly, because , indicating that the new approximation with is more accurate than the deterministic limit in (22). However, the new approximation with still contains a considerable error as illustrated in Fig. 2(a). In consistent with our error analysis in (2.3), the numerically estimated errors also increase as becomes larger considering the fact that (Fig. 2(b) and (c)).
The dependence of errors on indicates that the error seen in Fig. 2 mainly stems from neglecting the species in the approximating process. Specifically, the initial condition of species , , is ignored in the limiting total conserved quantity () of (18) due to the fact that the scaling exponent of () is smaller than other scaling exponents in the conservation constraint (9). For the same reason, is also neglected in the limit of the conservation constraint (20). Since in (20) is used to derive (24), is also neglected in the reduced model (23-24). Therefore, as takes a larger portion of in (3), ignoring in deriving causes a larger error as seen in Fig. 2(b) and (c).
Note that we used one scaling exponent for species abundance of (i.e. ) for simplicity even when its order of magnitude of species abundance changes in time. In such case, is supposed to be adjusted throughout time as suggested in the original multiscale approximation method [33, 34]. Specifically, when as in the case of Fig. 2(c), it is suggested to use for the initial transient period and in the later time. However, with such multiple choices of in time, the approximation process becomes complex since different reduced models will be derived in time and combining their numerical simulations is difficult.
3 Modified multiscale stochastic approximation method
In order to correct the approximate errors seen in Fig. 2, we introduce a modified conservation law of the normalized variables:
[TABLE]
Note that in (9) is replaced by to prevent approximating as 0 in the conservation law when . The limit of the newly derived total conserved quantity among the normalized species is
[TABLE]
In contrast to in (18), does not depend on the fraction of in as the total amount of the substrate, , is fixed — is more natural conservation constant than . By substituting the new conservation constraint into (11-14), we define a new family of stochastic processes:
[TABLE]
Though this new family of processes is different from the one in (11-14), we will use the same notation () for simplicity. Since (28-31) is equivalent to the original normalized system in (7) when , the new family of processes includes the original system. Thus, the limiting model of (28-31) can be used to approximate the original system. To derive the limiting model, we divide (28) by and let to get in the same way as described in the previous section. As , we get . Substituting (30-31) in the equation, we get
[TABLE]
as . Setting the integrand to zero in the limit, we get the following approximation of the averaged value of fast species () with respect to the slow species :
[TABLE]
where (See Appendix 1 for detailed derivation). Using (3) and the law of large numbers in (15), and letting in (29), we get a limiting model for the slow species :
[TABLE]
We convert this deterministic limiting model to the stochastic process as in the previous section:
[TABLE]
where
[TABLE]
Note that in this new approximation, is determined by differently from the previous approximation in (23-24). We again use to approximate of the original model, which is accurate as seen in Fig. 3(a). Furthermore, the new approximation is accurate regardless of the initial condition of (Fig. 3(b) and (c)) in contrast to the previous approximation (Fig. 2).
To investigate the accuracy of the new approximation, we perform the error analysis and obtain the following:
[TABLE]
where is a standard Brownian motion (see Appendix 4 for detailed analysis). In particular, since and the diffusion and drift terms are proportional to , it follows that and thus , which shows the accuracy of the newly reduced model in (35-3). Note that for some means that as , where indicates convergence in distribution (i.e. weak convergence).
4 Multiscale approximation for a genetic oscillatory system
In the previous section, we propose a modified multiscale approximation method that leads to an accurate approximation for the stochastic system with a single steady state. In this section, we apply the same idea to the transcriptional negative feedback loop system, which generates oscillations (Fig. 4 (a)) [39, 40, 42, 38]. This system consists of 9 reactions as described in Table 4: the transcription of mRNA () occurs proportional to active DNA () and then is translated into protein (), which promotes the production of the repressor (). The repressor reversibly binds with to form repressed DNA complex (). Furthermore, , , and degrade. This model is described with the following set of stochastic equations:
[TABLE]
Note that the total number of DNA () is conserved
[TABLE]
To derive the normalized system of (39), we scaled reaction rate constants with : , for and , and for others as seen in Table 4. According to the simulations of the deterministic system, which is the large volume limit of (39), the scaling exponents of the molecular abundance () can be chosen as for and and for other species (Fig. 4 (b)). Using , we define the normalized species abundance at the times of order as .
Using the normalized species () and the normalized reaction rate constants (), we derive the normalized propensity functions (), which are of order 1 as described in Table 4. After replacing the original propensity functions in (39) by the normalized ones, we replace with and obtain a family of vector-valued processes satisfying
[TABLE]
Initial conditions () are defined as done in the previous section (8). For all species, the exponents of the maximum production and consumption rates are the same (i.e. balance equations are satisfied), justifying our choice of the timescale. Note that in the above system the normalized total DNA, , is conserved. In the limit of this conserved relation, will be neglected, and thus all DNA is under repressed status in the limit. Thus, the reduced model with the original multiscale approximation method reaches the steady state rather than oscillates. This example again indicates that the limiting model derived using the original method does not accurately approximate the full model when the system has a conservation among species with disparate scales of molecular abundances. Thus, the modified conservation constraint as described in Section 3 is used as
[TABLE]
and the limit of as is defined as
[TABLE]
Using this modified conservation constraint, we define a new family of stochastic processes, using the same notation () for simplicity:
[TABLE]
Because the maximum scaling exponents of the reaction rates of species and are greater than the scaling exponents of molecular abundance (), and fluctuate rapidly and are averaged out. To derive the average values of these fast variables, we divide (4) by and use the law of large numbers for Poisson process in (15) to get
[TABLE]
as . Note that (46) consists of only the fast variables and and thus, we cannot use (46) to derive the limiting average of the fast variables with respect to the slow variables. To circumvent this problem, we introduce the auxiliary species , as suggested by the original multiscale approximation method [33, 34]. Since the abundance of has the same order as , we get
[TABLE]
so that is of order 1. We now derive the equation for using (4)-(4):
[TABLE]
Note that is used to define , and thus two reaction terms can be combined using the superposition principle of Poisson processes [15]. The process for satisfies the balance equation, and is a slow variable because the maximum scaling exponent of the reaction rates and the scaling exponent for the species abundance are equal as . We substitute (47) into (46) and get
[TABLE]
as . Setting the integrand to zero in the limit, we derive the averaged value of the fast species () in terms of the slow species in the limit ():
[TABLE]
which is equivalent with the limit of (47). (50) with (45) yields the averaged value of the fast species ()
[TABLE]
Using and the law of large number for the Poisson process, we get the limiting model for the slow species. Because the limiting model is deterministic, we convert it to the stochastic system similarly as we did in the previous section:
[TABLE]
Note that is derived from (51). In Fig. 5, we used to approximate as , but as seen from the plots, this approximation is inaccurate. In particular, the reduced model does not generate oscillations with a specific frequency in contrast to the full model (Fig. 5(b))
We wondered whether the inaccuracy of the reduced model (52-55) stems frm the fact that we simply fixed scaling exponents () for and throughout the oscillation as they change between and (Fig. 4). That is, as of and change throughout the oscillation, it might not be appropriate to fix the order of as in Table 4, which is used to derive the equation for the average of fast species (49). However, we find that although the orders of and change, throughout the oscillation. Thus our choice of fixed scaling exponents () for and is not the reason for the inaccuracy of the average of fast species (50) and thus the reduced model seen in Fig. 5.
Instead, we find that the inaccurate approximation of the averaged value of the fast species in (55) is due to the fact that the slow auxiliary species () consists of fast species with disparate abundance scales and thus a fast species () with low scale of abundance is neglected in the limit. Specifically, in (50) is equivalent to approximating by [math] in as . Since is used to derive in (51) and hence in (55), is also neglected in the reduced system given in (52-55), which leads to apparent errors seen in (Fig. 5).
To resolve this problem, we adopt a similar idea to the one used in the previous section because a slow variable, , is considered as a constant on fast timescale and thus (47) can be considered as a conservation law on fast timescale. We re-define as
[TABLE]
which prevents the elimination of as . Though (56) is different from (47), we keep using the notation for simplicity. With this new definition, we get the modified relation of (49):
[TABLE]
as . Setting the integrand to zero in the limit, we get the approximation for the averaged limiting value of as
[TABLE]
where . Using (45), we get
[TABLE]
By using the approximate averaged value () and the law of large numbers, we obtain the modified liming model for the slow species. Since the limiting model is deterministic, as before, we convert it to the following stochastic system.
[TABLE]
Note that this newly derived reduced system is the same as the one in (52-55) except for (61). We used to approximate as . As seen from the simulation (Fig. 6), the reduced model accurately approximates the original full model.
We can often obtain slow auxiliary variables by combining fast variables because fast reactions could cancel each other as seen in (48). These newly derived slow variables play a critical role in deriving the reduced models in the multiscale stochastic approximation method [11, 16, 34]. If the slow normalized auxiliary species are derived as proposed in the original method (47), the constituent fast species of the auxiliary species are ignored in the limit if their scales of abundances () are smaller than those of other constituent fast species. This leads to considerable errors as seen in Fig. 5. On the other hand, our modification of the auxiliary variables given in (56) prevents the fast species with small abundance being neglected in the limit and leads to more accurate approximation as shown in Fig. 6.
5 Conclusion
Cells consist of diverse species whose abundances are on disparate scales. For instance, the concentrations of metabolites vary more than fold in E. coli: the concentration of glutamate and adenosine are about and , respectively [7]. Thus, biochemical reaction networks often have conservation laws involving species with disparate abundance scales. Furthermore, the combination of fast species with disparate abundance scales can also form virtual slow auxiliary species that evolve slowly due to the cancelation of the fast reactions. In such cases, with the original multiscale approximation method, the constitute species with the low abundance are ignored in the conservation constraint or in the auxiliary species of limiting models as shown in (18) or (50). Therefore, the original multiscale approximation method [5, 34] can lead to potential errors in the limiting models as seen in our examples (Fig. 2 and Fig. 5). To address this problem, we proposed here to replace the scaling parameter by the fixed value in the conservation constraints and auxiliary variables as we did in (27) and (56). Using these modified conservation constraints (or auxiliary variables), we redefined the family of the normalized stochastic processes in such a way that its limit provides accurate approximations for the full stochastic systems of the Michaelis-Menten kinetics (Fig. 3) and the genetic oscillator (Fig. 6). This indicates that our modified method is applicable for a broader class of multiscale stochastic biochemical reaction networks than the original method.
When the abundances of species evolve across multiple scales over time, the original mutiscale approximation method may require time-dependent scaling exponent and thus lead to different reduced models over time [33]. In this case, the approximation process becomes complex as it requires combining different reduced models over time. On the other hand, our modified multiscale approximation method using the fixed produces an accurate approximation in our example although some species abundances change over time (Fig. 3(c)). It would be interesting future work whether our modified method is applicable to general systems where the scales of species abundances change over time.
Interestingly, the reduced models obtained using our methods coincide with those derived with the stochastic total quasi-steady state approximation (total QSSA) approach [6, 49, 40, 41]. Therefore, the error analysis used in our work can be also applied to validate the accuracy of the stochastic total QSSA, which has been up until now investigated mostly numerically. Another interesting application of our work can be extension of our method to approximate stochastic reaction-diffusion systems [31, 17, 36, 30, 52].
Appendix 1. Derivation of the spatial averages of fast species in Section 2 and Section 3
From the original full model described in (11-12), we derive a scaled generator of as
[TABLE]
Define an occupational random measure of as
[TABLE]
in the space of measures on such that and is the set of natural number and zero. Denote the space of measures as .
Setting in , we define a martingale
[TABLE]
{ and are relatively compact in and , respectively, where is the space of cadlag functions with values and is the space of measures (see Appendix 2). Therefore, we can set be a limit point of in . Using Lemma 1.5 in [46],
[TABLE]
converges in distribution to
[TABLE]
After dividing by and and letting go to infinity, the above term (65) becomes zero for all . Using Lemma 1.4 in [46], there exists such that , and we get
[TABLE]
with probability one.
Then, the average of fast species () is expressed in terms of the slow species () as
[TABLE]
which is given in the main text (17). Note that is a local-averaging distribution and the Poisson distribution with mean because the limit of in (63) is the infinitesimal generator of the Poisson process. For more details of conditions for averaging, please see Section 5 in [34] and [5].
Next, to derive the approximate averaged value of the fast species (3) of Section 3, we substitute to and to in (Appendix 1. Derivation of the spatial averages of fast species in Section 2 and Section 3) and construct a new martingale corresponding to in (28)
[TABLE]
where is an occupation measure of . and are relatively compact, since and are bounded by and as seen in (27), respectively. Dividing by and taking a limit, we get
[TABLE]
as we derived (66). Differentiating with respect to and replacing the time variable by , the rewritten equation becomes
[TABLE]
where .
We derive an approximate averaged value for in the limit:
[TABLE]
by assuming in the limit. In the Appendix 4, we will show that this assumption does not cause any error up to the order of magnitude we are interest in.
Appendix 2. Relative compactness of { and
Here, we will show that { and in Appendix 1 are relatively compact in and , respectively, where is the space of cadlag functions with values and is the space of measures. Since and as , is bounded for all , and thus is relatively compact. We will show that for and for fixed , there exists such that
[TABLE]
Since , we will show that we can set small enough by choosing an appropriate value for . We have
[TABLE]
If and , we can set small enough and large enough so that both probabilities on the right-hand side become small. Then is stochastically bounded for , and by Lemma 1.1 in [46] is relatively compact. Now, we will show that . Taking the expectation on both sides of the equation for in (7) and rearranging terms, we have
[TABLE]
The right-hand side is bounded since for all , and this converges to as . Note that we showed relative compactness of when . If , we need additional assumption that is stochastically bounded for all .
Appendix 3. Error analysis for in Section 2
To analyze the error of the process of (23) in approximating of the full model in (12) with , we use the technique developed in [2]. To this end, we derive a family of process by replacing in (23) with the parameter as:
[TABLE]
where
[TABLE]
We define so that . In this way, (69-70) with become equivalent to the approximate model in (23-24). Furthermore, as so that in (69) and in (12) of the full model have the same limit in . Since , we define an error between and as
[TABLE]
to get the asymptotic behavior of the error between and of order . To find an approximate value of , we derive a limiting behavior of as . We rewrite the reaction terms for in as the following process, which has the same probability distribution with that in :
[TABLE]
where . Similarly, we rewrite the equation for in (69) as the following process:
[TABLE]
Subtracting (\ref{approx1_ZPN_2_app}) from (\ref{ZPN_2_app}),
[TABLE]
Taking the reaction terms in and subtracting their propensity functions, we define the following martingale
[TABLE]
where . A quadratic variation of the martingale is (cf. [35])
[TABLE]
Define a function for in (13) and in (70) as
[TABLE]
so that and . As , is asymptotic to
[TABLE]
where we use the fact that . Then as , is asymptotic to
[TABLE]
Subtracting and adding the propensity functions and using the fact that , can be rewritten as
[TABLE]
Multiplying (\ref{difference_2_app}) by , we get
[TABLE]
Assuming that as , where implies convergence in distribution (or weak convergence), we get
[TABLE]
and
[TABLE]
Substituting and to and applying the martingale central limit theorem, as , where is a Gaussian process with its quadratic variation
[TABLE]
Therefore, as , converges in distribution to
[TABLE]
where is a standard Brownian motion and thus . Approximating as suggested in [35] and using (71), we obtain
[TABLE]
which indicates that .
Appendix 4. Error analysis for in Section 3
We again use the technique developed in [2] to derive the error between of the approximate model (35) and of the full model (12) with . To this end, we derive a family of the processes by replacing of in (35) by a parameter as:
[TABLE]
where
[TABLE]
Note that since . Then, of (80) when becomes equivalent to of (35). That is, the family of process () includes the approximate process of (35). Since in (20) as , and of the full model in (12) converge to the same limit in (21) as . Since as , we define an error as
[TABLE]
to get the asymptotic behavior of the error of order in .
To find an approximate of , we investigate an asymptotic behaviour of as . As we derived , we derive the following equation after replacing and by and in .
[TABLE]
Using reaction terms in and subtracting them by their propensity functions, define a martingale as
[TABLE]
where . Define
[TABLE]
so that . As we get , is asymptotic to
[TABLE]
Next, we show that
[TABLE]
as . Denoting
[TABLE]
we have
[TABLE]
The second term on the right is of order in . The integral of the first term in becomes
[TABLE]
and this converges to [math] as using and , which shows .
Using and ,
[TABLE]
as . Therefore, using the martingale central limit theorem, as , which is a Gaussian process with its quadratic variation
[TABLE]
where as . As we derive , we can derive an equation for by replacing , , , and with , , , and , respectively. Then, is asymptotically equal to
[TABLE]
Using and , converges in distribution to
[TABLE]
as where is a standard Brownian motion. Again, we approximate as suggested in [35] and thus we get
[TABLE]
Since and diffusion and drift terms are proportional to , , which indicates that .
Acknowledgment We are grateful to the MBI for supporting our attendance at the workshop in 2015, where collaboration for this work began. We also thank Wanmo Kang for valuable discussion. This work was supported by the National Research Foundation of Korea grant N01160447 (JKK), KAIST Research Allowance grant G04150020 (JKK), the TJ Park Science Fellowship of POSCO TJ Park Foundation (JKK), National Science Foundation grant DMS-1318886 (GR), DMS-1620403 (HWK), UMBC KAN3STRT (HWK), and National Science Foundation grant DMS-0931642 to the Mathematical Biosciences Institute (JKK, GR, HWK).
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] A. Agarwal, R. Adams, G. C. Castellani, and H. Z. Shouval , On the precision of quasi steady state assumptions in stochastic dynamics , J. Chem. Phys., 137 (2012).
- 2[2] D. F. Anderson, A. Ganguly, and T. G. Kurtz , Error analysis of tau-leap simulation methods , Ann. Appl. Probab., 21 (2011), pp. 2226–2262.
- 3[3] D. F. Anderson and D. J. Higham , Multilevel monte carlo for continuous time markov chains, with applications in biochemical kinetics , SIAM Multiscale Model. Simul., 10 (2012), pp. 146–179.
- 4[4] D. F. Anderson and M. Koyama , Weak error analysis of numerical methods for stochastic models of population processes , SIAM Multiscale Model. Simul., 10 (2012), pp. 1493–1524.
- 5[5] K. Ball, T. G. Kurtz, L. Popovic, and G. Rempala , Asymptotic analysis of multiscale approximations to reaction networks , Ann. Appl. Probab., 16 (2006), pp. 1925–1961.
- 6[6] D. Barik, M. R. Paul, W. T. Baumann, Y. Cao, and J. J. Tyson , Stochastic simulation of enzyme-catalyzed reactions with disparate timescales , Biophys. J., 95 (2008), pp. 3563–3574.
- 7[7] B. D. Bennett, E. H. Kimball, M. Gao, R. Osterhout, S. J. Van Dien, and J. D. Rabinowitz , Absolute metabolite concentrations and implied enzyme active site occupancy in Escherichia coli , Nat. Chem. Biol., 5 (2009), pp. 593–599.
- 8[8] N. Berglund and B. Gentz , Geometric singular perturbation theory for stochastic differential equations , J. Differential Equations, 191 (2003), pp. 1–54.
