Reduction for stochastic biochemical reaction networks with multiscale conservations

Jae Kyoung Kim; Grzegorz A. Rempala; Hye-Won Kang

arXiv:1704.05628·q-bio.MN·April 20, 2017·ISCA

TL;DR

This paper addresses the challenge of accurately approximating stochastic biochemical reaction networks with multiple timescales and conservation laws by proposing a modified multiscale approximation method.

Contribution

It introduces a novel modification to the existing multiscale approximation approach to improve accuracy in networks with conservation laws and virtual slow species.

Findings

01. Improved approximation accuracy in complex biochemical networks

02. Effective handling of conservation laws in stochastic simulations

03. Enhanced applicability of multiscale methods

Tables

Table 1: Reactions and propensity functions of the Michaelis-Menten kinetics with a convertible product

Reactions	Propensity functions
$S + E \overset{κ_{1}^{'}}{\to} C$	$λ_{1}^{'} (X) := κ_{1}^{'} X_{S} X_{E}$
$C \overset{κ_{2}^{'}}{\to} S + E$	$λ_{2}^{'} (X) := κ_{2}^{'} X_{C}$
$C \overset{κ_{3}^{'}}{\to} P + E$	$λ_{3}^{'} (X) := κ_{3}^{'} X_{C}$
$P \overset{κ_{4}^{'}}{\to} S$	$λ_{4}^{'} (X) := κ_{4}^{'} X_{P}$

Table 2: Normalized reaction rate constants

Name	Description	Values & Normalized rates ( $κ_{i}$ )
$κ_{1}^{'}$	Binding rate constant for $E$ to $S$	$0.017 / s = 10^{- 2} \times 1.7 / s = : N_{0}^{- 2} κ_{1}$
$κ_{2}^{'}$	Unbinding rate constant for $C$	$0.03 / s = 10^{- 2} \times 3 / s = : N_{0}^{- 2} κ_{2}$
$κ_{3}^{'}$	Production rate constant for $P$	$0.0016 / s = 10^{- 3} \times 1.6 / s = : N_{0}^{- 3} κ_{3}$
$κ_{4}^{'}$	Conversion rate constant for $P$ to $S$	$0.0007 / s = 10^{- 3} \times 0.7 / s = : N_{0}^{- 3} κ_{4}$

Table 3: Counting processes for the normalized system

Reaction	Counting processes
$S + E \overset{N_{0}^{- 2} κ_{1}}{\to} C$	$R_{1}^{t} (N_{0}^{2} λ_{1} (Z^{N_{0}})) := Y_{1} (\int_{0}^{t} N_{0}^{2} κ_{1} Z_{S}^{N_{0}} (u) Z_{E}^{N_{0}} (u) d u)$
$C \overset{N_{0}^{- 2} κ_{2}}{\to} S + E$	$R_{2}^{t} (N_{0}^{2} λ_{2} (Z^{N_{0}})) := Y_{2} (\int_{0}^{t} N_{0}^{2} κ_{2} Z_{C}^{N_{0}} (u) d u)$
$C \overset{N_{0}^{- 3} κ_{3}}{\to} P + E$	$R_{3}^{t} (N_{0}^{1} λ_{3} (Z^{N_{0}})) := Y_{3} (\int_{0}^{t} N_{0}^{1} κ_{3} Z_{C}^{N_{0}} (u) d u)$
$P \overset{N_{0}^{- 3} κ_{4}}{\to} S$	$R_{4}^{t} (N_{0}^{1} λ_{4} (Z^{N_{0}})) := Y_{4} (\int_{0}^{t} N_{0}^{1} κ_{4} Z_{P}^{N_{0}} (u) d u)$

Table 4: Reactions and propensity functions

Reactions	Original & normalized propensity functions
$D_{A} \overset{κ_{1}^{'}}{\to} D_{A} + M$	$λ_{1}^{'} (X) := κ_{1}^{'} X_{D_{A}} = N_{0}^{2} κ_{1} Z_{D_{A}}^{N_{0}} = : N_{0}^{2} λ_{1} (Z^{N_{0}})$
$M \overset{κ_{2}^{'}}{\to} ϕ$	$λ_{2}^{'} (X) := κ_{2}^{'} X_{M} = N_{0}^{2} κ_{2} Z_{M}^{N_{0}} = : N_{0}^{2} λ_{2} (Z^{N_{0}})$
$M \overset{κ_{3}^{'}}{\to} M + P$	$λ_{3}^{'} (X) := κ_{3}^{'} X_{M} = N_{0}^{2} κ_{3} Z_{M}^{N_{0}} = : N_{0}^{2} λ_{3} (Z^{N_{0}})$
$P \overset{κ_{4}^{'}}{\to} ϕ$	$λ_{4}^{'} (X) := κ_{4}^{'} X_{P} = N_{0}^{2} κ_{4} Z_{P}^{N_{0}} = : N_{0}^{2} λ_{4} (Z^{N_{0}})$
$P \overset{κ_{5}^{'}}{\to} P + R$	$λ_{5}^{'} (X) := κ_{5}^{'} X_{P} = N_{0}^{2} κ_{5} Z_{P}^{N_{0}} = : N_{0}^{2} λ_{5} (Z^{N_{0}})$
$R \overset{κ_{6}^{'}}{\to} ϕ$	$λ_{6}^{'} (X) := κ_{6}^{'} X_{R} = N_{0}^{1} κ_{6} Z_{R}^{N_{0}} = : N_{0}^{1} λ_{6} (Z^{N_{0}})$
$D_{R} \overset{κ_{7}^{'}}{\to} D_{A}$	$λ_{7}^{'} (X) := κ_{7}^{'} X_{D_{R}} = N_{0}^{2} κ_{7} Z_{D_{R}}^{N_{0}} = : N_{0}^{2} λ_{7} (Z^{N_{0}})$
$D_{A} + R \overset{κ_{8}^{'}}{\to} D_{R}$	$λ_{8}^{'} (X) := κ_{8}^{'} X_{D_{A}} X_{R} = N_{0}^{4} κ_{8} Z_{D_{A}}^{N_{0}} Z_{R}^{N_{0}} = : N_{0}^{4} λ_{8} (Z^{N_{0}})$
$D_{R} \overset{κ_{9}^{'}}{\to} D_{A} + R$	$λ_{9}^{'} (X) := κ_{9}^{'} X_{D_{R}} = N_{0}^{4} κ_{9} Z_{D_{R}}^{N_{0}} = : N_{0}^{4} λ_{9} (Z^{N_{0}})$

Equations

$R^{t}_{k}(\lambda^{\prime}_{k}(X)) := Y_{k}\left(\int_{0}^{t}\lambda^{\prime}_{k}(X(s))\,ds\right),$

$\begin{aligned} X_{S}(t) &= X_{S}(0)+R^{t}_{2}(\lambda^{\prime}_{2}(X))+R^{t}_{4}(\lambda^{\prime}_{4}(X))-R^{t}_{1}(\lambda^{\prime}_{1}(X)),\\ X_{E}(t) &= X_{E}(0)+R^{t}_{2}(\lambda^{\prime}_{2}(X))+R^{t}_{3}(\lambda^{\prime}_{3}(X))-R^{t}_{1}(\lambda^{\prime}_{1}(X)),\\ X_{C}(t) &= X_{C}(0)+R^{t}_{1}(\lambda^{\prime}_{1}(X))-R^{t}_{2}(\lambda^{\prime}_{2}(X))-R^{t}_{3}(\lambda^{\prime}_{3}(X)),\\ X_{P}(t) &= X_{P}(0)+R^{t}_{3}(\lambda^{\prime}_{3}(X))-R^{t}_{4}(\lambda^{\prime}_{4}(X)). \end{aligned}$

$\alpha_{S}=0,\quad \alpha_{E}=1,\quad \alpha_{C}=1,\quad \alpha_{P}=1.$

$\begin{aligned} Y_{1}\left(\int_{0}^{N_{0}^{3}t}\lambda^{\prime}_{1}(X(s))\,ds\right) &= Y_{1}\left(\int_{0}^{N_{0}^{3}t}\kappa^{\prime}_{1}X_{S}(s)X_{E}(s)\,ds\right)\\ &= Y_{1}\left(\int_{0}^{t}(N_{0}^{-2}\kappa_{1})Z_{S}^{N_{0}}(u)\,(N_{0}Z_{E}^{N_{0}}(u))\,N_{0}^{3}\,du\right)\\ &=: Y_{1}\left(\int_{0}^{t}N_{0}^{2}\lambda_{1}(Z^{N_{0}}(u))\,du\right), \end{aligned}$

$\begin{aligned} Z_{S}^{N}(t) &= Z_{S}^{N}(0)+R^{t}_{2}(N^{2}\lambda_{2}(Z^{N}))+R^{t}_{4}(N\lambda_{4}(Z^{N}))-R^{t}_{1}(N^{2}\lambda_{1}(Z^{N})),\\ Z_{E}^{N}(t) &= Z_{E}^{N}(0)+N^{-1}\left(R^{t}_{2}(N^{2}\lambda_{2}(Z^{N}))+R^{t}_{3}(N\lambda_{3}(Z^{N}))-R^{t}_{1}(N^{2}\lambda_{1}(Z^{N}))\right),\\ Z_{C}^{N}(t) &= Z_{C}^{N}(0)+N^{-1}\left(R^{t}_{1}(N^{2}\lambda_{1}(Z^{N}))-R^{t}_{2}(N^{2}\lambda_{2}(Z^{N}))-R^{t}_{3}(N\lambda_{3}(Z^{N}))\right),\\ Z_{P}^{N}(t) &= Z_{P}^{N}(0)+N^{-1}\left(R^{t}_{3}(N\lambda_{3}(Z^{N}))-R^{t}_{4}(N\lambda_{4}(Z^{N}))\right). \end{aligned}$

$\begin{aligned} Z_{S}^{N}(0) &= Z_{S}^{N_{0}}(0)=X_{S}(0),\\ Z_{i}^{N}(0) &= \frac{1}{N}\left\lfloor NZ_{i}^{N_{0}}(0)\right\rfloor=\frac{1}{N}\left\lfloor\frac{N}{N_{0}}X_{i}(0)\right\rfloor,\quad i=E,C,P. \end{aligned}$

$\lim_{N\to\infty}\sup_{x\leq x_{0}}\left|\frac{Y(N^{\alpha}x)}{N^{\alpha}}-x\right|=0,$

$\frac{R^{t}_{1}\left(N^{2}\kappa_{1}Z_{S}^{N}Z_{E}^{N}\right)}{N^{2}}$

$\int_{0}^{t}\kappa_{1}Z_{S}^{N}(u)\left(Z_{E_{T}}^{N}-Z_{S_{T}}^{N}+\frac{1}{N}Z_{S}^{N}(u)+Z_{P}^{N}(u)\right)du.$

$\int_{0}^{t}\left(\kappa_{2}Z_{C}^{N}(u)-\kappa_{1}Z_{S}^{N}(u)\left(Z_{E_{T}}^{N}-Z_{S_{T}}^{N}+\frac{1}{N}Z_{S}^{N}(u)+Z_{P}^{N}(u)\right)\right)du\to 0$

$\begin{aligned} &\int_{0}^{t}\left(\kappa_{2}Z_{C}^{N}(u)-\kappa_{1}Z_{S}^{N}(u)\left(Z_{E_{T}}^{N}-Z_{S_{T}}^{N}+Z_{P}^{N}(u)\right)\right)du\\ &\quad=\int_{0}^{t}\left(\kappa_{2}\left(Z_{S_{T}}^{N}-Z_{P}^{N}(u)\right)-\kappa_{1}Z_{S}^{N}(u)\left(Z_{E_{T}}^{N}-Z_{S_{T}}^{N}+Z_{P}^{N}(u)\right)\right)du\to 0 \end{aligned}$

$\bar{Z}_{C}(s)=Z_{S_{T}}-Z_{P}(s).$

$Z_{P}(t)=Z_{P}(0)+\int_{0}^{t}\left(\kappa_{3}\bar{Z}_{C}(s)-\kappa_{4}Z_{P}(s)\right)ds.$

$X_{P}(t)\approx N_{0}Z_{P}(N_{0}^{-3}t).$

Taxonomy

Topics: Gene Regulatory Network Analysis · Bioinformatics and Genomic Networks · Microbial Metabolic Engineering and Bioproduction

Full text

This pre-print has been accepted for publication in SIAM Multiscale Modeling & Simulation. The final copyedited version of this paper will be available at https://www.siam.org/journals/mms.php.

Jae Kyoung Kim, Department of Mathematical Sciences, Korea Advanced Institute of Science and Technology

Grzegorz A. Rempala, Division of Biostatistics and Mathematical Biosciences Institute, The Ohio State University

Hye-Won Kang, Department of Mathematics and Statistics, University of Maryland, Baltimore County

Abstract

Biochemical reaction networks frequently consist of species evolving on multiple timescales. Stochastic simulations of such networks are often computationally challenging and therefore various methods have been developed to obtain sensible stochastic approximations on the timescale of interest. One of the rigorous and popular approaches is the multiscale approximation method for continuous time Markov processes. In this approach, by scaling species abundances and reaction rates, a family of processes parameterized by a scaling parameter is defined. The limiting process of this family is then used to approximate the original process. However, we find that such approximations become inaccurate when combinations of species with disparate abundances either constitute conservation laws or form virtual slow auxiliary species. To obtain more accurate approximation in such cases, we propose here an appropriate modification of the original method.

1 Introduction

Biochemical reaction networks frequently evolve on disparate timescales. Simulations of the stochastic systems describing such multiscale biochemical reaction networks are extremely slow because the computation is predominantly spent on simulating fast reactions [10, 21, 50, 12]. One approach to this problem exploits the disparate timescales among species [58, 51, 14]. Fast species regulated by fast reactions quickly equilibrate to a quasi-steady-state (QSS), while the other (slow) species continue to evolve on a slower timescale. Thus, on the slow timescale, the fast species are assumed to be in QSS, which is determined by the evolution of the slow species. By replacing the fast species with their QSS, we can derive a reduced stochastic system that depends solely on the slow species. Such a reduced system accurately approximates the slow-timescale dynamics of the original full stochastic system at a much lower computational cost.

However, in most systems with nonlinear reactions, deriving the exact QSS is difficult, and thus various approximations for QSS have been proposed [8, 53, 59, 25, 11, 28, 54, 48, 6, 49, 50, 13]. Since typically the accuracy of such approximations has been investigated numerically due to the lack of analytical tools, their validity is difficult to fully establish. Indeed, recent studies have shown the potential inaccuracy of a popular approach based on a deterministically derived QSS (e.g. Michaelis-Menten function) [9, 55, 56, 1, 40, 41]. These results indicate the need for justification of the QSS approximation using theoretical analysis [47, 23, 32].

One method allowing for a rigorous analysis is the multiscale approximation method, which was first introduced in [5] and further developed and systematized in [34]. The method is based on the idea of scaling species abundances, reaction rate constants, and time with a common scaling parameter to define a family of processes indexed by the scaling parameter. The limit of the family is then used to approximate the original process on the timescale of interest. This multiscale approximation method has provided accurate reduced models for various multiscale stochastic biochemical reaction networks, including the complex model of the heat shock response in E. coli [33, 34, 35]. The method allows for a rigorous analysis of the accuracy of the reduced model using theorems in stochastic analysis such as the law of large numbers and the martingale central limit theorem [35]. Recently, this method was extended to study chemical reaction-diffusion networks [52]. The scaling method developed for the multiscale approximation has also been used to derive various tools for studying chemical reaction networks with a multiscale nature, such as hybrid approximation and its simulation algorithms [19, 20, 29], parameter sensitivity analysis [26, 27], and the error analysis for stochastic numerical schemes [4, 3].

The current paper proposes a modified multiscale approximation method, which leads to accurate approximations for a broader class of multiscale stochastic biochemical reaction networks than the original method. Even though we concentrate, for the sake of simplicity, on two specific examples of networks, our proposed approach applies more broadly. The paper is organized as follows. In Section 2, we briefly review the procedure of the original stochastic multiscale approximation using an example of the Michaelis-Menten enzyme kinetics. We also point out that the resulting reduced model does not accurately approximate the original model if the system has conservation laws involving species whose abundances are on disparate scales. To improve the accuracy, we propose a modification of the multiscale approximation method in Section 3. In Section 4, using an example of a genetic oscillatory system, we show that the stochastic multiscale approximation leads to an inaccurate approximation if it uses a slow auxiliary variable, that is, a combination of fast species whose abundances are on disparate scales. For such a system, in contrast, our modified multiscale approximation method leads to an accurate approximation. In Section 5, we summarize our results and discuss future work. The details of the analysis described in the main text are provided in the appendix.

2 Stochastic multiscale approximation method

In this section, we review the multiscale approximation method [5, 33, 34] and describe its limitations under conservation laws involving species with disparate molecular abundances. Consider Michaelis-Menten enzyme kinetics with a product that converts back to substrate [1, 40]. This system consists of four reactions, as described in Fig. 1(a) and Table 1: a free enzyme ( $E$ ) reversibly binds with a substrate ( $S$ ) to form a complex ( $C$ ), and the complex then irreversibly dissociates into a product ( $P$ ) and a free enzyme. The product is assumed to be converted back to the substrate so that the substrate concentration is non-zero at the steady state. Propensity functions for these four reactions are derived from mass action kinetics, with $X_{i}(t)$ denoting the abundance of the $i$th species at time $t$ (Table 1).

Let $R^{t}_{k}(\cdot)$ be a counting process for the number of occurrences of the $k$th reaction up to time $t$, defined as

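$$R^{t}_{k}(\lambda^{\prime}_{k}(X)) := Y_{k}\left(\int_{0}^{t}\lambda^{\prime}_{k}(X(s))\,ds\right), \tag{1}$$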

where $Y_{k}$ are independent unit Poisson processes, and $\lambda^{\prime}_{k}(X)$ is the propensity function of the $k$th reaction given in Table 1. With these counting processes, we can derive the system of stochastic equations describing the state of $X_{i}(t)$ :

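$$\begin{aligned} X_{S}(t) &= X_{S}(0)+R^{t}_{2}(\lambda^{\prime}_{2}(X))+R^{t}_{4}(\lambda^{\prime}_{4}(X))-R^{t}_{1}(\lambda^{\prime}_{1}(X)),\\ X_{E}(t) &= X_{E}(0)+R^{t}_{2}(\lambda^{\prime}_{2}(X))+R^{t}_{3}(\lambda^{\prime}_{3}(X))-R^{t}_{1}(\lambda^{\prime}_{1}(X)),\\ X_{C}(t) &= X_{C}(0)+R^{t}_{1}(\lambda^{\prime}_{1}(X))-R^{t}_{2}(\lambda^{\prime}_{2}(X))-R^{t}_{3}(\lambda^{\prime}_{3}(X)),\\ X_{P}(t) &= X_{P}(0)+R^{t}_{3}(\lambda^{\prime}_{3}(X))-R^{t}_{4}(\lambda^{\prime}_{4}(X)). \end{aligned} \tag{2}$$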

In this system, the total numbers of molecules of the substrate ( $X_{S_{T}}$ ) and the enzyme ( $X_{E_{T}}$ ) are conserved over time:

$$X_{S_{T}} := X_{S}(t)+X_{C}(t)+X_{P}(t),\qquad X_{E_{T}} := X_{E}(t)+X_{C}(t). \tag{3}$$

In the following subsections, we briefly describe how to derive the reduced system approximating the slow-scale dynamics of (2) with the multiscale approximation method [5, 33, 34].
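For reference, the full system (2) can be simulated directly with Gillespie's stochastic simulation algorithm, which is the baseline that the reduced models are meant to approximate. The sketch below uses the propensities of Table 1 and the rate constants of Table 2; the initial counts, the seed, and the function names are illustrative assumptions, not values taken from the paper.

```python
import random

# Original (unnormalized) rate constants from Table 2, in 1/s.
K1, K2, K3, K4 = 0.017, 0.03, 0.0016, 0.0007

# Net change of (S, E, C, P) caused by each of the four reactions in Table 1.
STOICH = [(-1, -1, 1, 0), (1, 1, -1, 0), (0, 1, -1, 1), (1, 0, 0, -1)]


def propensities(x):
    """Mass-action propensities of Table 1 for the state x = [X_S, X_E, X_C, X_P]."""
    s, e, c, p = x
    return [K1 * s * e, K2 * c, K3 * c, K4 * p]


def gillespie(x0, t_end, seed=0):
    """Simulate one trajectory up to time t_end and return the final state."""
    rng = random.Random(seed)
    x, t = list(x0), 0.0
    while True:
        rates = propensities(x)
        total = sum(rates)
        if total == 0.0:
            return x
        t += rng.expovariate(total)  # waiting time to the next reaction
        if t > t_end:
            return x
        # Choose the reaction with probability proportional to its propensity.
        k = rng.choices(range(4), weights=rates)[0]
        x = [xi + d for xi, d in zip(x, STOICH[k])]


x0 = [5, 10, 10, 10]  # hypothetical initial counts of S, E, C, P
x = gillespie(x0, t_end=1000.0)
```

Whatever the random path, the two conserved totals $X_{S}+X_{C}+X_{P}$ and $X_{E}+X_{C}$ stay at their initial values, which gives a quick sanity check on the stoichiometry.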

2.1 Deriving the normalized system

The first step of the multiscale approximation method is scaling reaction rate constants, species abundances, and time via a common scaling parameter ( $N_{0}$ ) to identify the timescale of each species. Here, we choose the value of the scaling parameter as $N_{0}=10$ to transform the original reaction rate constants ( $\kappa^{\prime}_{i}$ ) to the normalized constants ( $\kappa_{i}$ ) with $\kappa^{\prime}_{i}=N^{\beta_{i}}_{0}\kappa_{i}$ . The scaling exponents ( $\beta_{i}$ ) are chosen so that the normalized reaction rate constants ( $\kappa_{i}$ ) are of order 1 as presented in Table 2.
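The decomposition $\kappa^{\prime}_{i}=N^{\beta_{i}}_{0}\kappa_{i}$ can be sketched numerically. The rule used here, rounding $\log_{N_{0}}\kappa^{\prime}_{i}$ to the nearest integer, is an assumption made for illustration; the paper only requires that $\kappa_{i}$ be of order 1, and this rule happens to reproduce the exponents listed in Table 2.

```python
import math

N0 = 10
# Original rate constants from Table 2, in 1/s.
kappa_prime = {1: 0.017, 2: 0.03, 3: 0.0016, 4: 0.0007}


def normalize(rate, n0=N0):
    """Return (beta, kappa) such that rate = n0**beta * kappa with kappa of order 1."""
    # Assumed rule: nearest-integer rounding of log_{n0}(rate).
    beta = round(math.log(rate, n0))
    return beta, rate / n0 ** beta


scaled = {i: normalize(rate) for i, rate in kappa_prime.items()}
for i, (beta, kappa) in scaled.items():
    print(f"kappa'_{i} = {N0}^({beta}) * {kappa:.2f}")
```

Other conventions (for example, always rounding down) would give a different but equally valid pair for $\kappa^{\prime}_{4}$; the choice matters only insofar as it changes which reactions are treated as being of the same order.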

Similarly, the scaling exponents ( $\alpha_{i}$ ) are chosen so that $X_{i}(t)/N^{\alpha_{i}}_{0}$ becomes of order 1. Since we are interested in the slow-scale dynamics of the system, we determine $\alpha_{i}$ based on the steady state values of the ordinary differential equations, which are the large volume limit (i.e. thermodynamic limit) of the stochastic system [43, 22] (Fig. 1(b)):

$$\alpha_{S}=0,\quad \alpha_{E}=1,\quad \alpha_{C}=1,\quad \alpha_{P}=1. \tag{4}$$

Using these scaling exponents, we define the normalized species abundance at times of order $N_{0}^{3}$ as

$$Z_{i}^{N_{0}}(t) := \frac{X_{i}(N_{0}^{3}t)}{N_{0}^{\alpha_{i}}}, \tag{5}$$

since we are interested in the dynamics at the timescale of order $N_{0}^{3}$ (Fig. 1(b)). Then, we derive the counting processes in terms of the normalized rate constants ( $\kappa_{i}$ ) and the normalized variables ( $Z_{i}^{N_{0}}(t)$ ) on the timescale of order $N_{0}^{3}$ . For instance, the counting process for the first reaction becomes

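$$\begin{aligned} Y_{1}\left(\int_{0}^{N_{0}^{3}t}\lambda^{\prime}_{1}(X(s))\,ds\right) &= Y_{1}\left(\int_{0}^{N_{0}^{3}t}\kappa^{\prime}_{1}X_{S}(s)X_{E}(s)\,ds\right)\\ &= Y_{1}\left(\int_{0}^{t}(N_{0}^{-2}\kappa_{1})Z_{S}^{N_{0}}(u)\,(N_{0}Z_{E}^{N_{0}}(u))\,N_{0}^{3}\,du\right)\\ &=: Y_{1}\left(\int_{0}^{t}N_{0}^{2}\lambda_{1}(Z^{N_{0}}(u))\,du\right), \end{aligned} \tag{6}$$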

where $Z^{N_{0}}$ is the vector whose $i$th component is $Z_{i}^{N_{0}}$ . Here, in the second equality, we apply the change of variable $s=N_{0}^{3}u$ , and in the third equality, we define a normalized propensity function as $\lambda_{1}(Z^{N_{0}})(u):=\kappa_{1}Z^{N_{0}}_{S}(u)Z^{N_{0}}_{E}(u)$ . In a similar way, we derive the counting processes for the other reactions in terms of normalized propensity functions (see Table 3). Since $\lambda_{i}(Z^{N_{0}})$ is of order 1, we can easily read off the order of the counting processes in Table 3. A higher order indicates a faster counting process.
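The order of each counting process follows from simple exponent bookkeeping: the time-change contributes 3, the rate constant contributes $\beta_{k}$ , and each reactant contributes its $\alpha_{i}$ . The sketch below carries out this sum for the four reactions; the variable names are illustrative.

```python
GAMMA = 3                                 # time is rescaled by N0**3
ALPHA = {"S": 0, "E": 1, "C": 1, "P": 1}  # species exponents
BETA = [-2, -2, -3, -3]                   # rate-constant exponents from Table 2
REACTANTS = [("S", "E"), ("C",), ("C",), ("P",)]  # reactants of reactions 1-4


def rate_exponent(k):
    """Power of N0 multiplying the normalized propensity of reaction k (0-based)."""
    return GAMMA + BETA[k] + sum(ALPHA[s] for s in REACTANTS[k])


exponents = [rate_exponent(k) for k in range(4)]
print(exponents)  # → [2, 2, 1, 1]
```

The result matches the powers $N_{0}^{2}, N_{0}^{2}, N_{0}^{1}, N_{0}^{1}$ appearing in Table 3.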

By substituting the counting processes in Table 3 into the original stochastic system (2), we obtain the normalized stochastic system for $Z^{N_{0}}(t)$ . In this normalized system, we now replace the fixed scaling parameter value $N_{0}$ with a varying parameter $N$ to derive a family of vector-valued processes $\{Z^{N}(t)\}$ depending on the parameter $N$ :

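$$\begin{aligned} Z_{S}^{N}(t) &= Z_{S}^{N}(0)+R^{t}_{2}(N^{2}\lambda_{2}(Z^{N}))+R^{t}_{4}(N\lambda_{4}(Z^{N}))-R^{t}_{1}(N^{2}\lambda_{1}(Z^{N})),\\ Z_{E}^{N}(t) &= Z_{E}^{N}(0)+N^{-1}\left(R^{t}_{2}(N^{2}\lambda_{2}(Z^{N}))+R^{t}_{3}(N\lambda_{3}(Z^{N}))-R^{t}_{1}(N^{2}\lambda_{1}(Z^{N}))\right),\\ Z_{C}^{N}(t) &= Z_{C}^{N}(0)+N^{-1}\left(R^{t}_{1}(N^{2}\lambda_{1}(Z^{N}))-R^{t}_{2}(N^{2}\lambda_{2}(Z^{N}))-R^{t}_{3}(N\lambda_{3}(Z^{N}))\right),\\ Z_{P}^{N}(t) &= Z_{P}^{N}(0)+N^{-1}\left(R^{t}_{3}(N\lambda_{3}(Z^{N}))-R^{t}_{4}(N\lambda_{4}(Z^{N}))\right). \end{aligned} \tag{7}$$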

The initial conditions for the family of processes $\{Z^{N}(t)\}$ are defined so that $Z_{i}^{N}(0)\to Z_{i}^{N_{0}}(0)$ as $N\to\infty$ :

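$$\begin{aligned} Z_{S}^{N}(0) &= Z_{S}^{N_{0}}(0)=X_{S}(0),\\ Z_{i}^{N}(0) &= \frac{1}{N}\left\lfloor NZ_{i}^{N_{0}}(0)\right\rfloor=\frac{1}{N}\left\lfloor\frac{N}{N_{0}}X_{i}(0)\right\rfloor,\quad i=E,C,P. \end{aligned} \tag{8}$$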

The floor function ( $\lfloor\cdot\rfloor$ ) is used so that the initial conditions of the unnormalized species $N^{\alpha_{i}}Z_{i}^{N}(0)$ have integer values (see [34] for details). In the following, we find the limit of this family of processes as $N\rightarrow\infty$ and use it to approximate the slow-scale dynamics of the stochastic system given in (2). Note that this approach is analogous to a singular perturbation approach based on Tikhonov’s theorem [57, 37, 24], which reduces multiscale deterministic systems by setting a small scaling parameter to zero in the limit.

2.2 Balance equations

In the family of processes $\{Z^{N}(t)\}$ given in (7), the order of the maximum production rate for species $S$ is $N^{2}$ due to the term $R^{t}_{2}\left(N^{2}\lambda_{2}(Z^{N})\right)$ since $\lambda_{i}(Z^{N})$ is of order 1. The order of the maximum consumption rate is also $N^{2}$ due to $R^{t}_{1}\left(N^{2}\lambda_{1}(Z^{N})\right)$ . That is, the maximum production and consumption rates of species $S$ have the same scaling exponent, $2$ . If the maximum exponent of the production rates is larger than that of the consumption rates, the normalized abundance of the species asymptotically goes to infinity as $N\to\infty$ . In the opposite case, it asymptotically goes to zero in the limit. Thus, when the maximum exponents of production and consumption rates are equal, a condition known as the “balance equation”, the limit of the normalized species can be nondegenerate [33]. When a subset of species does not satisfy the balance equations, their limit will be nondegenerate only for a certain time period, which restricts the choice of the timescale (see [33, 34] for further details). In our example in (7), all species and their linear combinations satisfy the balance equations. We also show that a nondegenerate limit of $\{Z^{N}(t)\}$ exists (see Appendix 2 for details).
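The species-level balance check described above is easy to automate. The following sketch compares, for each species in (7), the largest exponent among its producing reactions with the largest exponent among its consuming reactions. It covers single species only, not the linear combinations mentioned in the text, and the data layout is an illustrative assumption.

```python
# Exponents of N for reactions 1-4, as in Table 3 (0-based indexing below).
RHO = [2, 2, 1, 1]

# Reactions that produce / consume each species, read off from Table 1.
PRODUCED_BY = {"S": [1, 3], "E": [1, 2], "C": [0], "P": [2]}
CONSUMED_BY = {"S": [0], "E": [0], "C": [1, 2], "P": [3]}


def is_balanced(species):
    """True if the maximal production and consumption exponents coincide."""
    max_in = max(RHO[k] for k in PRODUCED_BY[species])
    max_out = max(RHO[k] for k in CONSUMED_BY[species])
    return max_in == max_out


balanced = {sp: is_balanced(sp) for sp in PRODUCED_BY}
print(balanced)
```

For this network every species passes the check, in line with the statement that the balance equations hold for the example in (7).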

2.3 Deriving the average of fast variables and limiting model

For the species $P$ in (7), the maximum scaling exponent of the reaction rates and the scaling exponent of species abundance (i.e. $\alpha_{P}$ ) are both $1$ . This indicates that the number of molecules of $P$ and its change by reactions are of the same order on the current timescale, and therefore the current slow timescale is a natural timescale for $P$ . In other words, $P$ is a slow species in terms of the singular perturbation theory [37]. For the other species, $\alpha_{i}$ is less than the maximum scaling exponent of their reaction rates. Hence, the abundance of these species fluctuates rapidly on the current slow timescale, indicating that they are fast species. Due to the rapid fluctuation, these fast species do not have a functional limit. Instead, they are averaged out in the limit as $N\to\infty$ [46, 5, 34]. We now describe how to derive the average values of fast species in the limit.

Using the two conservation constraints of the system (7):

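\begin{align} Z_{S_{T}}^{N} &:= \frac{1}{N}Z_{S}^{N}(t)+Z_{C}^{N}(t)+Z_{P}^{N}(t), \tag{9}\\ Z_{E_{T}}^{N} &:= Z_{E}^{N}(t)+Z_{C}^{N}(t), \tag{10} \end{align}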

we can simplify (7) as

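\begin{align} Z_{S}^{N}(t) &= Z_{S}^{N}(0)+R^{t}_{2}\left(N^{2}\kappa_{2}Z_{C}^{N}\right)+R^{t}_{4}\left(N\kappa_{4}Z_{P}^{N}\right)-R^{t}_{1}\left(N^{2}\kappa_{1}Z_{S}^{N}Z_{E}^{N}\right), \tag{11}\\ Z_{P}^{N}(t) &= Z_{P}^{N}(0)+N^{-1}\left(R^{t}_{3}\left(N\kappa_{3}Z_{C}^{N}\right)-R^{t}_{4}\left(N\kappa_{4}Z_{P}^{N}\right)\right). \tag{12} \end{align}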

Equations (11)-(12) are closed since $Z_{C}^{N}(t)$ and $Z_{E}^{N}(t)$ are determined by $Z_{S}^{N}(t)$ and $Z_{P}^{N}(t)$ through the conservation constraints in (9-10) as follows:

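\begin{align} Z_{C}^{N}(t) &= Z_{S_{T}}^{N}-\frac{1}{N}Z_{S}^{N}(t)-Z_{P}^{N}(t), \tag{13}\\ Z_{E}^{N}(t) &= Z_{E_{T}}^{N}-Z_{S_{T}}^{N}+\frac{1}{N}Z_{S}^{N}(t)+Z_{P}^{N}(t). \tag{14} \end{align}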

Because the maximum order of the reaction rate ( $N^{2}$ ) in (11) is greater than $N^{\alpha_{S}}=N^{0}$ , species $S$ is rapidly fluctuating and thus its behavior in (12-13) is averaged out as $N\to\infty$ . To derive the averaged value, we use the law of large numbers for the Poisson process:

$$\lim_{N\to\infty}\sup_{x\leq x_{0}}\left|\frac{Y(N^{\alpha}x)}{N^{\alpha}}-x\right|=0, \tag{15}$$

where $\alpha>0,x_{0}>0$ and $Y$ is a unit Poisson process. From (15), it follows that

$$\frac{R^{t}_{1}\left(N^{2}\kappa_{1}Z_{S}^{N}Z_{E}^{N}\right)}{N^{2}}$$

has the same limit as the following integral:

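$$\int_{0}^{t}\kappa_{1}Z_{S}^{N}(u)\left(Z_{E_{T}}^{N}-Z_{S_{T}}^{N}+\frac{1}{N}Z_{S}^{N}(u)+Z_{P}^{N}(u)\right)du.$$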

Applying this result after dividing (11) by $N^{2}$ , we get

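$$\int_{0}^{t}\left(\kappa_{2}Z_{C}^{N}(u)-\kappa_{1}Z_{S}^{N}(u)\left(Z_{E_{T}}^{N}-Z_{S_{T}}^{N}+\frac{1}{N}Z_{S}^{N}(u)+Z_{P}^{N}(u)\right)\right)du\to 0$$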

as $N\to\infty$ since $Z^{N}_{S}(t)/N^{2}$ and $R^{t}_{4}\left(N\kappa_{4}Z_{P}^{N}\right)/N^{2}$ go to zero. As $Z^{N}_{S}(t)/N\to 0$ in the limit, we get

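$$\begin{aligned} &\int_{0}^{t}\left(\kappa_{2}Z_{C}^{N}(u)-\kappa_{1}Z_{S}^{N}(u)\left(Z_{E_{T}}^{N}-Z_{S_{T}}^{N}+Z_{P}^{N}(u)\right)\right)du\\ &\quad=\int_{0}^{t}\left(\kappa_{2}\left(Z_{S_{T}}^{N}-Z_{P}^{N}(u)\right)-\kappa_{1}Z_{S}^{N}(u)\left(Z_{E_{T}}^{N}-Z_{S_{T}}^{N}+Z_{P}^{N}(u)\right)\right)du\to 0. \end{aligned} \tag{16}$$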

Setting the integrand of (16) to zero in the limit and defining $Z_{P}:=\lim_{N\to\infty}Z_{P}^{N}$ , we can derive the averaged value of the fast species ( $\bar{Z}_{S}(t)$ ) in terms of the slow species ( $Z_{P}(t)$ ) in the limit (see Appendix 1 for the detailed derivation):

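$$\bar{Z}_{S}(t)=\frac{\kappa_{2}\left(Z_{S_{T}}-Z_{P}(t)\right)}{\kappa_{1}\left(Z_{E_{T}}-Z_{S_{T}}+Z_{P}(t)\right)}, \tag{17}$$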

where

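\begin{align} Z_{S_{T}} &:= \lim_{N\to\infty}Z_{S_{T}}^{N}=Z_{C}^{N_{0}}(0)+Z_{P}^{N_{0}}(0), \tag{18}\\ Z_{E_{T}} &:= \lim_{N\to\infty}Z_{E_{T}}^{N}=Z_{E}^{N_{0}}(0)+Z_{C}^{N_{0}}(0). \tag{19} \end{align}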

Since $\bar{Z}_{S}(s)/N\to 0$ as $N\to\infty$ , the averaged value of another fast species ( $C$ ) in the limit is also derived from (13) as

$$\bar{Z}_{C}(s)=Z_{S_{T}}-Z_{P}(s). \tag{20}$$

Using this averaged value in the limit and the law of large numbers given in (15), we get the limiting equation of (12):

$$Z_{P}(t)=Z_{P}(0)+\int_{0}^{t}\left(\kappa_{3}\bar{Z}_{C}(s)-\kappa_{4}Z_{P}(s)\right)ds. \tag{21}$$

Note that this reduced system depends solely on $Z_{P}$ since $\bar{Z}_{C}(s)$ is determined by $Z_{P}(s)$ from (20). Following the original multiscale approximation method [5, 34], we use $Z_{P}(t)$ of the limiting model to approximate $X_{P}(t)$ after unnormalizing the species abundance and rescaling back the time as

$$X_{P}(t)\approx N_{0}Z_{P}(N_{0}^{-3}t). \tag{22}$$

The advantage of this approximation is that its error can be estimated using the law of large numbers and the martingale central limit theorem [44, 45, 18, 35]. In our case, we get $X_{P}(t)=N_{0}Z_{P}(N_{0}^{-3}t)+O(N_{0}^{1/2})$ since it is known that $\frac{1}{N_{0}}X_{P}(N_{0}^{3}t)-Z_{P}(t)=O\left(N_{0}^{-1/2}\right)$ [35]. Note that $X^{N}-Z^{N}=O(N^{-\beta})$ for some $\beta>0$ means that $N^{\beta}\left(X^{N}(t)-Z^{N}(t)\right)\Rightarrow U(t)$ as $N\to\infty$ where $U(t)=O(1)$ (stochastically bounded). Here, $\Rightarrow$ indicates convergence in distribution (i.e. weak convergence).
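Because $\bar{Z}_{C}$ is linear in $Z_{P}$ , the limiting model (21) is a linear ODE, $dZ_{P}/dt=\kappa_{3}(Z_{S_{T}}-Z_{P})-\kappa_{4}Z_{P}$ , with an explicit exponential solution. The sketch below evaluates that solution and maps it back to molecule counts via (22). It assumes that the limiting $Z_{S_{T}}$ equals $(X_{C}(0)+X_{P}(0))/N_{0}$ , i.e. that $X_{S}(0)$ drops out in the limit; the function names and initial counts are illustrative.

```python
import math

N0 = 10
KAPPA3, KAPPA4 = 1.6, 0.7  # normalized rates from Table 2


def z_p(t, z_p0, z_st):
    """Closed-form solution of dZ_P/dt = kappa3*(Z_ST - Z_P) - kappa4*Z_P."""
    rate = KAPPA3 + KAPPA4
    z_inf = KAPPA3 * z_st / rate  # steady state of the linear ODE
    return z_inf + (z_p0 - z_inf) * math.exp(-rate * t)


def x_p_approx(t, x_p0, x_c0):
    """Approximate X_P(t) by N0 * Z_P(t / N0**3), following (22)."""
    z_st = (x_c0 + x_p0) / N0  # assumption: X_S(0) is neglected in the limit
    return N0 * z_p(t / N0 ** 3, x_p0 / N0, z_st)


for t in (0.0, 500.0, 5000.0):
    print(t, round(x_p_approx(t, x_p0=10, x_c0=10), 3))
```

The relaxation time in the original time units is $N_{0}^{3}/(\kappa_{3}+\kappa_{4})$ , which is why the slow dynamics play out over times of order $N_{0}^{3}$ .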

However, the approximation (22) obtained from the deterministic limiting model (21) cannot capture the fluctuation of $X_{P}(t)$ . One natural way to resolve this issue is to replace the deterministic reaction terms in (21) by random jump processes with the corresponding propensity functions, which leads to the following stochastic process:

[TABLE]

where

[TABLE]

Note that this stochastic equation is the same as the original one for $Z_{P}^{N_{0}}$ in (12) except for $\mathbb{Z}_{C}(t)$ , which now solely depends on the slow variable $\mathbb{Z}_{P}(t)$ as $\bar{Z}_{C}(s)$ does in (20). Similarly to (22), we can use $\mathbb{Z}_{P}(t)$ in (23) to approximate $X_{P}(t)$ , as $X_{P}(t)\approx N_{0}\mathbb{Z}_{P}(N_{0}^{-3}t)$ .
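A simulation of this reduced stochastic model can be sketched as a two-reaction birth-death process for $X_{P}$ alone. The sketch assumes that, in the original time units and molecule counts, the reduced model amounts to production of $P$ at rate $\kappa^{\prime}_{3}(X_{\mathrm{tot}}-X_{P})$ and conversion at rate $\kappa^{\prime}_{4}X_{P}$ , where $X_{\mathrm{tot}}=X_{C}(0)+X_{P}(0)$ plays the role of $N_{0}Z_{S_{T}}$ . This form is inferred from the description above, and the names, seed, and initial counts are hypothetical.

```python
import random

K3, K4 = 0.0016, 0.0007  # original rate constants from Table 2, in 1/s


def simulate_reduced(x_p0, x_total, t_end, seed=0):
    """Gillespie simulation of the reduced model; returns X_P at time t_end."""
    rng = random.Random(seed)
    x_p, t = x_p0, 0.0
    while True:
        up = K3 * (x_total - x_p)  # C -> P + E, with X_C replaced by x_total - X_P
        down = K4 * x_p            # P -> S
        total = up + down
        if total == 0.0:
            return x_p
        t += rng.expovariate(total)
        if t > t_end:
            return x_p
        # Fire "up" with probability up / total, otherwise "down".
        x_p += 1 if rng.random() * total < up else -1


result = simulate_reduced(x_p0=10, x_total=20, t_end=5000.0)
```

Only the two slow reactions are simulated, so the cost per unit time is far lower than for the full system, where most events are binding and unbinding of $E$ and $S$ .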

In Appendix 3, we show that

[TABLE]

where $W$ is a standard Brownian motion. Importantly, $X_{P}(t)=N_{0}\mathbb{Z}_{P}(N_{0}^{-3}t)+O(1)$ because $\mathbb{E}(t)=O(1)$ , indicating that the new approximation with $N_{0}\mathbb{Z}_{P}(N_{0}^{-3}t)$ is more accurate than the deterministic limit in (22). However, the new approximation with $N_{0}\mathbb{Z}_{P}(N_{0}^{-3}t)$ still contains a considerable error as illustrated in Fig. 2(a). Consistent with our error analysis in (2.3), the numerically estimated errors also increase as $\left|X_{S}(0)-\bar{Z}_{S}(s)\right|$ becomes larger, given that $\bar{Z}_{S}(s)\approx 2$ (Fig. 2(b) and (c)).

The dependence of errors on $\left|X_{S}(0)-\bar{Z}_{S}(s)\right|$ indicates that the error seen in Fig. 2 mainly stems from neglecting the species $S$ in the approximating process. Specifically, the initial condition of species $S$ , $X_{S}(0)$ , is ignored in the limiting total conserved quantity ( $Z_{S_{T}}$ ) of (18) due to the fact that the scaling exponent of $S$ ( $\alpha_{S}$ ) is smaller than other scaling exponents in the conservation constraint (9). For the same reason, $\bar{Z}_{S}(s)$ is also neglected in the limit of the conservation constraint (20). Since $\bar{Z}_{C}(s)$ in (20) is used to derive (24), $S$ is also neglected in the reduced model (23-24). Therefore, as $X_{S}(0)$ takes a larger portion of $X_{S_{T}}$ in (3), ignoring $X_{S}(0)$ in deriving $Z_{S_{T}}$ causes a larger error as seen in Fig. 2(b) and (c).

Note that we used a single scaling exponent for the abundance of $S$ (i.e. $\alpha_{S}=0$ ) for simplicity, even when the order of magnitude of its abundance changes in time. In such a case, $\alpha_{S}$ is supposed to be adjusted over time as suggested in the original multiscale approximation method [33, 34]. Specifically, when $X_{S}(0)=O(N_{0})$ as in the case of Fig. 2(c), it is suggested to use $\alpha_{S}=1$ for the initial transient period and $\alpha_{S}=0$ at later times. However, with such multiple choices of $\alpha_{S}$ in time, the approximation process becomes complex since different reduced models are derived for different time periods and combining their numerical simulations is difficult.

3 Modified multiscale stochastic approximation method

In order to correct the approximation errors seen in Fig. 2, we introduce a modified conservation law of the normalized variables:

[TABLE]

Note that $\frac{1}{N}Z^{N}_{S}(t)$ in (9) is replaced by $\frac{1}{N_{0}}Z^{N}_{S}(t)$ to prevent approximating $Z^{N}_{S}$ as 0 in the conservation law when $N\rightarrow\infty$ . The limit of the newly derived total conserved quantity among the normalized species is

[TABLE]

In contrast to $Z_{S_{T}}$ in (18), $\mathcal{Z}_{S_{T}}$ does not depend on the fraction of $X_{S}(0)$ in $X_{S}(0)+X_{C}(0)+X_{P}(0)$ as long as the total amount of the substrate, $X_{S_{T}}$ , is fixed. In this sense, $\mathcal{Z}_{S_{T}}$ is a more natural conservation constant than $Z_{S_{T}}$ . By substituting the new conservation constraint into (11-14), we define a new family of stochastic processes:

[TABLE]

Though this new family of processes is different from the one in (11-14), we will use the same notation ( $Z^{N}_{i}(t)$ ) for simplicity. Since (28-31) is equivalent to the original normalized system in (7) when $N=N_{0}$ , the new family of processes includes the original system. Thus, the limiting model of (28-31) can be used to approximate the original system. To derive the limiting model, we divide (28) by $N^{2}$ and let $N\rightarrow\infty$ to get $\int_{0}^{t}\left(\kappa_{2}Z^{N}_{C}(s)+\frac{1}{N}\kappa_{4}Z_{P}^{N}(s)-\kappa_{1}Z^{N}_{S}(s)Z^{N}_{E}(s)\right)\ ds\rightarrow 0$ in the same way as described in the previous section. As $\frac{1}{N}\kappa_{4}Z_{P}^{N}(s)\to 0$ , we get $\int_{0}^{t}\left(\kappa_{2}Z^{N}_{C}(s)-\kappa_{1}Z^{N}_{S}(s)Z^{N}_{E}(s)\right)\ ds\rightarrow 0$ . Substituting (30-31) in the equation, we get

[TABLE]

as $N\rightarrow\infty$ . Setting the integrand to zero in the limit, we get the following approximation of the averaged value of fast species ( $\overline{Z}_{C}$ ) with respect to the slow species $Z_{P}:=\lim_{N\to\infty}Z_{P}^{N}$ :

[TABLE]

where $K_{d}=\frac{\kappa_{2}}{\kappa_{1}}$ (see Appendix 1 for the detailed derivation). Using (3) and the law of large numbers in (15), and letting $N\to\infty$ in (29), we get a limiting model for the slow species $P$ :

[TABLE]

We convert this deterministic limiting model to the stochastic process as in the previous section:

[TABLE]

where

[TABLE]

Note that in this new approximation, $\mathcal{Z}_{C}(t)$ is determined by $\mathcal{Z}_{P}(t)$ differently from the previous approximation in (23-24). We again use $N_{0}\mathcal{Z}_{P}(N_{0}^{-3}t)$ to approximate $X_{P}(t)$ of the original model, which is accurate as seen in Fig. 3(a). Furthermore, the new approximation is accurate regardless of the initial condition of $S$ (Fig. 3(b) and (c)) in contrast to the previous approximation (Fig. 2).
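To make the difference between the two approximations concrete, the sketch below computes an averaged complex level under the modified conservation law. It assumes that, with $S$ retained in the substrate conservation, the balance $\kappa_{2}X_{C}=\kappa_{1}X_{S}X_{E}$ with $X_{S}=X_{S_{T}}-X_{C}-X_{P}$ and $X_{E}=X_{E_{T}}-X_{C}$ determines $X_{C}$ as the smaller root of a quadratic. This is a reconstruction from the derivation described above, written in molecule counts, and not necessarily the exact expression used in the paper; the input values are hypothetical.

```python
import math

KAPPA1, KAPPA2 = 1.7, 3.0  # normalized rates from Table 2
K_D = KAPPA2 / KAPPA1      # dissociation constant K_d = kappa2 / kappa1


def averaged_complex(x_p, x_st, x_et):
    """Smaller root of X_C**2 - (a + x_et + K_D) * X_C + a * x_et = 0, a = x_st - x_p."""
    a = x_st - x_p  # substrate not yet converted to product
    b = a + x_et + K_D
    return (b - math.sqrt(b * b - 4.0 * a * x_et)) / 2.0


x_p, x_st, x_et = 10, 25, 20  # hypothetical counts
x_c = averaged_complex(x_p, x_st, x_et)
x_s = x_st - x_p - x_c        # free substrate implied by conservation
x_e = x_et - x_c              # free enzyme implied by conservation
print(round(x_c, 3), round(x_s, 3), round(x_e, 3))
```

By contrast, the earlier approximation (20) sets the complex to the conserved total minus $X_{P}$ and leaves no room for free substrate, which is the source of the error that grows with $X_{S}(0)$ .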

To investigate the accuracy of the new approximation, we perform the error analysis and obtain the following:

[TABLE]

where $W$ is a standard Brownian motion (see Appendix 4 for detailed analysis). In particular, since $\mathcal{E}(0)=0$ and the diffusion and drift terms are proportional to $\mathcal{E}(s)$ , it follows that $\mathcal{E}(t)=0$ and thus $X_{P}(t)=N_{0}\mathcal{Z}_{P}(N_{0}^{-3}t)+o(1)$ , which shows the accuracy of the newly reduced model in (35-3). Note that $X^{N}=Z^{N}+o\left(N^{-\beta}\right)$ for some $\beta>0$ means that $N^{\beta}\left(X^{N}(t)-Z^{N}(t)\right)\Rightarrow 0$ as $N\to\infty$ , where $\Rightarrow$ indicates convergence in distribution (i.e. weak convergence).

4 Multiscale approximation for a genetic oscillatory system

In the previous section, we proposed a modified multiscale approximation method that leads to an accurate approximation for the stochastic system with a single steady state. In this section, we apply the same idea to the transcriptional negative feedback loop system, which generates oscillations (Fig. 4 (a)) [39, 40, 42, 38]. This system consists of 9 reactions as described in Table 4: the transcription of mRNA ( $M$ ) occurs at a rate proportional to active DNA ( $D_{A}$ ) and then $M$ is translated into protein ( $P$ ), which promotes the production of the repressor ( $R$ ). The repressor reversibly binds with $D_{A}$ to form repressed DNA complex ( $D_{R}$ ). Furthermore, $M$ , $P$ , and $R$ degrade. This model is described with the following set of stochastic equations:

[TABLE]

Note that the total number of DNA ( $X_{D_{T}}$ ) is conserved

[TABLE]
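For readers who want to experiment with the full model, the following is a minimal sketch of a Gillespie-type stochastic simulation of the feedback loop described above. The rate constants are placeholders and not the values of Table 4. The text lists eight reaction types explicitly; the ninth reaction used here (degradation of bound repressor, releasing active DNA) is an assumption inferred from the $\kappa_{7}$ term that appears later in the equation for $T$. All names are hypothetical.

```python
import random

# Species order in every state tuple: (M, P, R, D_A, D_R).
CHANGES = [
    (1, 0, 0, 0, 0),    # transcription:        D_A -> D_A + M
    (0, 1, 0, 0, 0),    # translation:          M -> M + P
    (0, 0, 1, 0, 0),    # repressor production: P -> P + R
    (-1, 0, 0, 0, 0),   # M degradation
    (0, -1, 0, 0, 0),   # P degradation
    (0, 0, -1, 0, 0),   # free R degradation
    (0, 0, 0, 1, -1),   # bound R degradation:  D_R -> D_A (assumed ninth reaction)
    (0, 0, -1, -1, 1),  # binding:              D_A + R -> D_R
    (0, 0, 1, 1, -1),   # unbinding:            D_R -> D_A + R
]

# Placeholder rate constants, listed in the same order as CHANGES.
DEMO_RATES = [10.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

def propensities(x, k):
    m, p, r, da, dr = x
    return [k[0] * da, k[1] * m, k[2] * p, k[3] * m, k[4] * p,
            k[5] * r, k[6] * dr, k[7] * da * r, k[8] * dr]

def gillespie(x0, k, t_end, rng):
    """Return the jump times and states of one sample path on [0, t_end]."""
    t, x = 0.0, tuple(x0)
    path = [(t, x)]
    while True:
        a = propensities(x, k)
        a_total = sum(a)
        if a_total == 0.0:
            break                       # no reaction can fire any more
        t += rng.expovariate(a_total)   # exponential waiting time to next event
        if t > t_end:
            break
        # Choose reaction j with probability a[j] / a_total.
        u, j, acc = rng.random() * a_total, 0, a[0]
        while acc <= u and j < len(a) - 1:
            j += 1
            acc += a[j]
        x = tuple(xi + ci for xi, ci in zip(x, CHANGES[j]))
        path.append((t, x))
    return path

if __name__ == "__main__":
    path = gillespie((0, 0, 0, 10, 0), DEMO_RATES, 2.0, random.Random(1))
    print(len(path), path[-1])
```

Every reaction in `CHANGES` leaves the sum of the last two components unchanged, which is the conservation of total DNA stated above.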

To derive the normalized system of (39), we scaled reaction rate constants with $N_{0}=10$ : $\kappa^{\prime}_{1}=N^{1}_{0}\kappa_{1}$ , $\kappa^{\prime}_{i}=N^{2}_{0}\kappa_{i}$ for $i=8$ and $9$ , and $\kappa^{\prime}_{i}=N^{0}_{0}\kappa_{i}$ for others as seen in Table 4. According to the simulations of the deterministic system, which is the large volume limit of (39), the scaling exponents of the molecular abundance ( $\alpha_{i}$ ) can be chosen as $1$ for $X_{D_{A}}$ and $X_{R}$ and $2$ for other species (Fig. 4 (b)). Using $\alpha_{i}$ , we define the normalized species abundance at the times of order $N_{0}^{0}$ as $Z_{i}^{N_{0}}(t):=X_{i}(t)/N_{0}^{\alpha_{i}}$ .
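The scaling just described can be written out as a short sketch. It uses only the choices stated above ( $N_{0}=10$ , exponent $1$ for $D_{A}$ and $R$ , exponent $2$ for the other species, and rate exponents $1$ for $i=1$ , $2$ for $i=8,9$ , and $0$ otherwise); the dictionary keys and function names are hypothetical.

```python
N0 = 10  # value of the scaling parameter used in the text

# Abundance exponents alpha_i: 1 for D_A and R, 2 for the other species.
ALPHA = {"M": 2, "P": 2, "R": 1, "DA": 1, "DR": 2}

# Exponents of N0 relating original and normalized rate constants:
# kappa'_1 = N0^1 kappa_1, kappa'_i = N0^2 kappa_i for i = 8, 9, kappa'_i = kappa_i otherwise.
RATE_EXPONENT = {1: 1, 8: 2, 9: 2}

def normalize(counts, n0=N0):
    """Map copy numbers X_i to normalized abundances Z_i = X_i / N0^alpha_i."""
    return {s: counts[s] / n0 ** ALPHA[s] for s in counts}

def original_rate(i, kappa, n0=N0):
    """Recover the original rate constant kappa'_i from the normalized kappa_i."""
    return kappa * n0 ** RATE_EXPONENT.get(i, 0)

if __name__ == "__main__":
    print(normalize({"M": 250, "P": 400, "R": 30, "DA": 5, "DR": 50}))
    print([original_rate(i, 1.0) for i in range(1, 10)])
```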

Using the normalized species ( $Z_{i}^{N_{0}}(t)$ ) and the normalized reaction rate constants ( $\kappa_{i}$ ), we derive the normalized propensity functions ( $\lambda_{i}(Z^{N_{0}})$ ), which are of order 1 as described in Table 4. After replacing the original propensity functions in (39) by the normalized ones, we replace $N_{0}$ with $N$ and obtain a family of vector-valued processes $\{Z^{N}(t)\}$ satisfying

[TABLE]

Initial conditions ( $Z^{N}_{i}(0)$ ) are defined as done in the previous section (8). For all species, the exponents of the maximum production and consumption rates are the same (i.e. balance equations are satisfied), justifying our choice of the timescale. Note that in the above system the normalized total DNA, $Z^{N}_{D_{A}}(t)/N+Z^{N}_{D_{R}}(t)$ , is conserved. In the limit of this conserved relation, $Z^{N}_{D_{A}}(t)/N$ is neglected, and thus all DNA is in the repressed state in the limit. As a result, the reduced model obtained with the original multiscale approximation method reaches a steady state rather than oscillating. This example again indicates that the limiting model derived using the original method does not accurately approximate the full model when the system has a conservation law among species with disparate scales of molecular abundances. Thus, the modified conservation constraint as described in Section 3 is used as

[TABLE]

and the limit of $Z_{D_{T}}^{N}$ as $N\to\infty$ is defined as

[TABLE]

Using this modified conservation constraint, we define a new family of stochastic processes, using the same notation ( $Z^{N}_{i}(t)$ ) for simplicity:

[TABLE]

Because the maximum scaling exponents of the reaction rates of species $R$ and $D_{R}$ are greater than the scaling exponents of molecular abundance ( $\alpha_{i}$ ), $R$ and $D_{R}$ fluctuate rapidly and are averaged out. To derive the average values of these fast variables, we divide (4) by $N^{2}$ and use the law of large numbers for the Poisson process in (15) to get

[TABLE]

as $N\rightarrow\infty$ . Note that (46) consists of only the fast variables $Z_{R}$ and $Z_{D_{R}}$ and thus, we cannot use (46) to derive the limiting average of the fast variables with respect to the slow variables. To circumvent this problem, we introduce the auxiliary species $T=R+D_{R}$ , as suggested by the original multiscale approximation method [33, 34]. Since the abundance of $T$ has the same order as $D_{R}$ , we get

[TABLE]

so that $Z^{N}_{T}(t)$ is of order 1. We now derive the equation for $Z^{N}_{T}(t)$ using (4)-(4):

[TABLE]

Note that $\kappa_{6}=\kappa_{7}=1$ , so we can define $\kappa_{10}:=\kappa_{6}=\kappa_{7}$ and combine the two reaction terms using the superposition principle of Poisson processes [15]. The process for $Z^{N}_{T}(t)$ satisfies the balance equation, and $Z^{N}_{T}(t)$ is a slow variable because the maximum scaling exponent of the reaction rates and the scaling exponent for the species abundance are both equal to $2$ . We substitute (47) into (46) and get

[TABLE]

as $N\rightarrow\infty$ . Setting the integrand to zero in the limit, we derive the averaged value of the fast species ( $\bar{Z}_{D_{R}}$ ) in terms of the slow species in the limit ( $Z_{T}(t):=\lim_{N\to\infty}Z^{N}_{T}(t)$ ):

[TABLE]

which is equivalent to the limit of (47). Combining (50) with (45) yields the averaged value of the fast species ( $\bar{Z}_{D_{A}}$ ):

[TABLE]
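The superposition principle invoked above when the two degradation terms of $T$ were combined can be checked empirically. The sketch below counts events of two independent Poisson processes with hypothetical rates and compares the sample mean and variance of the merged count with the value expected for a single Poisson process whose rate is the sum. It is an illustration only, not part of the derivation.

```python
import random

def count_events(rate, t_end, rng):
    """Number of events of a Poisson process with the given positive rate on [0, t_end]."""
    t, n = rng.expovariate(rate), 0
    while t <= t_end:
        n += 1
        t += rng.expovariate(rate)
    return n

def merged_count(rate_a, rate_b, t_end, rng):
    """Events of two independent Poisson processes, counted together."""
    return count_events(rate_a, t_end, rng) + count_events(rate_b, t_end, rng)

def mean_and_variance(samples):
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)
    return mean, var

if __name__ == "__main__":
    rng = random.Random(0)
    samples = [merged_count(3.0, 7.0, 1.0, rng) for _ in range(2000)]
    # For a Poisson count with rate 3 + 7 on [0, 1], mean and variance are both 10.
    print(mean_and_variance(samples))
```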

Using $\bar{Z}_{D_{A}}(t)$ and the law of large numbers for the Poisson process, we get the limiting model for the slow species. Because the limiting model is deterministic, we convert it to a stochastic system as we did in the previous section:

[TABLE]

Note that $\bar{\mathbb{Z}}_{D_{A}}(t)$ is derived from (51). In Fig. 5, we used $\mathbb{Z}_{M}(t)$ to approximate $X_{M}(t)$ as $X_{M}(t)\approx N_{0}^{2}\mathbb{Z}_{M}(t)$ , but as seen from the plots, this approximation is inaccurate. In particular, the reduced model does not generate oscillations with a specific frequency, in contrast to the full model (Fig. 5(b)).

We wondered whether the inaccuracy of the reduced model (52-55) stems from the fact that we simply fixed the scaling exponents ( $\alpha_{i}=1$ ) for $R$ and $D_{A}$ throughout the oscillation, even though their abundances change between $N_{0}^{0}$ and $N_{0}$ (Fig. 4). That is, as $\alpha_{i}$ of $R$ and $D_{A}$ change throughout the oscillation, it might not be appropriate to fix the order of $\lambda^{\prime}_{8}=\kappa^{\prime}_{8}X_{D_{A}}X_{R}$ as $N_{0}^{4}$ in Table 4, which is used to derive the equation for the average of fast species (49). However, we find that although the orders of $X_{D_{A}}$ and $X_{R}$ change, $\kappa^{\prime}_{8}X_{D_{A}}X_{R}=O(N^{4}_{0})$ throughout the oscillation. Thus our choice of fixed scaling exponents ( $\alpha_{i}$ ) for $R$ and $D_{A}$ is not the reason for the inaccuracy of the average of fast species (50), and hence of the reduced model seen in Fig. 5.

Instead, we find that the inaccurate approximation of the averaged value of the fast species in (55) is due to the fact that the slow auxiliary species ( $T$ ) consists of fast species with disparate abundance scales, and thus a fast species ( $R$ ) with a low scale of abundance is neglected in the limit. Specifically, $\bar{Z}_{D_{R}}(t)=Z_{T}(t)$ in (50) is equivalent to approximating $N^{-1}Z^{N}_{R}(t)$ by $0$ in $Z^{N}_{T}(t)=Z^{N}_{D_{R}}(t)+N^{-1}Z^{N}_{R}(t)$ as $N\to\infty$ . Since $\bar{Z}_{D_{R}}(t)=Z_{T}(t)$ is used to derive $\bar{Z}_{D_{A}}(t)$ in (51) and hence $\mathbb{Z}_{D_{A}}(t)$ in (55), $R$ is also neglected in the reduced system given in (52-55), which leads to the apparent errors seen in Fig. 5.

To resolve this problem, we adopt a similar idea to the one used in the previous section because a slow variable, $Z^{N}_{T}(t)$ , is considered as a constant on fast timescale and thus (47) can be considered as a conservation law on fast timescale. We re-define $Z^{N}_{T}$ as

[TABLE]

which prevents the elimination of $Z^{N}_{R}$ as $N\to\infty$ . Though (56) is different from (47), we keep using the notation $Z^{N}_{T}(t)$ for simplicity. With this new definition, we get the modified relation of (49):

[TABLE]

as $N\rightarrow\infty$ . Setting the integrand to zero in the limit, we get the approximation for the averaged limiting value of $Z_{D_{R}}$ as

[TABLE]

where $K_{d}=\kappa_{9}/\kappa_{8}$ . Using (45), we get

[TABLE]

By using the approximate averaged value ( $\bar{Z}_{D_{A}}$ ) and the law of large numbers, we obtain the modified limiting model for the slow species. Since the limiting model is deterministic, as before, we convert it to the following stochastic system:

[TABLE]

Note that this newly derived reduced system is the same as the one in (52-55) except for (61). We used $\mathcal{Z}_{M}(t)$ to approximate $X_{M}(t)$ as $X_{M}(t)\approx N_{0}^{2}\mathcal{Z}_{M}(t)$ . As seen from the simulation (Fig. 6), the reduced model accurately approximates the original full model.

We can often obtain slow auxiliary variables by combining fast variables because fast reactions can cancel each other, as seen in (48). These newly derived slow variables play a critical role in deriving the reduced models in the multiscale stochastic approximation method [11, 16, 34]. If the slow normalized auxiliary species are derived as proposed in the original method (47), the constituent fast species of the auxiliary species are ignored in the limit if their scales of abundances ( $\alpha_{i}$ ) are smaller than those of other constituent fast species. This leads to considerable errors as seen in Fig. 5. On the other hand, our modification of the auxiliary variables given in (56) prevents the fast species with small abundance from being neglected in the limit and leads to a more accurate approximation as shown in Fig. 6.
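The contrast between the two definitions of the auxiliary variable can be seen in a few lines. The sketch below assumes that the modified definition weights the repressor term by $N_{0}^{-1}$ instead of $N^{-1}$ , as the text describes; the numerical values of $Z_{D_{R}}$ and $Z_{R}$ are hypothetical.

```python
N0 = 10  # fixed value of the scaling parameter used in the text

def z_t_original(z_dr, z_r, n):
    """Auxiliary variable with the repressor term weighted by 1/N."""
    return z_dr + z_r / n

def z_t_modified(z_dr, z_r, n0=N0):
    """Auxiliary variable with N frozen at N0 in the repressor term."""
    return z_dr + z_r / n0

if __name__ == "__main__":
    z_dr, z_r = 0.8, 3.0  # hypothetical normalized abundances
    for n in (10, 100, 1000, 10**6):
        # The original value drifts toward z_dr as n grows; the modified one does not.
        print(n, z_t_original(z_dr, z_r, n), z_t_modified(z_dr, z_r))
```

At $N=N_{0}$ the two definitions coincide, which is why both families of processes contain the original system; they differ only in what survives as $N\to\infty$ .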

5 Conclusion

Cells consist of diverse species whose abundances are on disparate scales. For instance, the concentrations of metabolites vary more than $10^{6}$ fold in E. coli: the concentrations of glutamate and adenosine are about $10^{2}\mu M$ and $10^{-4}\mu M$ , respectively [7]. Thus, biochemical reaction networks often have conservation laws involving species with disparate abundance scales. Furthermore, the combination of fast species with disparate abundance scales can also form virtual slow auxiliary species that evolve slowly due to the cancelation of the fast reactions. In such cases, with the original multiscale approximation method, the constituent species with low abundance are ignored in the conservation constraint or in the auxiliary species of limiting models as shown in (18) or (50). Therefore, the original multiscale approximation method [5, 34] can lead to errors in the limiting models as seen in our examples (Fig. 2 and Fig. 5). To address this problem, we proposed here to replace the scaling parameter $N$ by the fixed value $N_{0}$ in the conservation constraints and auxiliary variables as we did in (27) and (56). Using these modified conservation constraints (or auxiliary variables), we redefined the family of the normalized stochastic processes in such a way that its limit provides accurate approximations for the full stochastic systems of the Michaelis-Menten kinetics (Fig. 3) and the genetic oscillator (Fig. 6). This indicates that our modified method is applicable to a broader class of multiscale stochastic biochemical reaction networks than the original method.

When the abundances of species evolve across multiple scales over time, the original multiscale approximation method may require time-dependent scaling exponents $\alpha_{i}$ and thus lead to different reduced models over time [33]. In this case, the approximation process becomes complex as it requires combining different reduced models over time. On the other hand, our modified multiscale approximation method using the fixed $\alpha_{i}$ produces an accurate approximation in our example although some species abundances change over time (Fig. 3(c)). An interesting direction for future work is to determine whether our modified method is applicable to general systems where the scales of species abundances change over time.

Interestingly, the reduced models obtained using our methods coincide with those derived with the stochastic total quasi-steady state approximation (total QSSA) approach [6, 49, 40, 41]. Therefore, the error analysis used in our work can also be applied to validate the accuracy of the stochastic total QSSA, which until now has been investigated mostly numerically. Another interesting direction would be to extend our method to approximate stochastic reaction-diffusion systems [31, 17, 36, 30, 52].

Appendix 1. Derivation of the spatial averages of fast species in Section 2 and Section 3

From the original full model described in (11-12), we derive a scaled generator of $z=(z_{S},z_{P})$ as

[TABLE]

Define an occupational random measure of $Z_{S}^{N}$ as

[TABLE]

in the space of measures $\nu$ on $\mathbb{Z}^{+}\times[0,\infty)$ such that $\nu(\mathbb{Z}^{+}\times[0,t])=t$ , where $\mathbb{Z}^{+}$ is the set of natural numbers together with zero. Denote the space of measures as $\mathcal{L}\equiv\mathcal{L}(\mathbb{Z}^{+})$ .

Setting $f(z)=z_{S}$ in $(\ref{gen})$ , we define a martingale

[TABLE]

$\{Z_{P}^{N}\}$ and $\{\Gamma^{N}\}$ are relatively compact in $D_{\mathbb{R}^{+}}([0,\infty))$ and $\mathcal{L}$ , respectively, where $D_{\mathbb{R}^{+}}([0,\infty))$ is the space of cadlag functions with $\mathbb{R}^{+}$ values and $\mathcal{L}$ is the space of measures (see Appendix 2). Therefore, we can let $(Z_{P},\Gamma)$ be a limit point of $\{(Z_{P}^{N},\Gamma^{N})\}$ in $D_{\mathbb{R}^{+}}([0,\infty))\times\mathcal{L}$ . Using Lemma 1.5 in [46],

[TABLE]

converges in distribution to

[TABLE]

After dividing $(\ref{averaging})$ by $N^{2}$ and letting $N$ go to infinity, the above term (65) becomes zero for all $t>0$ . Using Lemma 1.4 in [46], there exists $\mu_{(\cdot)}$ such that $\Gamma(dz_{S}\times ds)=\mu_{Z_{P}(s)}(dz_{S})\,ds$ , and we get

[TABLE]

with probability one.

Then, the average of fast species ( $\bar{Z}_{S}$ ) is expressed in terms of the slow species ( $Z_{P}$ ) as

[TABLE]

which is given in the main text (17). Note that $\mu_{Z_{P}(s)}$ is a local-averaging distribution and the Poisson distribution with mean $\bar{Z}_{S}(s)$ because the limit of $A_{N}f(z)/N^{2}$ in (63) is the infinitesimal generator of the Poisson process. For more details of conditions for averaging, please see Section 5 in [34] and [5].

Next, to derive the approximate averaged value of the fast species (3) of Section 3, we replace $\frac{1}{N}z_{S}$ by $\frac{1}{N_{0}}z_{S}$ and $Z_{S_{T}}^{N}$ by $\mathcal{Z}_{S_{T}}^{N}$ in the derivation above and construct a new martingale corresponding to $Z_{S}^{N}$ in (28)

[TABLE]

where $\Gamma^{N}$ is an occupation measure of $Z_{S}^{N}$ . $\left\{Z_{P}^{N}\right\}$ and $\left\{\Gamma^{N}\right\}$ are relatively compact, since $Z_{P}^{N}$ and $Z_{S}^{N}$ are bounded by $\mathcal{Z}_{S_{T}}^{N}\leq\mathcal{Z}_{S_{T}}$ and $N_{0}\mathcal{Z}_{S_{T}}^{N}\leq N_{0}\mathcal{Z}_{S_{T}}$ as seen in (27), respectively. Dividing $(\ref{aver_hat_app})$ by $N^{2}$ and taking a limit, we get

[TABLE]

as we derived (66). Differentiating with respect to $t$ and replacing the time variable by $s$ , the rewritten equation becomes

[TABLE]

where $K_{d}=\frac{\kappa_{2}}{\kappa_{1}}$ .

We derive an approximate averaged value for $Z_{S}^{N}$ in the limit:

[TABLE]

by assuming $\int_{\mathbb{Z}^{+}}z_{S}^{2}\,{\mu}_{Z_{P}(s)}(dz_{S})\approx(\int_{\mathbb{Z}^{+}}z_{S}\,{\mu}_{Z_{P}(s)}(dz_{S}))^{2}$ in the limit. In Appendix 4, we will show that this assumption does not cause any error up to the order of magnitude we are interested in.

Appendix 2. Relative compactness of $\{Z_{P}^{N}\}$ and $\{\Gamma^{N}\}$

Here, we will show that $\{Z_{P}^{N}\}$ and $\{\Gamma^{N}\}$ in Appendix 1 are relatively compact in $D_{\mathbb{R}^{+}}([0,\infty))$ and $\mathcal{L}$ , respectively, where $D_{\mathbb{R}^{+}}([0,\infty))$ is the space of cadlag functions with $\mathbb{R}^{+}$ values and $\mathcal{L}$ is the space of measures. Since $Z_{P}^{N}(t)\leq Z_{S_{T}}^{N}$ and $Z_{S_{T}}^{N}\to Z_{S_{T}}$ as $N\to\infty$ , $Z_{P}^{N}(t)$ is bounded for all $t\in[0,\infty)$ , and thus $\{Z_{P}^{N}(t)\}$ is relatively compact. We will show that for $t\in[0,\infty)$ and for fixed $\delta>0$ , there exists $r$ such that

[TABLE]

Since $\int_{0}^{t}1_{[r,\infty)}\left(Z_{S}^{N}(s)\right)\,ds\leq\int_{0}^{t}\frac{Z_{S}^{N}(s)}{r}\,ds$ , we will show that we can set $P\left(\int_{0}^{t}\frac{Z_{S}^{N}(s)}{r}\,ds>\delta\right)$ small enough by choosing an appropriate value for $r$ . We have

[TABLE]
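The pointwise bound $1_{[r,\infty)}(z)\leq z/r$ for $z\geq 0$ , used above to control the time that $Z_{S}^{N}$ spends above level $r$ , can be checked on a discretized path. The sketch below uses a hypothetical piecewise-constant path with a fixed step `dt`; it illustrates the inequality only and plays no role in the proof.

```python
def occupation_above(path, dt, r):
    """Time a piecewise-constant, nonnegative path spends at or above level r."""
    return sum(dt for z in path if z >= r)

def scaled_integral(path, dt, r):
    """Integral of the path divided by r, which dominates the occupation time."""
    return sum(z * dt for z in path) / r

if __name__ == "__main__":
    path = [0, 1, 2, 3, 5, 0, 4]   # hypothetical values of Z_S on consecutive steps
    for r in (2, 4, 8):
        # Raising r shrinks the upper bound like 1/r.
        print(r, occupation_above(path, 0.5, r), scaled_integral(path, 0.5, r))
```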

If $Z_{E}^{N}(0)\neq 0$ and $E\left[\int_{0}^{t}Z_{S}^{N}(s)Z_{E}^{N}(s)\,ds\right]<\infty$ , we can set $\eta$ small enough and $r$ large enough so that both probabilities on the right-hand side become small. Then $Z_{S}^{N}(t)$ is stochastically bounded for $t\in[0,\infty)$ , and by Lemma 1.1 in [46] $\{\Gamma^{N}\}$ is relatively compact. Now, we will show that $E\left[\int_{0}^{t}Z_{S}^{N}(s)Z_{E}^{N}(s)\,ds\right]<\infty$ . Taking the expectation on both sides of the equation for $Z_{C}^{N}(t)$ in (7) and rearranging terms, we have

[TABLE]

The right-hand side is bounded since for all $t$ , $Z_{C}^{N}(t)\leq Z_{E_{T}}^{N}$ and this converges to $Z_{E_{T}}<\infty$ as $N\to\infty$ . Note that we showed relative compactness of $\{\Gamma^{N}\}$ when $Z_{E}^{N}(0)\neq 0$ . If $Z_{E}^{N}(0)=0$ , we need the additional assumption that $Z_{S}^{N}(t)$ is stochastically bounded for all $t\in[0,\infty)$ .

Appendix 3. Error analysis for $\mathbb{Z}_{P}$ in Section 2

To analyze the error of the process $\mathbb{Z}_{P}$ of (23) in approximating $Z_{P}^{N_{0}}$ of the full model in (12) with $N=N_{0}$ , we use the technique developed in [2]. To this end, we derive a family of processes $\mathbb{Z}_{P}^{N}$ by replacing $N_{0}$ in (23) with the parameter $N$ as:

[TABLE]

where

[TABLE]

We define $\mathbb{Z}_{S_{T}}^{N}\equiv Z_{C}^{N}(0)+Z_{P}^{N}(0)$ so that $\mathbb{Z}_{S_{T}}^{N_{0}}=Z_{S_{T}}$ . In this way, (69-70) with $N=N_{0}$ become equivalent to the approximate model in (23-24). Furthermore, $\mathbb{Z}_{C}^{N}(t)\to\bar{Z}_{C}(t)$ as $N\to\infty$ so that $\mathbb{Z}_{P}^{N}$ in (69) and $Z_{P}^{N}$ in (12) of the full model have the same limit $Z_{P}$ in $(\ref{plim})$ . Since $Z_{P}^{N}(t)-\mathbb{Z}_{P}^{N}(t)\to 0$ , we define an error between $Z_{P}^{N}$ and $\mathbb{Z}_{P}^{N}$ as

[TABLE]

to get the asymptotic behavior of the error between $Z_{P}^{N}$ and $\mathbb{Z}_{P}^{N}$ of order $N^{-1}$ . To find an approximate value of $\mathbb{E}^{N_{0}}(t)$ , we derive the limiting behavior of $\mathbb{E}^{N}$ as $N\to\infty$ . We rewrite the reaction terms for $Z_{P}^{N}$ in $(\ref{znp_reduced})$ as the following process, which has the same probability distribution as that in $(\ref{znp_reduced})$ :

[TABLE]

where $A\wedge B\equiv\min\left(A,B\right)$ . Similarly, we rewrite the equation for $\mathbb{Z}_{P}^{N}$ in (69) as the following process:

[TABLE]

Subtracting $(\ref{approx1_ZPN_2_app})$ from $(\ref{ZPN_2_app})$ ,

[TABLE]

Taking the reaction terms in $(\ref{M1_diff1_app})$ and subtracting their propensity functions, we define the following martingale

[TABLE]

where $\tilde{Y}(u)=Y(u)-u$ . A quadratic variation of the martingale is (cf. [35])

[TABLE]

Define a function for $Z_{C}^{N}$ in (13) and $\mathbb{Z}_{C}^{N}$ in (70) as

[TABLE]

so that $F^{N}\left(Z^{N}(s)\right)=Z_{C}^{N}(s)$ and $\bar{F}^{N}\left(\mathbb{Z}_{P}^{N}(s)\right)=\mathbb{Z}_{C}^{N}(s)$ . As $N\to\infty$ , $\left[\mathbb{M}^{N}\right]_{t}$ is asymptotic to

[TABLE]

where we use the fact that $\left(A-A\wedge B\right)+\left(B-A\wedge B\right)=\left|A-B\right|$ . Then as $N\to\infty$ , $\left[N\cdot\mathbb{M}^{N}\right]_{t}$ is asymptotic to

[TABLE]

Subtracting and adding the propensity functions and using the fact that $\left(A-A\wedge B\right)-\left(B-A\wedge B\right)=\left(A-B\right)$ , $(\ref{M1_diff1_app})$ can be rewritten as

[TABLE]
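The two identities involving $A\wedge B$ come from a coupling in which both counting processes share a common part with rate $A\wedge B$ and differ only through their residual parts. The sketch below illustrates this with constant, hypothetical rates $A$ and $B$ ; in the appendix the rates are state dependent, so this is a simplification.

```python
import random

def split_rates(a, b):
    """Split two propensities into a shared part and two residual parts."""
    common = min(a, b)
    return common, a - common, b - common

def poisson_count(rate, t_end, rng):
    """Number of events of a rate-`rate` Poisson process on [0, t_end]."""
    if rate <= 0.0:
        return 0
    t, n = rng.expovariate(rate), 0
    while t <= t_end:
        n += 1
        t += rng.expovariate(rate)
    return n

def coupled_counts(a, b, t_end, rng):
    """Two counts with rates a and b that share the events of the common part."""
    common, res_a, res_b = split_rates(a, b)
    shared = poisson_count(common, t_end, rng)
    return (shared + poisson_count(res_a, t_end, rng),
            shared + poisson_count(res_b, t_end, rng))

if __name__ == "__main__":
    common, res_a, res_b = split_rates(5.0, 3.0)
    # res_a + res_b equals |A - B| and res_a - res_b equals A - B.
    print(common, res_a + res_b, res_a - res_b)
    print(coupled_counts(5.0, 3.0, 1.0, random.Random(3)))
```

Because one residual rate is always zero, the difference of the two coupled counts is driven by a single Poisson process with rate $|A-B|$ , which is what keeps the martingale term small when the two propensities are close.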

Multiplying $(\ref{difference_2_app})$ by $N$ , we get

[TABLE]

Assuming that $\mathbb{E}^{N}\Rightarrow\mathbb{E}$ as $N\to\infty$ , where $\Rightarrow$ denotes convergence in distribution (or weak convergence), we get

[TABLE]

and

[TABLE]

Substituting $(\ref{F_conv_app})$ and $(\ref{dF_conv_app})$ into $(\ref{MN1_asymptotic_app})$ and applying the martingale central limit theorem, $N\cdot\mathbb{M}^{N}\Rightarrow\mathbb{M}$ as $N\to\infty$ , where $\mathbb{M}$ is a Gaussian process with quadratic variation

[TABLE]

Therefore, as $N\to\infty$ , $(\ref{E1N_app})$ converges in distribution to

[TABLE]

where $W$ is a standard Brownian motion and thus $\mathbb{E}(t)=O(1)$ . Approximating $\mathbb{E}^{N_{0}}(t)\approx\mathbb{E}(t)$ as suggested in [35] and using (71), we obtain

[TABLE]

which indicates that $X_{P}(t)=N_{0}\mathbb{Z}_{P}(N_{0}^{-3}t)+O(1)$ .

Appendix 4. Error analysis for $\mathcal{Z}_{P}$ in Section 3

We again use the technique developed in [2] to derive the error between $\mathcal{Z}_{P}$ of the approximate model (35) and $Z_{P}^{N_{0}}$ of the full model (12) with $N=N_{0}$ . To this end, we derive a family of the processes $\mathcal{Z}_{P}^{N}$ by replacing $N_{0}$ of $\mathcal{Z}_{P}$ in (35) by a parameter $N$ as:

[TABLE]

where

[TABLE]

Note that $\mathcal{Z}_{C}^{N_{0}}(t)=\mathcal{Z}_{C}(t)$ since $Z_{S_{T}}^{N_{0}}=\mathcal{Z}_{S_{T}}$ . Then, $\mathcal{Z}_{P}^{N}(t)$ of (80) when $N=N_{0}$ becomes equivalent to $\mathcal{Z}_{P}$ of (35). That is, the family of processes ( $\mathcal{Z}_{P}^{N}$ ) includes the approximate process $\mathcal{Z}_{P}$ of (35). Since $\mathcal{Z}_{C}^{N}(t)\to\bar{Z}_{C}(t)$ in (20) as $N\to\infty$ , $\mathcal{Z}_{P}^{N}(t)$ and $Z_{P}^{N}(t)$ of the full model in (12) converge to the same limit $Z_{P}(t)$ in (21) as $N\to\infty$ . Since $Z_{P}^{N}-\mathcal{Z}_{P}^{N}\to 0$ as $N\to\infty$ , we define an error as

[TABLE]

to get the asymptotic behavior of the error of order $\frac{1}{N}$ in $Z_{P}^{N}(t)-\mathcal{Z}_{P}^{N}(t)$ .

To find an approximation of $\mathcal{E}^{N_{0}}(t)$ , we investigate the asymptotic behavior of $\mathcal{E}^{N}$ as $N\to\infty$ . As we derived $(\ref{M1_diff1_app})$ , we derive the following equation after replacing $\mathbb{Z}_{P}^{N}$ and $\mathbb{Z}_{C}^{N}$ by $\mathcal{Z}_{P}^{N}$ and $\mathcal{Z}_{C}^{N}$ in $(\ref{M1_diff1_app})$ .

[TABLE]

Using the reaction terms in $(\ref{M2_diff1_app})$ and subtracting their propensity functions, we define a martingale as

[TABLE]

where $\tilde{Y}(u)=Y(u)-u$ . Define

[TABLE]

so that $\tilde{F}^{N}\left(\mathcal{Z}_{P}^{N}(s)\right)=\mathcal{Z}_{C}^{N}(s)$ . As we get $(\ref{MN1_asymptotic_app})$ , $\left[N\cdot\mathcal{M}^{N}\right]_{t}$ is asymptotic to

[TABLE]

Next, we show that

[TABLE]

as $N\to\infty$ . Denoting

[TABLE]

we have

[TABLE]

The second term on the right is of order $\frac{1}{N}$ in $(\ref{estim_app})$ . The integral of the first term in $(\ref{estim_app})$ becomes

[TABLE]

and this converges to [math] as $N\to\infty$ using $(\ref{slim_eq_app})$ and $(\ref{slim_app})$ , which shows $(\ref{ftil_lim_app})$ .

Using $\tilde{F}^{N}(z_{P})\to F(z_{P})\equiv Z_{S_{T}}-z_{P}$ and $\mathcal{Z}_{P}^{N}\to Z_{P}$ ,

[TABLE]

as $N\to\infty$ . Therefore, using the martingale central limit theorem, $N\cdot\mathcal{M}^{N}\Rightarrow\mathcal{M}$ as $N\to\infty$ , which is a Gaussian process with its quadratic variation

[TABLE]

where $\mathcal{E}^{N}(s)\Rightarrow\mathcal{E}(s)$ as $N\to\infty$ . As we derive $(\ref{E1N_app})$ , we can derive an equation for $\mathcal{E}^{N}(t)$ by replacing $\mathbb{E}^{N}$ , $\mathbb{M}^{N}$ , $\bar{F}^{N}$ , and $\mathbb{Z}_{P}^{N}$ with $\mathcal{E}^{N}$ , $\mathcal{M}^{N}$ , $\tilde{F}^{N}$ , and $\mathcal{Z}_{P}^{N}$ , respectively. Then, $\mathcal{E}^{N}$ is asymptotically equal to

[TABLE]

Using $(\ref{ftil_lim_app})$ and $(\ref{dftil_lim_app})$ , $(\ref{E2N_app})$ converges in distribution to

[TABLE]

as $N\to\infty$ where $W$ is a standard Brownian motion. Again, we approximate $\mathcal{E}^{N_{0}}(t)\approx\mathcal{E}(t)$ as suggested in [35] and thus we get

[TABLE]

Since $\mathcal{E}(0)=0$ and diffusion and drift terms are proportional to $\mathcal{E}(s)$ , $\mathcal{E}(t)=0$ , which indicates that $X_{P}(t)=N_{0}\mathcal{Z}_{P}(N_{0}^{-3}t)+o(1)$ .
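The last step, that an error process started at zero stays at zero when both drift and diffusion are proportional to it, can be illustrated with a simple Euler-Maruyama sketch. The constant coefficients below are hypothetical stand-ins for the state-dependent coefficients of the limiting error equation, so this shows the mechanism rather than the actual equation.

```python
import math
import random

def euler_maruyama(e0, drift_coef, diff_coef, t_end, n_steps, rng):
    """Simulate dE = drift_coef * E ds + diff_coef * E dW with E(0) = e0."""
    dt = t_end / n_steps
    e = e0
    path = [e]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over one step
        e = e + drift_coef * e * dt + diff_coef * e * dw
        path.append(e)
    return path

if __name__ == "__main__":
    rng = random.Random(0)
    # Starting from zero, every increment is multiplied by E = 0, so the path never moves.
    print(max(abs(e) for e in euler_maruyama(0.0, -1.0, 0.5, 1.0, 200, rng)))
    # A nonzero start, by contrast, produces a fluctuating path.
    print(euler_maruyama(1.0, -1.0, 0.5, 1.0, 200, rng)[-1])
```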

Acknowledgment We are grateful to the MBI for supporting our attendance at the workshop in 2015, where collaboration for this work began. We also thank Wanmo Kang for valuable discussion. This work was supported by the National Research Foundation of Korea grant N01160447 (JKK), KAIST Research Allowance grant G04150020 (JKK), the TJ Park Science Fellowship of POSCO TJ Park Foundation (JKK), National Science Foundation grant DMS-1318886 (GR), DMS-1620403 (HWK), UMBC KAN3STRT (HWK), and National Science Foundation grant DMS-0931642 to the Mathematical Biosciences Institute (JKK, GR, HWK).

Bibliography

[1] A. Agarwal, R. Adams, G. C. Castellani, and H. Z. Shouval, On the precision of quasi steady state assumptions in stochastic dynamics, J. Chem. Phys., 137 (2012).
[2] D. F. Anderson, A. Ganguly, and T. G. Kurtz, Error analysis of tau-leap simulation methods, Ann. Appl. Probab., 21 (2011), pp. 2226–2262.
[3] D. F. Anderson and D. J. Higham, Multilevel Monte Carlo for continuous time Markov chains, with applications in biochemical kinetics, SIAM Multiscale Model. Simul., 10 (2012), pp. 146–179.
[4] D. F. Anderson and M. Koyama, Weak error analysis of numerical methods for stochastic models of population processes, SIAM Multiscale Model. Simul., 10 (2012), pp. 1493–1524.
[5] K. Ball, T. G. Kurtz, L. Popovic, and G. Rempala, Asymptotic analysis of multiscale approximations to reaction networks, Ann. Appl. Probab., 16 (2006), pp. 1925–1961.
[6] D. Barik, M. R. Paul, W. T. Baumann, Y. Cao, and J. J. Tyson, Stochastic simulation of enzyme-catalyzed reactions with disparate timescales, Biophys. J., 95 (2008), pp. 3563–3574.
[7] B. D. Bennett, E. H. Kimball, M. Gao, R. Osterhout, S. J. Van Dien, and J. D. Rabinowitz, Absolute metabolite concentrations and implied enzyme active site occupancy in Escherichia coli, Nat. Chem. Biol., 5 (2009), pp. 593–599.
[8] N. Berglund and B. Gentz, Geometric singular perturbation theory for stochastic differential equations, J. Differential Equations, 191 (2003), pp. 1–54.
