Efficient approximations of transcriptional bursting effects on the dynamics of a gene regulatory network
Jochen Kursawe, Antoine Moneyron, Tobias Galla

TL;DR
This paper shows how transcriptional bursting can change gene network dynamics and introduces efficient methods to model it.
Contribution
A novel extension of the chemical Langevin equation to include transcriptional bursting noise is derived.
Findings
Transcriptional bursting can induce or amplify oscillations in gene regulatory networks.
The extended Langevin equation reduces computation time while preserving analytical tractability.
The study provides guidelines for when and how to apply different approximation methods.
Abstract
Mathematical models of gene regulatory networks are widely used to study cell fate changes and transcriptional regulation. When designing such models, it is important to accurately account for sources of stochasticity. However, doing so can be computationally expensive and analytically untractable, posing limits on the extent of our explorations and on parameter inference. Here, we explore this challenge using the example of a simple auto-negative feedback motif, in which we incorporate stochastic variation due to transcriptional bursting and noise from finite copy numbers. We find that transcriptional bursting may change the qualitative dynamics of the system by inducing oscillations when they would not otherwise be present, or by magnifying existing oscillations. We describe multiple levels of approximation for the model in the form of differential equations, piecewise-deterministic…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8|
finite copy numbers (Ω < ∞) |
infinite copy numbers (Ω < ∞) | |
|---|---|---|
|
finite transcriptional bursting parameter ( |
copy number noise and transcriptional bursting noise full model, |
only transcriptional bursting noise piecewise-deterministic dynamics, |
|
infinite transcriptional bursting parameter ( |
only copy-number noise reaction system in |
no noise, delay differential |
- —María de Maeztu Unit of Excellence
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsGene Regulatory Network Analysis · Evolution and Genetic Dynamics · stochastic dynamics and bifurcation
Introduction
Gene regulatory networks are at the heart of many developmental and physiological processes and are a key component of cell fate decisions [1]. The expression of proteins participating in gene regulatory networks, i.e. transcription factors, is inherently dynamic. A fundamental example of dynamic transcription factor expression is gene expression oscillators, such as the circadian clock [2–5], or rapid ultradian oscillators governing the formation of somites [6]. Rapid ultradian gene oscillators also encode information necessary for cell-state transitions, such as the differentiation of neural progenitor cells into neurons or glial cells [7,8].
It is surprising that dynamic signatures of gene expression, such as oscillations, are correctly established and interpreted when considering the prevalence of stochastic effects [9]. The copy numbers of protein and mRNA molecules in gene regulatory networks are often sufficiently small that simple mathematical approaches based on deterministic rate equations fail to provide even a qualitatively accurate description [10–13]. For small copy numbers, the intrinsic stochasticity of transcription and translation processes and the degradation of protein and mRNA cannot be neglected. This raises the key question whether noise is detrimental to the functioning of gene regulatory processes, or if indeed, stochasticity can be exploited [14]. An example of the latter is stochastic amplification, in which a stochastic dynamical system may show persistent oscillations in circumstances when its corresponding deterministic system does not [15–17].
Gene regulatory networks are subject to other types of stochasticity, beyond copy-number noise. For example, conditions external to the cellular environment may vary in time or cell divisions might be uneven [18,19]. One additional source of randomness is transcriptional bursting, describing the phenomenon that mRNA production is discontinuous in time and occurs in bursts of transcriptional activity. The size and timing of such bursts varies stochastically [20,21].
Mathematical models are a widely used tool to study and identify gene regulatory networks [22–24], and their description through the law of mass action [25–27] can easily be extended to account for copy-number noise through the Gillespie stochastic simulation algorithm [28] or the chemical Langevin equation [29,30]. It is an ongoing challenge to understand how gene regulatory networks integrate other types of noise, such as transcriptional bursting, to robustly establish dynamical signatures like oscillations and cell fate transitions.
To mathematically describe experimental observations on transcriptional bursting of a single gene (i.e. not considering responses within wider gene regulatory networks), the so-called telegraph model has been widely used. In this simplest model of transcriptional bursting, the gene randomly switches between states of active and inactive transcription [31,32]. Considerable effort has been expended to parametrize versions of the telegraph model using experimental data and a recent overview of such approaches is for example available from Luo et al. [33]. Typically, the comparison between models and data is performed using steady-state distributions of mRNA copy numbers, which can be predicted from the model and experimentally measured using scRNA-seq data [34] or smFISH [35]. In some cases, bursting dynamics are directly observed through live-imaging and used to identify parameters of telegraph models and related motifs [36–39].
In addition to these efforts at the level of a single gene, many studies investigate the effects of transcriptional bursting into wider gene regulatory networks. An example is inference approaches that take regulatory interactions into account when inferring properties of transcriptional bursting [40,41]. Recent work by [42] points out that co-fluctuations between transcription factors and their targets may inform inference, and earlier works by [43] and [44] have discussed in detail how molecular processes of transcriptional as well as translation bursting may be incorporated into stochastic descriptions of gene regulatory networks. These efforts often rely on exact simulations of the stochastic dynamics, which can be computationally costly especially when copy numbers of the involved molecules are large. In addition, exact descriptions of the stochastic processes involved are typically not analytically tractable. These shortcomings limit the potential of stochastic simulations in studying gene regulatory networks.
Here, we investigate the delayed auto-negative feedback motif, in which we incorporate the effects of transcriptional bursting as well as noise due to finite copy numbers. Using the example of this motif, we introduce multiple levels of approximations of gene regulatory dynamics valid in the individual or joint limits of large copy numbers or fast bursting. As part of this effort, we extend the classical chemical Langevin equation to account for the Gaussian effects of transcriptional bursting in gene regulatory networks. We show that stochastic variation due to transcriptional bursting can enable oscillations when they are not otherwise present or enhance oscillations if they are already present. Our approximations allow us to analytically determine the power spectrum of the oscillations and investigate for what system sizes and burst frequencies our approximation becomes invalid. Our derivations of the extended chemical Langevin equation are provided in general form. These equations are approximations that drastically reduce computation times and are applicable to any gene regulatory network and many models of transcriptional bursting.
The remainder of the paper is organized as follows. In the model description of §2, we define the model of the auto-repressive network of a single gene, explicitly including finite copy numbers and transcriptional bursting, as well as transcriptional delay. We demonstrate that our model simplifies to previously published versions of the model in the limit of infinitely fast transcriptional bursting and discuss the alternative limiting case where copy-number noise is absent. In the results §3, we first discuss the effect of transcriptional bursting on the system dynamics before presenting our main analytical results. Through a systematic expansion in the inverse noise strength, we are able to calculate the Fourier spectra of oscillations induced by copy-number noise and/or noise due to bursting. We proceed to test these theoretical predictions against simulations and assess the validity of the various reduced models for different regimes of transcriptional bursting and intensity of copy-number noise. We finally include a discussion in §4. Further details of the mathematical analysis are collated in the electronic supplementary material.
Model description
Gene regulatory network with negative autoregulation and transcriptional bursting
2.1.
We start by introducing a simple model of a transcription factor subject to auto-negative feedback (figure 1). Importantly, this model accounts for transcriptional bursting as well as stochastic variation due to low copy numbers. The model extends previous work [17,45] and includes the effects of transcription, translation and degradation of mRNA and protein. Motivated by the classical telegraph model of transcription, the promoter of the gene can assume two states: one state in which transcription is active with a constant rate, and one state in which the transcription does not occur. The promoter switches between these two states stochastically, and the rate of switching into the OFF state depends on the amount of protein present in the system. This simulates the auto-negative feedback, whereby a high abundance of protein can inhibit transcription.
Schematic of the transcriptional auto-repressive feedback loop. Our model includes the effect of a promoter that can be in an active or an inactive state (on/off), as well as the dynamics of mRNA and protein. These are subject to transcription, translation and degradation. The promoter can be inactivated by the presence of protein. The two dashes on the arrow for transcription denote the presence of a transcriptional delay τ.
Mathematically, our model comprises a set of chemical reactions between protein and mRNA molecules and their interactions with the promoter state. The number of each type of particle is discrete. We write for the number of protein molecules and for the number of mRNA molecules in a cell. They relate to their concentrations and via and , where is the nuclear volume. For convenience, we write , where we call the system size in line with chapter 10 of [30]. By varying the size of the nucleus, controls how many molecules may be expected in the system at constant concentrations. Throughout the paper, we will use parameter values identified in [46], and we choose , which is the average nuclear volume in that publication. We will measure concentrations in concentration units ‘cu’, which correspond to one molecule per volume .
Following [12,13], our model is
where and stand for the promoter states in which transcription of the gene are ON and OFF, respectively. We refer to the parameter in the first two lines in equation (2.1) as the bursting parameter. It controls the time scale of switching between the two promoter states. Note that the reaction rate from the promoter OFF into the ON state is in the literature often referred to as burst frequency, which is equal to in our parametrization [21,38]. is a so-called Hill coefficient. We refer to the fixed model parameter as the repression threshold, which controls the amount of protein required to inhibit transcription. The third and fourth reactions describe degradation of mRNA and protein, respectively, with degradation rates and . For convenience, we introduce the mathematical variable to indicate if the promoter is in the ON state ( ) or the OFF state ( ) at any one time . We draw attention to the rate for the penultimate reaction, which describes transcription. This reaction rate indicates that a transcription process can only be initiated at a given time if the promoter is in the ON state ( ). The rate of transcription is given by the basal transcription rate . Note that an alternative way to write the transcription reaction without the use of the variable would be
Here, we instead use the notation in equation (2.1), as the variable will be useful in our calculations.
The resulting mRNA molecule of a transcription event that is triggered at time enters the system at time . This delay is indicated by the double arrow in the reaction system and simulates the time required for transcription and subsequent transport of the mRNA out of the nucleus [45,47,48]. The final reaction describes the translation of mRNA into protein, with rate .
The number of reactions of each type that occur in each time interval is proportional to , or . As a consequence, the copy numbers of mRNA and protein particles in the system are proportional to either the system-size parameter . We can use this parameter to control the strength of copy-number noise in our analysis, as stochasticity in the rate of each reaction will decrease with the system size.
We highlight that our calculations are consistent with other mechanisms that upregulate the expected number of molecules in the system by a factor . In fact, biological cells can have vastly different expression levels for individual proteins without an increased size of the nucleus. Scaling by as in our system can be used to model such increases, although we point out that and in this case are not technically concentrations. Rather, increases in would correspond to an effective change of the unit of concentration towards higher density. In our model, that would be achieved by increasing the basal transcription rate and the repression threshold by a factor , as this would be mathematically equivalent to an increase in volume as before.
We chose the form of the transition rate into the promoter OFF state of . This reaction rate reflects the assumption that, biophysically, a promoter requires binding with protein molecules to switch into its OFF state. This can be seen as follows.
Considering a promoter that can bind protein molecules, the promoter may assume an unbound state and a bound state , where the latter corresponds to the state in which all binding sites are occupied. The reaction kinetics of promoter binding and unbinding are then as follows:
where and are binding constants. The left-hand side describes the unbound state and free protein molecules. On the right-hand side, all protein molecules are bound. In this reaction system, the rate of the reaction to the OFF state is given by
and that of the reaction to the ON state is given by
We now define and . Under these definitions, reaction (2.3) is equivalent to the first two reactions in equation (2.1). In doing so, we parametrize the promoter dynamics such that both transition rates between the OFF and ON states depend on . As a consequence, the mean probability to be in the ON or OFF state does not depend on , keeping the time-averaged transcription rate constant as we vary the bursting parameter . This allows us to isolate effects of promoter dynamics from those of changes in transcription rates.
Our model simplifies the limit of infinitely fast transcriptional bursting
2.2.
We next consider the limit of infinitely fast bursting parameter, . We write and for the probabilities to find the promoter in its unbound and bound states, respectively. Reflecting the limit , we assume that switching between the promoter states is sufficiently fast to reach equilibrium while a given concentration of protein molecules is present in the system. In this equilibrium of promoter binding and unbinding, we have
in addition to
Hence, in this stationary state, the probability of finding the promoter in its bound OFF state is
The probability of finding the promoter in the unbound ON state is
The average transcription rate is then given by
If the promoter switching is fast relative to the remaining reactions, i.e. when , fluctuations around this average value will diminish, and the system can be approximated as
The function is a decreasing function of the protein concentration , reflecting auto-repression and the degree of nonlinearity of the function, i.e. its sensitivity to changes in protein molecule numbers is regulated by the Hill coefficient . As before, the double arrow in equation (2.11) denotes the presence of the transcriptional delay . These reactions were previously introduced by [45], and they have since then been used widely as a template model for auto-repressive feedback [8,12,17,49]. Our model in figure 1 and equation (2.1) is intentionally designed so that the promoter behaves as in the widely used telegraph model for transcriptional bursting, while also achieving the typically used Hill function (2.10) for the transcription rate for the case of infinitely fast bursting. This allows us to analyse how the dynamics of the model (2.11) are modified if transcriptional bursting is taken into account. The description in terms of the Hill function is valid, provided the promoter switches between the ON and OFF states much more quickly than the remaining time scales in the reaction system.
A deterministic description of these dynamics is given by the following rate equations (see also [17,45]) for the concentrations and :
We note the delay term in the first equation. This description does not describe fluctuations due to finite copy numbers or transcriptional bursting and is formally only valid in the joint limit of and .
Limiting cases distinguish sources of noise in the system: copy-number noise versus noise due to transcriptional bursting
2.3.
We can distinguish between two types of randomness in the model in equation (2.1). The first is copy-number noise. By this we mean the intrinsic stochasticity in the protein and mRNA dynamics, described by the last four reactions in (2.1). This randomness is present even at fixed promoter state and becomes more pronounced when there are only small numbers of molecules in the system. More precisely, the amplitude of intrinsic noise scales as for large values of the system-size parameter ([30, chapter 10] or [15] and [16]). This is a consequence of the central limit theorem. Broadly, the mean number of reactions in the system per unit time is proportional to , and the standard deviations of the number of reactions is of order . Relative fluctuations therefore scale as .
The second type of noise arises from transcriptional bursting, induced by the random switching between the ON and OFF states in the first two reactions in (2.1). The promoter dynamics operate on time scales that are set by the bursting parameter . For increasingly pronounced timescale separation, i.e. for very fast transcriptional burst frequencies ( ), the transcriptional bursting effects average out, and thus the effects of transcriptional burstiness diminish. Using again the central limit theorem, the relative fluctuations due to transcriptional bursting scale as . The reactions in equation (2.1) describe the full model, for general values of and . This set-up therefore captures both copy-number noise and noise due to transcriptional bursting.
Having considered the limit , equation (2.11), in which only copy-number noise and no bursting noise is present, and the joint limit , equation (2.12) in which neither noise is present, it remains to cover the limit , in which bursting noise is present, but copy-number noise is not. In this limit, the dynamics of protein and mRNA may be described by ordinary differential equations. However, for general values of , the promoter dynamics remain stochastic. This leads to a description similar to what is known as a piecewise-deterministic Markov process [50,51]. Specifically, in this limit one has
for the dynamics of mRNA and protein. The switching of the promoter state is as before, that is, transitions from to occur with the bursting parameter , and the promoter switches off ( ) at time with rate . We refer to equation (2.13) as the equation for the piecewise-deterministic pseudo-Markov process. This approximation has recently been analysed in the stationary regime [52].
We note the term in the first equation in (2.13). This describes the process of mRNA production, subject to transcription delay. Production of mRNA molecules at time only occurs if the promoter was in the ON state ( ) at time . If on the other hand , then no new mRNA molecules enter the system at time .
The different descriptions of the system are summarized in table 1.
Results
Oscillatory gene expression can be induced and amplified by transcriptional bursting
3.1.
Our model (2.1) extends the widely used model of auto-negative feedback (2.11) to take transcriptional bursting into account. This allows us to ask how transcriptional bursting may affect the dynamics of the feedback loop. To this end, we simulate the full model (2.1), including bursting noise and copy-number noise, as well as the limiting cases with only one type of noise (equations (2.11) and (2.13)), and no noise (equation (2.12)). We choose experimentally informed parameter values for and , as well as the parameters and the transcription delay from mouse neural progenitor cells reported in [46] and detail our simulation algorithms in electronic supplementary material, §S3. As an initial example, we choose and bursting parameter min . Intriguingly, we find that the presence of transcriptional bursting can strongly enhance oscillations in the system (figure 2).
Oscillatory gene expression can be induced and amplified by transcriptional bursting. ‘Full model’ refers to equation (2.1), ‘bursting noise only’ refers to equation (2.13), valid in the limit Ω→∞ and λ remaining finite and ‘copy-number noise only’ refers to equation (2.11) for infinitely fast transcriptional burst frequencies (λ→∞, Ω finite). The deterministic model refers to equation (2.12), valid in the combined limit Ω→∞ and λ→∞. Parameters are according to figure 6e of [46]. In detail they are: αM = 39.93 cu min−1, αP=21.56 min−1, μm=(ln2/30) min−1, μp=(ln2/90) min−1, h=4.78, P0=24201.01 cu and τ=33 min. We use Ω=1 for the full model and the copy-number noise-only model, whereas the remaining models assume Ω=∞. Similarly, we use λ=0.1 min−1 for the full model and the bursting noise-only model, where the remaining models assume λ=∞. Details of the computational implementation are available in electronic supplementary material, §S3.
Specifically, we plot in figure 2 the concentration of protein molecules as a function of time for each model. The dashed line in each panel of figure 2 is obtained in the absence of either type of noise, i.e. from the deterministic model in equation (2.12). As seen in the figure, the concentration of protein shows decaying oscillations and settles to a stable fixed point. Thus, without noise, there is no persistent oscillatory behaviour for this parameter set.
Considering simulations of the full model, comprising both copy-number noise and transcriptional bursting (equation (2.1)), we find that the time series shows persistent oscillations. Given that these oscillations are absent in the fully deterministic model (dashed line), the oscillatory behaviour must be noise-induced.
Next, we consider the piecewise-deterministic model obtained in the limit of infinite copy numbers (equation (2.13)), i.e. the model in which transcriptional bursting occurs, but copy-number noise is neglected (‘bursting noise only’ in figure 2). We again observe noise-induced oscillatory behaviour with a slightly lower amplitude than in the full model.
Finally, we consider the limit of infinitely fast bursting dynamics and finite copy numbers, as in equation (2.11) (‘only copy-number noise’ in figure 2). We again observe noise-induced oscillations. As these oscillations represent the result of an amplification of copy-number noise only, they are similar to what was reported in other cases of stochastic amplification [15–17,46,53]. Notably, the amplitude of the oscillations, in this case, is considerably lower than that in the full model or the ‘bursting noise only’ model, illustrating that the presence of transcriptional bursting changes the qualitative dynamics of the system by amplifying these oscillations.
Variation due to transcriptional bursting can be approximated with stochastic differential equations
3.2.
Our results in figure 2 indicate the importance of considering transcriptional bursting when investigating the dynamics of gene regulatory networks. Our analysis relied on simulations of the full reaction system in equation (2.1). These reactions allow us to simulate the system dynamics under the presence of transcriptional bursting and copy-number noise. They can be simulated using the Gillespie stochastic simulation algorithm [28,54] and mathematically described using the master equation formalism (see, e.g. [30, chapter 7]), or, in our case, variations of it that account for transcriptional delay [17]. However, master equations are notoriously hard to solve, and for analytical derivations, chemical Langevin equations are widely used [17,29]. Chemical Langevin equations have the added benefit of requiring orders of magnitude fewer calculations in each simulation than the Gillespie simulation algorithm.
Unfortunately, the chemical Langevin equation cannot readily be applied to systems with switching promoter states. Here, we build on the theory developed by [51] to demonstrate how the chemical Langevin equation formalism can be applied to incorporate Gaussian approximations of transcriptional bursting. This latter approximation will be valid in the limit of large but finite values of .
To derive an effective stochastic differential equation encompassing both copy number and bursting noise, we adapt the procedure first used by Gillespie in [29]. More precisely, we discretize time and then require that the time step is sufficiently small such that reaction rates and burst frequencies remain constant during each step. We simultaneously assume that the time step is large enough such that the number of reactions in each time step follows approximately a Gaussian distribution. These two assumptions can only be simultaneously fulfilled in system with large system sizes, , and large burst frequencies, . Under these assumptions, it is possible to analytically calculate the mean and variance of the Gaussian increments at each timestep, and the definition of a stochastic differential equation emerges. Our derivation is described in full detail in electronic supplementary material, §S1, where we provide formulas relating to model (2.1) as well as for more general gene regulatory networks.
We obtain the following extended chemical Langevin equations:
with Gaussian white noise variables and , each of mean zero, and with
We note two contributions to , one proportional to , and representing copy-number noise, and a second proportional to , representing noise due to transcriptional bursting. Protein production and decay are not dependent on the promoter state, and hence the noise variable originates purely from copy-number noise. Its variance is therefore proportional to . The last expression in equation (3.2) indicates that there are no correlations between and . This is because no reaction in equation (2.1) changes both the number of protein molecules and that of mRNA molecules.
The extended chemical Langevin equation (3.1) provides a possible approximation of the full model in equation (2.1). The dynamics of the chemical Langevin equation are dictated by two coupled stochastic differential equations. This is a drastic simplification of the full reaction system, which is described by an extended chemical master equation that accounts for the presence of a delay [17,55]. Mathematically, this chemical master equation constitutes an infinitely large set of coupled ordinary differential equations [30, chapters 5 and 7]. Nonetheless, the chemical Langevin equation is nonlinear, making it analytically intractable. An analytically tractable equation can be obtained by linearizing. This involves one further step of approximation and results in the so-called linear-noise approximation.
Our formulation of the linear-noise approximation applies to parameters for which the deterministic system in equation (2.12) tends to a stable fixed point . We linearize about this fixed point by writing
We note that and describe fluctuations of particle concentrations (as opposed to fluctuations of particle numbers) about the deterministic fixed point.
Assuming that and are small quantities, we then obtain the following linear stochastic delay differential equations:
where
A key element of the linear-noise approximation is to replace the second moments of the noise variables in the chemical Langevin equation, equation (3.2), with the values of these expressions at the deterministic fixed point. This leads to Gaussian white noise variables and with properties
where
Stochastic differential equations accurately describe protein dynamics
3.3.
As an initial test of this approximation, we plot individual trajectories of protein concentration for the full model (equation (2.1)), the chemical Langevin equation (3.1) and the linear-noise approximation (3.4) in figure 3A. We choose large values of and for this test, as that is the regime for which we expect the approximation to work well. Specifically, we use and min . All three models exhibit oscillations, indicating that the chemical Langevin equation can capture the effect of noise-induced cycles. Promisingly, all three trajectories appear to have similar qualitative properties, such as the period and amplitude of the oscillations. The potential benefit of using the chemical Langevin equation becomes immediately obvious when comparing computation times: using our implementation, generating trajectories of the chemical Langevin equation for this parameter combination is 463 times faster than generating trajectories of the full model.
Stochastic differential equations accurately describe protein dynamics. For the following models, we show individual trajectories in (A) and power spectra in (B). The full model is a Gillespie simulation of equation (2.1), ‘CLE’ stands for chemical Langevin equation (see equation (3.1)), and LNA stands for ‘linear-noise approximation’ (equation (3.4)). ‘LNA theory’ refers to the analytically obtained power spectrum in equation (3.8). The parameters used are the same as in figure 2, except that here Ω=100 and λ=10min−1. All three models exhibit noise-induced oscillations and their power spectra match. Note that ω represents angular frequencies. To obtain the power spectra, 100 trajectories of each model are generated. They are simulated for T = 10 000 min, and the first 200 min are discarded to account for equilibration. Further details of the computational implementation are available in electronic supplementary material, §S3.
While this visual comparison of trajectories is promising, our next aim must be to evaluate quantitatively how well the two models compare. An ideal measure for this comparison would be one that can characterize the properties of the dynamics visible in figure 3A. Such a measure is available in the form of the Fourier power spectrum of the protein time series. Fourier analysis is a standard method to decompose a time series (such as those in figure 3A) into contributions from oscillatory signals of varying frequencies. The power spectrum summarizes this analysis and captures, for each angular frequency , the square of the amplitude, i.e. the strength, of this contribution. Conveniently, the power spectrum of the linear-noise approximation equation (3.4) can be analytically calculated.
We conduct this calculation in the electronic supplementary material, §S1.6. We find
The denominator in these expressions is
For the full model and the chemical Langevin equation, the power spectrum can instead be obtained numerically from many repeated simulation runs.
Comparing the protein power spectra for all three models for the same parameters as before (i.e. figure 6e from [46] and min , ) reveals that all three models have nearly identical spectra (figure 3B), highlighting the accuracy of our approximation and demonstrating that our analytical derivations are valid. Hence, the oscillations of the full model in figure 3 are well described by the analytically obtained power spectrum equation (3.8).
Approximation by stochastic differential equations breaks down for small burst frequencies
3.4.
Our derivation of the extended chemical Langevin equation (3.1) made the assumption that the bursting parameter, , and the system size, , are large. But what exactly does large mean? Or, in other words, at what values of or will the approximation by stochastic differential equations break down? To answer this question, we next seek to computationally estimate this limit by analysing the standard deviations of time series of protein concentration,
where stands for a time average in the stationary state, i.e. after long waiting times. This quantity is a measure of the amplitude of noise-induced oscillations about a deterministic fixed point. Where the power spectrum is analytically available, the standard deviations can be calculated from the power spectrum via the relation
A derivation of this relationship is provided in the electronic supplementary material, §S1.7. We can use equation (3.8) to analytically predict from the linear-noise approximation.
We analyse how the standard deviation in protein time series of each model, i.e. the full model (2.1), the chemical Langevin equation (3.1) and the linear-noise approximation equation (3.4), depends on the bursting parameter for a fixed system size of (figure 4A). The theory derived within the linear-noise approximation captures the standard deviations of the full model quantitatively for sufficiently large burst frequencies . When bursting is slow on the other hand, significant deviations between the linear-noise approximation and simulations of the full model are observed.
Approximation by stochastic differential equations breaks down for small burst frequencies. (A) standard deviations of the concentration of protein molecules, ΣP, in the stationary state as a function of the bursting parameter, λ, for multiple models. The labels ‘Full model’, ‘CLE’ and ‘LNA’ are defined in figure 3. Additionally, we include the analytically calculated standard deviation from equation (3.11) as ‘LNA theory’. (B) The relative error rΣ according to equation (3.12) for all approximations considered in (A). Parameters are the same as in figure 2, with Ω=100. For values of λ with λ≤0.1, two trajectories are created to calculate ΣP and rΣ. For values of λ with λ>0.1, 20 trajectories are used. Each trajectory is simulated for T=105 min, and the first 2000 min are discarded. Further details of the computational implementation are available in electronic supplementary material, §S3.
To be able to accurately identify the point of deviation between the models, we define the relative error to the full model
where indicates the protein standard deviations of the model in question (the chemical Langevin equation (3.1), the linear-noise approximation in equation (3.4) or the theoretical result using equations (3.11) and (3.8)) and refers to the protein standard deviation of the full model (2.1).
The linear-noise approximation becomes accurate to at most a error in this example roughly when bursts occur at least once per minute ( ; figure 4B). Even for much slower burst frequencies, the approximation produces reasonable ball park estimates for the magnitude of fluctuations (and hence for the amplitude of noise-induced oscillations). The relative error between the linear-noise approximation and the full model remains below one for burst frequencies on a time scale of . That is, the linear-noise approximation produces estimates that are within a factor of two of the true standard deviations for burst frequencies with time scales of approximately an hour.
Figure 4A also shows that the standard deviations reduces for increased burst frequencies . This is because the strength of bursting noise scales as (to leading order). Thus, for high values of , the promoter’s ON/OFF dynamics average out. Fluctuations induced by bursting then become less pronounced in the dynamics of the gene regulatory network. We would not expect to approach zero in the limit of infinitely fast bursting ( ). This is because the effects of intrinsic noise remain in this limit, so we expect oscillations purely induced by copy-number noise, as observed for example in figure 2 for the model with only copy-number noise.
The system-size parameter is held fixed at in figure 4. For very slow bursting dynamics (small values of ), we then expect bursting noise to be much stronger than copy-number noise. Oscillations are then predominantly induced by bursting noise. Within the linear-noise approximation, their amplitude can be expected to scale as , and the contributions containing in equations (3.7) and (3.8) become irrelevant. This is indeed what we see in figure 4A.
Suitability of our approximation is application dependent
3.5.
Our analysis of the breakdown of the approximation in figure 4 assumed that , i.e. we only identified a critical value of . However, our derivation of the chemical Langevin equation (3.1) required us to assume that both and are large. Hence, we would expect the accuracy of the extended chemical Langevin equation to depend on both and , with the possibility that, for each , different values of are required for accuracy of the approximation.
We analyse the joint dependence of the accuracy of the chemical Langevin equation (3.1) on and in figure 5, again focusing on our measure of relative error equation (3.12). We find that the predictions from the chemical Langevin equation reproduce the standard deviations of protein fluctuations up to an error of at most when and (figure 5). In the figure, all other parameters are as in figures 2–4, matching those in [46].
Relative error rΣ of standard deviations of the chemical Langevin equation (3.1) compared to the full model in equation (2.1). The relative error is calculated using equation (3.12). The black cross represents the grid point λ=0.1 min−1 and Ω=1. Parameters are the same as in figure 2. For grid points where λ≤0.1 min−1, two trajectories are created to calculate rΣ. For grid points where λ>0.1 min−1, 20 trajectories are created. Each trajectory is simulated for 105 min, and the first 2000 min are discarded. Details of the computational implementation are available in electronic supplementary material, §S3.
While our choice of cu together with a value of is consistent with experiments [46], the requirement indicates that the bursts would have to occur once every 2 min or faster for the chemical Langevin equation to become accurate to an error of less than . That leaves us to ask in which regime experimentally measured values for the burst parameter lie. In our model equation (2.1), represents the rate at which the promoter switches from the OFF to the ON state. This rate varies widely between individual genes, and between organisms. For example, live-imaging in yeast suggested of 1−4 min [36], whereas measurements on a range of bacterial genes found 0.3 min [37]. Measurements in the fruit fly Drosophila indicate values of around 0.15 min by [56] and up to 2 min by [57]. Bursting in mammalian cells is considered to be slower than in other organisms, with values of around 0.01 min reported for a range of genes by [38]. Yet, for HIV-1 RNA in HeLa cells, bursting frequencies of up to 0.6 min have been observed [58].
Considering these experimental measurements, we investigate the representative value , in figure 5. At this parameter combination, protein fluctuations in the extended chemical Langevin model deviate from that in the full model by approximately (figure 5). For larger , the approximation becomes more accurate; for smaller , the approximation eventually breaks down. This illustrates that the applicability of our method will depend on the biological scenario: in some cases, an approximation using our extended chemical Langevin equation will work well. In other situations, it will be more suitable to use a piecewise continuous chemical Langevin equation, i.e. a model following Gillespie’s derivation to approximate copy-number noise [29], but simulating promoter switching explicitly, such as in equation (2.13). For small system sizes and small , it will be necessary to use the full model, equation (2.1). This is a workable approach, as stochastic simulations using the Gillespie algorithm will be fast in this scenario.
The breaking point of the approximation in our model is at around min . This breaking point may be different for other transcription factor networks: the timescale of fluctuations of gene regulatory networks is typically set by degradation rates [26, chapters 1−3], and one may expect that the applicability of our Gaussian approximation depends on how the bursting parameter relates to degradation rate of the mRNA that is being produced. For more long-lived mRNAs, such as those involved in the circadian clock, slower bursting frequencies may be permissive in our approximation. In a given application, it will hence be beneficial to test the applicability of the chemical Langevin approximation through comparison to test simulations of the corresponding full, unapproximated model.
We expect that the requirement of sufficiently large is met for a large number of genes. Specifically, we expect to be sufficiently large if the number of protein molecules per nucleus exceeds a few hundred to a few thousand. Our figure 5 illustrates this, as the relative error falls below 5% as , which would correspond to around 20 000 protein molecules per nucleus. More than half of all quantified genes in a global screen conducted by [59] fall within that range.
The chemical Langevin equation we derived assumes Gaussian increments in mRNA and protein copy number between consecutive time intervals. In cases where the approximation breaks down when the bursting parameter is low, we expect increments to be non-Gaussian, suggesting that slow transcriptional bursting is an inherently non-Gaussian effect. This is consistent with experimental observations that mRNA distributions can be non-Gaussian and non-Poissonian [32].
Stochastic differential equation can approximate other gene regulatory networks
3.6.
In equation (2.1), we chose the rate to depend on the current protein concentration . This follows a model for repressive transcription factors initially proposed by [26] (see appendix A of the reference) and is designed to generate the typical Hill function (2.10) in the stationary limit. However, alternative functional implementations of the action of a transcription factor are possible. For example, multiple recent studies suggest that transcriptional repressors affect the rate of transitions, thus modulating the waiting times in the OFF period (e.g. [60]). Our framework of the chemical Langevin equations to approximate the effect of promoter switching is suitable for such alternative parametrizations, and we illustrate this in electronic supplementary material, §S2 and figures S1 and S2.
Similarly, our method is applicable to other gene regulatory networks. A seminal example of such a gene network is the bistable toggle switch, which has been successfully engineered into Escherichia coli bacteria [61]. The standard model for this switch considers two protein concentrations without representing mRNAs or delays. The two protein concentrations mutually repress each other and can each be degraded. This model can be extended to account for multiple promoter states (figure 6); see [62]. Choosing a promoter parametrization for both components of the switch inspired by equation (2.1), we arrive at the chemical reaction system
Schematic of the toggle-switch model. The model describes two protein concentrations A and B and the effects of separate promoters for each. These can be in an active or an inactive state (on/off). Both proteins can be degraded. The promoter of each protein can be inactivated by the presence of the respective other protein.
where represents promoter ON states for proteins A or B, respectively, and represents the promoter OFF states. The bursting parameter regulates the burst frequency and duration in the ON states of both promoters. Each promoter experiences the repression by the respective other protein, with the same repression threshold and the same Hill coefficient . Both proteins are degraded with degradation rate , and in the respective ON states of the promoter, the proteins are produced by the transcription rate . The auxiliary variable is equal to one if the A promoter is in the ON state and equal to zero if the promoter is in the OFF state . Similarly, takes the values 1 and 0 depending on the state of the B promoter. We have chosen both proteins to share the same parameters for production, degradation and repression for simplicity. The concentrations A and B, are given by and , respectively, where and are the number of and molecules and refers to the nuclear volume. For convenience, we again write , with determining the volume of a reference nucleus. Concentrations are again measured in cu, referring to one molecule per volume .
Intriguingly, this system can exhibit transitions between two steady states that are enabled by bursting noise. Specifically, for the parameter combination of cu, , min , cu min^−1^ and , we see bistability for min . Starting from an initial condition of the concentrations cu and cu, we find that the concentration of molecules approaches a high steady state, fluctuating around the value of approximately 9 cu (figure 7A), whereas the concentration of A fluctuations around a lower value of approximately 1 cu. Swapped stationary concentrations for and occur in different realizations of the process. Using this same parameter combination but with min , stochastic transitions can be observed, where each of the concentrations and is low during some periods and high during others (figure 7B). The parameter combinations used in figure 7A,B thus identify a regime in which promoter bursting is functionally important. Therefore we test our approximation by chemical Langevin equations for this parameter combination.
Noise-induced transitions in the toggle-switch model. (A) A representative simulation of the toggle-switch model in equation (3.13) using the parameter combination P0=3 cu, h=2, μ=0.1 min−1, α=1 cu min−1, λ=100 min−1 and Ω=100, simulated using the Gillespie stochastic simulation algorithm [28]. (B) A representative simulation of the toggle-switch model equation (3.13) using the same parameters as in (A), except that now λ=1 min−1. (C) A representative simulation of the chemical Langevin equation for the toggle-switch model, equation (3.14), using the same parameters as in (A), simulated using an Euler–Maruyama scheme and a timestep of 0.01 min. The dynamics are similar to (A), except that the stationary concentration profiles of A and B are swapped. (D) A representative simulation of the chemical Langevin equation for the toggle-switch model, equation (3.14), using the same parameters as in (B).
The chemical Langevin approximation of this system emerges as
with Gaussian white noise variables and , each of mean zero, and with
Indeed, we find that the chemical Langevin equations approximate the dynamics of the toggle-switch well. Protein concentrations simulated by the chemical Langevin equation (3.14) fluctuate on a timescale comparable to those simulated by the Gillespie algorithm, and have comparable amplitude (compare figure 7C,D with 7A,B, respectively). These simulations by the chemical Langevin equation also capture the noise-induced transitions between the two states of the model when comparing the cases min and min (figure 7C,D).
Importantly, the stationary distribution obtained from Gillespie simulations of the full model equation (3.13) is well approximated by the chemical Langevin equation (3.14) (figure 8A). The means and standard deviations of both distributions are within <0.5% of each other. Similarly, the distribution of waiting times in each of the stable states is well approximated by the chemical Langevin equation (figure 8B). The means and standard deviations of the waiting time distribution sampled from the chemical Langevin equation differ by less than 10% from the distribution generated using the Gillespie stochastic simulation algorithm (for approx. 11 000 sampled waiting times).
Chemical Langevin equations approximate the toggle-switch dynamics well. (A) Stationary distribution of the concentration of A molecules for simulations of the Gillespie stochastic simulation algorithm in equation (3.13), and the chemical Langevin equation (3.14). (B) Distribution of waiting times to switch from the configuration (A high, B low), to (A low, B high), for both models. For both panels, the same parameter combination as in figure 7B was used. Both panels analyse simulation output from one simulation of each model that was run for a duration of T=107 min. The chemical Langevin equation was simulated using an Euler–Maruyama scheme with a timestep of 1 min. To calculate waiting times, each time series was smoothened by a sliding average of 1000 min. Then, the length of any time window was recorded in which the smoothened time series of the A concentration was greater than 4 cu, and where the smoothened concentration of B was less than 4 cu. The histogram is generated from all waiting times recorded in this way.
In summary, our chemical Langevin approximation is applicable and accurate for other gene regulatory networks, following the theory provided in electronic supplementary material, §S1.
Summary and discussion
The importance of noise in gene regulatory networks is widely recognized [9,20,32]. Here, we investigated how noise due to transcriptional bursting affects the dynamics of the broadly studied auto-negative feedback motif and found that bursting can induce oscillations when they are not otherwise present and enhance oscillations that may already be present in the system. We provided computationally efficient approximations of the system dynamics and illustrated the utility of our approximation through analytical derivations of the power spectrum of the observed oscillations. Simulations of the chemical Langevin equation are multiple orders of magnitude faster than those of the full system. Our calculations confirm mathematically that transcriptional bursting alone can induce oscillations and that transcriptional bursting can enhance the amplitude of oscillations if these are already present due to copy-number noise. We investigated the breaking point of our extended chemical Langevin equations and related it to biological measurements of transcriptional burst frequencies and showed how our theory can be extended to other gene regulatory networks.
Our models considered a single active promoter for each gene. Our approach is nonetheless applicable to systems with more active promoters, i.e. genes with more than one active allele in haploid or even multiploid cells [33,63,64]. This would be implemented by assuming more transcriptional states, representing the different combinations of promotor activity on each allele. We further highlight that the observed oscillations in our model are only visible in the stationary regime, i.e. when a gene is constitutively expressed, and cannot be used to describe genes that recently turned on, for example.
Varying definitions of burstiness exist in the literature. The concept was originally introduced to describe mRNA production that is discontinuous in time [20,65]. This is the phenomenon we seek to study here. Other authors exclusively consider the gene to undergo transcriptional bursting if the OFF periods of the promoter are much longer than the ON periods [21,66,67]. Our approximations are applicable under both of these definitions of burstiness, i.e. they can describe systems where OFF periods are much longer than ON periods, but this is not a necessary restriction for the validity of our methods. We are instead focused on the consequences of the random switching between the OFF and the ON states of the promoter, rather than their relative duration.
In the language of transcriptional bursting, our bursting parameter affects the duration of ON periods and of OFF periods of the promoter and, therefore, the bursting initiation rate as well as the burst size, i.e. the number of transcripts per transcriptional burst. This dependency is chosen such that, when varying the bursting parameter , we vary the timescale of bursting, but not the expected relative durations of ON and OFF periods. This joint dependency of both rates on is necessary to ensure that our changes in promoter dynamics do not affect the average transcription rate of the promoter, which is known to impact the oscillatory properties of the model [45].
Following these design choices, the limits and allow us to construct situations in which either copy-number noise or promotor-bursting noise are ‘turned off’. Being able to do this is a distinct advantage of the model, as it means we can investigate the consequences of each type of noise individually. Identifying protocols to perform analogous investigations experimentally would be challenging and may not be possible with existing technologies. For example, to change the system-size parameter, the experiment of interest would require (i) an increased production of mRNA, (ii) a decreased sensitivity of the promoter to the number of protein molecules and (iii) maintaining an identical burst frequency while doing so. Increased production of mRNA (i) may be achieved through upstream regulation by a relevant transcription factor. However, that may also affect bursting kinetics. Changing the sensitivity to protein may be achieved through binding of protein molecules via dimerization or modifying concentrations of a transcriptional co-regulator. Both of these would need to be done in a way that does not affect the bursting kinetics. Similarly, changing would require changing the bursting kinetics in a way that affects both the burst frequency as well as the burst size, which would possibly require a modification of the promoter or enhancer sequences, as well as a change in upstream regulation. In summary, the model allows us to achieve understanding that would be difficult to obtain by experiments alone.
Our investigation continues a long line of research on the delayed auto-negative feedback motif [8,12,17,26,45–47,55,68–77]. Among this work, [68] pointed out that the model presented in equation (2.12) is insufficient to explain the full amplitude of biologically observed oscillations, and they hence argued that alterations to the model, such as the consideration of protein dimerization or periodic forcing, are necessary to alleviate this discrepancy. Our results suggest that, instead, the presence of transcriptional bursting may be sufficient to explain an increase in oscillation amplitude, a result in line with previous findings on a bursting model with negative feedback [78]. We expect the auto-negative feedback motif to remain relevant as auto-repression is abundant in gene regulatory networks, suggesting the existence of yet unknown oscillators [79].
Our work contributes to existing efforts to efficiently simulate the dynamics of gene regulatory networks, such as the methods presented by [80]. Simulating chemical Langevin equations is generally much less demanding in terms of computing time than the simulation of the full reaction system from which the chemical Langevin equation is derived. Where our approximation is applicable, this means that equations such as equation (3.1) are a viable approach to studying the role of transcriptional bursting in natural gene regulatory networks. For example, high-throughput simulations of the chemical Langevin model can be used to infer model parameters from experimental data [69,81].
When deriving our extended chemical Langevin equation in electronic supplementary material, §S1, we introduce the promoter states as environmental states in a chemical reaction system. The response of chemical reaction systems to rapidly switching environments in this way was previously derived in [51,82]. This work also included levels of approximation for large copy numbers or fast environmental switching rates, respectively. In these studies, the authors used a Kramers–Moyal expansion to arrive at an extended chemical Langevin equation. Our analysis expands this previous work by providing an alternative derivation of the extended chemical Langevin equation in the style of [29]. This derivation is advantageous to us over the Kramers–Moyal expansion, as it allows a natural inclusion of delayed terms. Without this inclusion of delayed terms, our discussion of the delayed auto-negative feedback motif would not have been possible.
Our work opens up possibilities in a wider range of areas. Copy-number fluctuations are known to generate oscillations not only in gene regulatory circuits but also in other biochemical systems, in models of epidemic spread and in evolutionary game theory [16,83,84]. Such stochasticity, sometimes called ‘demographic noise’ in other areas, can also provoke spatial patterns or waves [85–87]. It is conceivable that these phenomena can also be induced by extrinsic noise or be enhanced by it. Further work is required to investigate this possibility.
In the context of gene regulatory networks, we hope that our discussion of the extended chemical Langevin equation, as well as the accompanying approximations for large or finite burst frequencies or system sizes, will facilitate the wider study of gene regulation and cell fate.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Davidson EH. 2010 Emerging properties of animal gene regulatory networks. Nature 468, 911–920. (10.1038/nature 09645)21164479 PMC 3967874 · doi ↗ · pubmed ↗
- 2Patke A, Young MW, Axelrod S. 2020 Molecular mechanisms and physiological importance of circadian rhythms. Nat. Rev. Mol. Cell Biol. 21, 67–84. (10.1038/s 41580-019-0179-2)31768006 · doi ↗ · pubmed ↗
- 3Foster RG. 2020 Sleep, circadian rhythms and health. Interface Focus 10, 20190098. (10.1098/rsfs.2019.0098)32382406 PMC 7202392 · doi ↗ · pubmed ↗
- 4Papagiannakis A, Niebel B, Wit EC, Heinemann M. 2017 Autonomous metabolic oscillations robustly gate the early and late cell cycle. Mol. Cell 65, 285–295. (10.1016/j.molcel.2016.11.018)27989441 · doi ↗ · pubmed ↗
- 5Kalsbeek A, Fliers E. 2013 Daily regulation of hormone profiles. In Circadian clocks handbook of experimental pharmacology (eds A Kramer, M Merrow), pp. 185–226. Berlin, German: Springer. (10.1007/978-3-642-25950-0_8)23604480 · doi ↗ · pubmed ↗
- 6Miao Y, Pourquié O. 2024 Cellular and molecular control of vertebrate somitogenesis. Nat. Rev. Mol. Cell Biol. 25, 517–533. (10.1038/s 41580-024-00709-z)38418851 PMC 11694818 · doi ↗ · pubmed ↗
- 7Kageyama R, Niwa Y, Shimojo H, Kobayashi T, Ohtsuka T. 2010 Ultradian oscillations in Notch signaling regulate dynamic biological events. In Current topics in developmental biology notch signaling, pp. 311–331. San Diego, CA, USA: Academic Publishing. (10.1016/s 0070-2153(10)92010-3)20816400 · doi ↗ · pubmed ↗
- 8Phillips NE et al. 2016 Stochasticity in the mi R-9/Hes 1 oscillatory network can account for clonal heterogeneity in the timing of differentiation. e Life 5, e 16118. (10.7554/elife.16118)27700985 PMC 5050025 · doi ↗ · pubmed ↗
