Efficient approximations of transcriptional bursting effects on the dynamics of a gene regulatory network

Jochen Kursawe; Antoine Moneyron; Tobias Galla

PMC · DOI:10.1098/rsif.2025.0170·June 25, 2025

Efficient approximations of transcriptional bursting effects on the dynamics of a gene regulatory network

Jochen Kursawe, Antoine Moneyron, Tobias Galla

PDF

Open Access

TL;DR

This paper shows how transcriptional bursting can change gene network dynamics and introduces efficient methods to model it.

Contribution

A novel extension of the chemical Langevin equation to include transcriptional bursting noise is derived.

Findings

01

Transcriptional bursting can induce or amplify oscillations in gene regulatory networks.

02

The extended Langevin equation reduces computation time while preserving analytical tractability.

03

The study provides guidelines for when and how to apply different approximation methods.

Abstract

Mathematical models of gene regulatory networks are widely used to study cell fate changes and transcriptional regulation. When designing such models, it is important to accurately account for sources of stochasticity. However, doing so can be computationally expensive and analytically untractable, posing limits on the extent of our explorations and on parameter inference. Here, we explore this challenge using the example of a simple auto-negative feedback motif, in which we incorporate stochastic variation due to transcriptional bursting and noise from finite copy numbers. We find that transcriptional bursting may change the qualitative dynamics of the system by inducing oscillations when they would not otherwise be present, or by magnifying existing oscillations. We describe multiple levels of approximation for the model in the form of differential equations, piecewise-deterministic…

Linked entities

Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.

Species6

Saccharomyces cerevisiae(baker's yeast · species)Mus musculus(house mouse · species)Escherichia coli(E. coli · species)Human immunodeficiency virus 1(no rank)Homo sapiens(human · species)Drosophila melanogaster(fruit fly · species)

Cell lines1

HeLa— Homo sapiens (Human) · Human papillomavirus-related endocervical adenocarcinoma · Cancer cell line

Chemicals1

A

Figures8

Click any figure to enlarge with its caption.

Schematic of the transcriptional auto-repressive feedback loop. Our model includes the effect of a promoter that can be in an active or an inactive state (on/off), as well as the dynamics of mRNA and protein. These are subject to transcription, translation and degradation. The promoter can be inactivated by the presence of protein. The two dashes on the arrow for transcription denote the presence of a transcriptional delay τ.

Oscillatory gene expression can be induced and amplified by transcriptional bursting. ‘Full model’ refers to equation (2.1), ‘bursting noise only’ refers to equation (2.13), valid in the limit Ω→∞ and λ remaining finite and ‘copy-number noise only’ refers to equation (2.11) for infinitely fast transcriptional burst frequencies (λ→∞, Ω finite). The deterministic model refers to equation (2.12), valid in the combined limit Ω→∞ and λ→∞. Parameters are according to figure 6e of [46]. In detail they are: αM = 39.93 cu min−1, αP=21.56 min−1, μm=(ln2/30) min−1, μp=(ln2/90) min−1, h=4.78, P0=24201.01 cu and τ=33 min. We use Ω=1 for the full model and the copy-number noise-only model, whereas the remaining models assume Ω=∞. Similarly, we use λ=0.1 min⁣−1 for the full model and the bursting noise-only model, where the remaining models assume λ=∞. Details of the computational implementation are available in electronic supplementary material, §S3.

Stochastic differential equations accurately describe protein dynamics. For the following models, we show individual trajectories in (A) and power spectra in (B). The full model is a Gillespie simulation of equation (2.1), ‘CLE’ stands for chemical Langevin equation (see equation (3.1)), and LNA stands for ‘linear-noise approximation’ (equation (3.4)). ‘LNA theory’ refers to the analytically obtained power spectrum in equation (3.8). The parameters used are the same as in figure 2, except that here Ω=100 and λ=10min−1. All three models exhibit noise-induced oscillations and their power spectra match. Note that ω represents angular frequencies. To obtain the power spectra, 100 trajectories of each model are generated. They are simulated for T = 10 000 min, and the first 200 min are discarded to account for equilibration. Further details of the computational implementation are available in electronic supplementary material, §S3.

Approximation by stochastic differential equations breaks down for small burst frequencies. (A) standard deviations of the concentration of protein molecules, ΣP, in the stationary state as a function of the bursting parameter, λ, for multiple models. The labels ‘Full model’, ‘CLE’ and ‘LNA’ are defined in figure 3. Additionally, we include the analytically calculated standard deviation from equation (3.11) as ‘LNA theory’. (B) The relative error rΣ according to equation (3.12) for all approximations considered in (A). Parameters are the same as in figure 2, with Ω=100. For values of λ with λ≤0.1, two trajectories are created to calculate ΣP and rΣ. For values of λ with λ>0.1, 20 trajectories are used. Each trajectory is simulated for T=105 min, and the first 2000 min are discarded. Further details of the computational implementation are available in electronic supplementary material, §S3.

Relative error rΣ of standard deviations of the chemical Langevin equation (3.1) compared to the full model in equation (2.1). The relative error is calculated using equation (3.12). The black cross represents the grid point λ=0.1 min⁣−1 and Ω=1. Parameters are the same as in figure 2. For grid points where λ≤0.1 min⁣−1, two trajectories are created to calculate rΣ. For grid points where λ>0.1 min⁣−1, 20 trajectories are created. Each trajectory is simulated for 105 min, and the first 2000 min are discarded. Details of the computational implementation are available in electronic supplementary material, §S3.

Schematic of the toggle-switch model. The model describes two protein concentrations A and B and the effects of separate promoters for each. These can be in an active or an inactive state (on/off). Both proteins can be degraded. The promoter of each protein can be inactivated by the presence of the respective other protein.

Noise-induced transitions in the toggle-switch model. (A) A representative simulation of the toggle-switch model in equation (3.13) using the parameter combination P0=3 cu, h=2, μ=0.1 min⁣−1, α=1 cu min−1, λ=100 min⁣−1 and Ω=100, simulated using the Gillespie stochastic simulation algorithm [28]. (B) A representative simulation of the toggle-switch model equation (3.13) using the same parameters as in (A), except that now λ=1 min⁣−1. (C) A representative simulation of the chemical Langevin equation for the toggle-switch model, equation (3.14), using the same parameters as in (A), simulated using an Euler–Maruyama scheme and a timestep of 0.01 min. The dynamics are similar to (A), except that the stationary concentration profiles of A and B are swapped. (D) A representative simulation of the chemical Langevin equation for the toggle-switch model, equation (3.14), using the same parameters as in (B).

Chemical Langevin equations approximate the toggle-switch dynamics well. (A) Stationary distribution of the concentration of A molecules for simulations of the Gillespie stochastic simulation algorithm in equation (3.13), and the chemical Langevin equation (3.14). (B) Distribution of waiting times to switch from the configuration (A high, B low), to (A low, B high), for both models. For both panels, the same parameter combination as in figure 7B was used. Both panels analyse simulation output from one simulation of each model that was run for a duration of T=107 min. The chemical Langevin equation was simulated using an Euler–Maruyama scheme with a timestep of 1 min. To calculate waiting times, each time series was smoothened by a sliding average of 1000 min. Then, the length of any time window was recorded in which the smoothened time series of the A concentration was greater than 4 cu, and where the smoothened concentration of B was less than 4 cu. The histogram is generated from all waiting times recorded in this way.

Tables1

Table 1.. Overview of the different descriptions of the combined system of transcriptional bursting and protein–mRNA dynamics. Further details are available in the main text.

	finite copy numbers (Ω < ∞)	infinite copy numbers (Ω < ∞)
finite transcriptional bursting parameter (λ < ∞)	copy number noise and transcriptional bursting noise full model, equation (2.1)	only transcriptional bursting noise piecewise-deterministic dynamics, equation (2.13)
infinite transcriptional bursting parameter (λ < ∞)	only copy-number noise reaction system in equation (2.11)	no noise, delay differential equation (2.12)

Equations28

Funding1

—María de Maeztu Unit of Excellence

Keywords

gene regulatory networksgene expression oscillationstranscriptional burstingfast-switching environmentspiecewise-deterministic Markov processeschemical Langevin equations

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsGene Regulatory Network Analysis · Evolution and Genetic Dynamics · stochastic dynamics and bifurcation

Full text

Introduction

Gene regulatory networks are at the heart of many developmental and physiological processes and are a key component of cell fate decisions [1]. The expression of proteins participating in gene regulatory networks, i.e. transcription factors, is inherently dynamic. A fundamental example of dynamic transcription factor expression is gene expression oscillators, such as the circadian clock [2–5], or rapid ultradian oscillators governing the formation of somites [6]. Rapid ultradian gene oscillators also encode information necessary for cell-state transitions, such as the differentiation of neural progenitor cells into neurons or glial cells [7,8].

It is surprising that dynamic signatures of gene expression, such as oscillations, are correctly established and interpreted when considering the prevalence of stochastic effects [9]. The copy numbers of protein and mRNA molecules in gene regulatory networks are often sufficiently small that simple mathematical approaches based on deterministic rate equations fail to provide even a qualitatively accurate description [10–13]. For small copy numbers, the intrinsic stochasticity of transcription and translation processes and the degradation of protein and mRNA cannot be neglected. This raises the key question whether noise is detrimental to the functioning of gene regulatory processes, or if indeed, stochasticity can be exploited [14]. An example of the latter is stochastic amplification, in which a stochastic dynamical system may show persistent oscillations in circumstances when its corresponding deterministic system does not [15–17].

Gene regulatory networks are subject to other types of stochasticity, beyond copy-number noise. For example, conditions external to the cellular environment may vary in time or cell divisions might be uneven [18,19]. One additional source of randomness is transcriptional bursting, describing the phenomenon that mRNA production is discontinuous in time and occurs in bursts of transcriptional activity. The size and timing of such bursts varies stochastically [20,21].

Mathematical models are a widely used tool to study and identify gene regulatory networks [22–24], and their description through the law of mass action [25–27] can easily be extended to account for copy-number noise through the Gillespie stochastic simulation algorithm [28] or the chemical Langevin equation [29,30]. It is an ongoing challenge to understand how gene regulatory networks integrate other types of noise, such as transcriptional bursting, to robustly establish dynamical signatures like oscillations and cell fate transitions.

To mathematically describe experimental observations on transcriptional bursting of a single gene (i.e. not considering responses within wider gene regulatory networks), the so-called telegraph model has been widely used. In this simplest model of transcriptional bursting, the gene randomly switches between states of active and inactive transcription [31,32]. Considerable effort has been expended to parametrize versions of the telegraph model using experimental data and a recent overview of such approaches is for example available from Luo et al. [33]. Typically, the comparison between models and data is performed using steady-state distributions of mRNA copy numbers, which can be predicted from the model and experimentally measured using scRNA-seq data [34] or smFISH [35]. In some cases, bursting dynamics are directly observed through live-imaging and used to identify parameters of telegraph models and related motifs [36–39].

In addition to these efforts at the level of a single gene, many studies investigate the effects of transcriptional bursting into wider gene regulatory networks. An example is inference approaches that take regulatory interactions into account when inferring properties of transcriptional bursting [40,41]. Recent work by [42] points out that co-fluctuations between transcription factors and their targets may inform inference, and earlier works by [43] and [44] have discussed in detail how molecular processes of transcriptional as well as translation bursting may be incorporated into stochastic descriptions of gene regulatory networks. These efforts often rely on exact simulations of the stochastic dynamics, which can be computationally costly especially when copy numbers of the involved molecules are large. In addition, exact descriptions of the stochastic processes involved are typically not analytically tractable. These shortcomings limit the potential of stochastic simulations in studying gene regulatory networks.

Here, we investigate the delayed auto-negative feedback motif, in which we incorporate the effects of transcriptional bursting as well as noise due to finite copy numbers. Using the example of this motif, we introduce multiple levels of approximations of gene regulatory dynamics valid in the individual or joint limits of large copy numbers or fast bursting. As part of this effort, we extend the classical chemical Langevin equation to account for the Gaussian effects of transcriptional bursting in gene regulatory networks. We show that stochastic variation due to transcriptional bursting can enable oscillations when they are not otherwise present or enhance oscillations if they are already present. Our approximations allow us to analytically determine the power spectrum of the oscillations and investigate for what system sizes and burst frequencies our approximation becomes invalid. Our derivations of the extended chemical Langevin equation are provided in general form. These equations are approximations that drastically reduce computation times and are applicable to any gene regulatory network and many models of transcriptional bursting.

The remainder of the paper is organized as follows. In the model description of §2, we define the model of the auto-repressive network of a single gene, explicitly including finite copy numbers and transcriptional bursting, as well as transcriptional delay. We demonstrate that our model simplifies to previously published versions of the model in the limit of infinitely fast transcriptional bursting and discuss the alternative limiting case where copy-number noise is absent. In the results §3, we first discuss the effect of transcriptional bursting on the system dynamics before presenting our main analytical results. Through a systematic expansion in the inverse noise strength, we are able to calculate the Fourier spectra of oscillations induced by copy-number noise and/or noise due to bursting. We proceed to test these theoretical predictions against simulations and assess the validity of the various reduced models for different regimes of transcriptional bursting and intensity of copy-number noise. We finally include a discussion in §4. Further details of the mathematical analysis are collated in the electronic supplementary material.

Model description

Gene regulatory network with negative autoregulation and transcriptional bursting

2.1.

We start by introducing a simple model of a transcription factor subject to auto-negative feedback (figure 1). Importantly, this model accounts for transcriptional bursting as well as stochastic variation due to low copy numbers. The model extends previous work [17,45] and includes the effects of transcription, translation and degradation of mRNA and protein. Motivated by the classical telegraph model of transcription, the promoter of the gene can assume two states: one state in which transcription is active with a constant rate, and one state in which the transcription does not occur. The promoter switches between these two states stochastically, and the rate of switching into the OFF state depends on the amount of protein present in the system. This simulates the auto-negative feedback, whereby a high abundance of protein can inhibit transcription.

Schematic of the transcriptional auto-repressive feedback loop. Our model includes the effect of a promoter that can be in an active or an inactive state (on/off), as well as the dynamics of mRNA and protein. These are subject to transcription, translation and degradation. The promoter can be inactivated by the presence of protein. The two dashes on the arrow for transcription denote the presence of a transcriptional delay τ.

Mathematically, our model comprises a set of chemical reactions between protein and mRNA molecules and their interactions with the promoter state. The number of each type of particle is discrete. We write $[eqn]$ for the number of protein molecules and $[eqn]$ for the number of mRNA molecules in a cell. They relate to their concentrations $[eqn]$ and $[eqn]$ via $[eqn]$ and $[eqn]$ , where $[eqn]$ is the nuclear volume. For convenience, we write $[eqn]$ , where we call $[eqn]$ the system size in line with chapter 10 of [30]. By varying the size of the nucleus, $[eqn]$ controls how many molecules may be expected in the system at constant concentrations. Throughout the paper, we will use parameter values identified in [46], and we choose $[eqn]$ , which is the average nuclear volume in that publication. We will measure concentrations in concentration units ‘cu’, which correspond to one molecule per volume $[eqn]$ .

Following [12,13], our model is

[eqn]

where $[eqn]$ and $[eqn]$ stand for the promoter states in which transcription of the gene are ON and OFF, respectively. We refer to the parameter $[eqn]$ in the first two lines in equation (2.1) as the bursting parameter. It controls the time scale of switching between the two promoter states. Note that the reaction rate from the promoter OFF into the ON state is in the literature often referred to as burst frequency, which is equal to $[eqn]$ in our parametrization [21,38]. $[eqn]$ is a so-called Hill coefficient. We refer to the fixed model parameter $[eqn]$ as the repression threshold, which controls the amount of protein required to inhibit transcription. The third and fourth reactions describe degradation of mRNA and protein, respectively, with degradation rates $[eqn]$ and $[eqn]$ . For convenience, we introduce the mathematical variable $[eqn]$ to indicate if the promoter is in the ON state ( $[eqn]$ ) or the OFF state ( $[eqn]$ ) at any one time $[eqn]$ . We draw attention to the rate $[eqn]$ for the penultimate reaction, which describes transcription. This reaction rate indicates that a transcription process can only be initiated at a given time $[eqn]$ if the promoter is in the ON state ( $[eqn]$ ). The rate of transcription is given by the basal transcription rate $[eqn]$ . Note that an alternative way to write the transcription reaction without the use of the $[eqn]$ variable would be

[eqn]

Here, we instead use the notation in equation (2.1), as the variable $[eqn]$ will be useful in our calculations.

The resulting mRNA molecule of a transcription event that is triggered at time $[eqn]$ enters the system at time $[eqn]$ . This delay is indicated by the double arrow in the reaction system and simulates the time required for transcription and subsequent transport of the mRNA out of the nucleus [45,47,48]. The final reaction describes the translation of mRNA into protein, with rate $[eqn]$ .

The number of reactions of each type that occur in each time interval is proportional to $[eqn]$ , $[eqn]$ or $[eqn]$ . As a consequence, the copy numbers of mRNA and protein particles in the system are proportional to either the system-size parameter $[eqn]$ . We can use this parameter to control the strength of copy-number noise in our analysis, as stochasticity in the rate of each reaction will decrease with the system size.

We highlight that our calculations are consistent with other mechanisms that upregulate the expected number of molecules in the system by a factor $[eqn]$ . In fact, biological cells can have vastly different expression levels for individual proteins without an increased size of the nucleus. Scaling by $[eqn]$ as in our system can be used to model such increases, although we point out that $[eqn]$ and $[eqn]$ in this case are not technically concentrations. Rather, increases in $[eqn]$ would correspond to an effective change of the unit of concentration towards higher density. In our model, that would be achieved by increasing the basal transcription rate and the repression threshold by a factor $[eqn]$ , as this would be mathematically equivalent to an increase in volume as before.

We chose the form of the transition rate into the promoter OFF state of $[eqn]$ . This reaction rate reflects the assumption that, biophysically, a promoter requires binding with $[eqn]$ protein molecules to switch into its OFF state. This can be seen as follows.

Considering a promoter that can bind $[eqn]$ protein molecules, the promoter may assume an unbound state $[eqn]$ and a bound state $[eqn]$ , where the latter corresponds to the state in which all $[eqn]$ binding sites are occupied. The reaction kinetics of promoter binding and unbinding are then as follows:

[eqn]

where $[eqn]$ and $[eqn]$ are binding constants. The left-hand side describes the unbound state and $[eqn]$ free protein molecules. On the right-hand side, all $[eqn]$ protein molecules are bound. In this reaction system, the rate of the reaction to the OFF state is given by

[eqn]

and that of the reaction to the ON state is given by

[eqn]

We now define $[eqn]$ and $[eqn]$ . Under these definitions, reaction (2.3) is equivalent to the first two reactions in equation (2.1). In doing so, we parametrize the promoter dynamics such that both transition rates between the OFF and ON states depend on $[eqn]$ . As a consequence, the mean probability to be in the ON or OFF state does not depend on $[eqn]$ , keeping the time-averaged transcription rate constant as we vary the bursting parameter $[eqn]$ . This allows us to isolate effects of promoter dynamics from those of changes in transcription rates.

Our model simplifies the limit of infinitely fast transcriptional bursting

2.2.

We next consider the limit of infinitely fast bursting parameter, $[eqn]$ . We write $[eqn]$ and $[eqn]$ for the probabilities to find the promoter in its unbound and bound states, respectively. Reflecting the limit $[eqn]$ , we assume that switching between the promoter states is sufficiently fast to reach equilibrium while a given concentration $[eqn]$ of protein molecules is present in the system. In this equilibrium of promoter binding and unbinding, we have

[eqn]

in addition to

[eqn]

Hence, in this stationary state, the probability of finding the promoter in its bound OFF state is

[eqn]

The probability of finding the promoter in the unbound ON state is

[eqn]

The average transcription rate is then given by

[eqn]

If the promoter switching is fast relative to the remaining reactions, i.e. when $[eqn]$ , fluctuations around this average value will diminish, and the system can be approximated as

[eqn]

The function $[eqn]$ is a decreasing function of the protein concentration $[eqn]$ , reflecting auto-repression and the degree of nonlinearity of the function, i.e. its sensitivity to changes in protein molecule numbers is regulated by the Hill coefficient $[eqn]$ . As before, the double arrow in equation (2.11) denotes the presence of the transcriptional delay $[eqn]$ . These reactions were previously introduced by [45], and they have since then been used widely as a template model for auto-repressive feedback [8,12,17,49]. Our model in figure 1 and equation (2.1) is intentionally designed so that the promoter behaves as in the widely used telegraph model for transcriptional bursting, while also achieving the typically used Hill function (2.10) for the transcription rate for the case of infinitely fast bursting. This allows us to analyse how the dynamics of the model (2.11) are modified if transcriptional bursting is taken into account. The description in terms of the Hill function is valid, provided the promoter switches between the ON and OFF states much more quickly than the remaining time scales in the reaction system.

A deterministic description of these dynamics is given by the following rate equations (see also [17,45]) for the concentrations $[eqn]$ and $[eqn]$ :

[eqn]

We note the delay term in the first equation. This description does not describe fluctuations due to finite copy numbers or transcriptional bursting and is formally only valid in the joint limit of $[eqn]$ and $[eqn]$ .

Limiting cases distinguish sources of noise in the system: copy-number noise versus noise due to transcriptional bursting

2.3.

We can distinguish between two types of randomness in the model in equation (2.1). The first is copy-number noise. By this we mean the intrinsic stochasticity in the protein and mRNA dynamics, described by the last four reactions in (2.1). This randomness is present even at fixed promoter state $[eqn]$ and becomes more pronounced when there are only small numbers of molecules in the system. More precisely, the amplitude of intrinsic noise scales as $[eqn]$ for large values of the system-size parameter $[eqn]$ ([30, chapter 10] or [15] and [16]). This is a consequence of the central limit theorem. Broadly, the mean number of reactions in the system per unit time is proportional to $[eqn]$ , and the standard deviations of the number of reactions is of order $[eqn]$ . Relative fluctuations therefore scale as $[eqn]$ .

The second type of noise arises from transcriptional bursting, induced by the random switching between the ON and OFF states in the first two reactions in (2.1). The promoter dynamics operate on time scales that are set by the bursting parameter $[eqn]$ . For increasingly pronounced timescale separation, i.e. for very fast transcriptional burst frequencies ( $[eqn]$ ), the transcriptional bursting effects average out, and thus the effects of transcriptional burstiness diminish. Using again the central limit theorem, the relative fluctuations due to transcriptional bursting scale as $[eqn]$ . The reactions in equation (2.1) describe the full model, for general values of $[eqn]$ and $[eqn]$ . This set-up therefore captures both copy-number noise and noise due to transcriptional bursting.

Having considered the limit $[eqn]$ , equation (2.11), in which only copy-number noise and no bursting noise is present, and the joint limit $[eqn]$ , equation (2.12) in which neither noise is present, it remains to cover the limit $[eqn]$ , in which bursting noise is present, but copy-number noise is not. In this limit, the dynamics of protein and mRNA may be described by ordinary differential equations. However, for general values of $[eqn]$ , the promoter dynamics remain stochastic. This leads to a description similar to what is known as a piecewise-deterministic Markov process [50,51]. Specifically, in this limit one has

[eqn]

for the dynamics of mRNA and protein. The switching of the promoter state is as before, that is, transitions from $[eqn]$ to $[eqn]$ occur with the bursting parameter $[eqn]$ , and the promoter switches off ( $[eqn]$ ) at time $[eqn]$ with rate $[eqn]$ . We refer to equation (2.13) as the equation for the piecewise-deterministic pseudo-Markov process. This approximation has recently been analysed in the stationary regime [52].

We note the term $[eqn]$ in the first equation in (2.13). This describes the process of mRNA production, subject to transcription delay. Production of mRNA molecules at time $[eqn]$ only occurs if the promoter was in the ON state ( $[eqn]$ ) at time $[eqn]$ . If on the other hand $[eqn]$ , then no new mRNA molecules enter the system at time $[eqn]$ .

The different descriptions of the system are summarized in table 1.

Results

Oscillatory gene expression can be induced and amplified by transcriptional bursting

3.1.

Our model (2.1) extends the widely used model of auto-negative feedback (2.11) to take transcriptional bursting into account. This allows us to ask how transcriptional bursting may affect the dynamics of the feedback loop. To this end, we simulate the full model (2.1), including bursting noise and copy-number noise, as well as the limiting cases with only one type of noise (equations (2.11) and (2.13)), and no noise (equation (2.12)). We choose experimentally informed parameter values for $[eqn]$ and $[eqn]$ , as well as the parameters $[eqn]$ and the transcription delay $[eqn]$ from mouse neural progenitor cells reported in [46] and detail our simulation algorithms in electronic supplementary material, §S3. As an initial example, we choose $[eqn]$ and bursting parameter $[eqn]$ min $[eqn]$ . Intriguingly, we find that the presence of transcriptional bursting can strongly enhance oscillations in the system (figure 2).

Oscillatory gene expression can be induced and amplified by transcriptional bursting. ‘Full model’ refers to equation (2.1), ‘bursting noise only’ refers to equation (2.13), valid in the limit Ω→∞ and λ remaining finite and ‘copy-number noise only’ refers to equation (2.11) for infinitely fast transcriptional burst frequencies (λ→∞, Ω finite). The deterministic model refers to equation (2.12), valid in the combined limit Ω→∞ and λ→∞. Parameters are according to figure 6e of [46]. In detail they are: αM = 39.93 cu min−1, αP=21.56 min−1, μm=(ln2/30) min−1, μp=(ln2/90) min−1, h=4.78, P0=24201.01 cu and τ=33 min. We use Ω=1 for the full model and the copy-number noise-only model, whereas the remaining models assume Ω=∞. Similarly, we use λ=0.1 min⁣−1 for the full model and the bursting noise-only model, where the remaining models assume λ=∞. Details of the computational implementation are available in electronic supplementary material, §S3.

Specifically, we plot in figure 2 the concentration of protein molecules $[eqn]$ as a function of time for each model. The dashed line in each panel of figure 2 is obtained in the absence of either type of noise, i.e. from the deterministic model in equation (2.12). As seen in the figure, the concentration of protein shows decaying oscillations and settles to a stable fixed point. Thus, without noise, there is no persistent oscillatory behaviour for this parameter set.

Considering simulations of the full model, comprising both copy-number noise and transcriptional bursting (equation (2.1)), we find that the time series shows persistent oscillations. Given that these oscillations are absent in the fully deterministic model (dashed line), the oscillatory behaviour must be noise-induced.

Next, we consider the piecewise-deterministic model obtained in the limit of infinite copy numbers (equation (2.13)), i.e. the model in which transcriptional bursting occurs, but copy-number noise is neglected (‘bursting noise only’ in figure 2). We again observe noise-induced oscillatory behaviour with a slightly lower amplitude than in the full model.

Finally, we consider the limit of infinitely fast bursting dynamics and finite copy numbers, as in equation (2.11) (‘only copy-number noise’ in figure 2). We again observe noise-induced oscillations. As these oscillations represent the result of an amplification of copy-number noise only, they are similar to what was reported in other cases of stochastic amplification [15–17,46,53]. Notably, the amplitude of the oscillations, in this case, is considerably lower than that in the full model or the ‘bursting noise only’ model, illustrating that the presence of transcriptional bursting changes the qualitative dynamics of the system by amplifying these oscillations.

Variation due to transcriptional bursting can be approximated with stochastic differential equations

3.2.

Our results in figure 2 indicate the importance of considering transcriptional bursting when investigating the dynamics of gene regulatory networks. Our analysis relied on simulations of the full reaction system in equation (2.1). These reactions allow us to simulate the system dynamics under the presence of transcriptional bursting and copy-number noise. They can be simulated using the Gillespie stochastic simulation algorithm [28,54] and mathematically described using the master equation formalism (see, e.g. [30, chapter 7]), or, in our case, variations of it that account for transcriptional delay [17]. However, master equations are notoriously hard to solve, and for analytical derivations, chemical Langevin equations are widely used [17,29]. Chemical Langevin equations have the added benefit of requiring orders of magnitude fewer calculations in each simulation than the Gillespie simulation algorithm.

Unfortunately, the chemical Langevin equation cannot readily be applied to systems with switching promoter states. Here, we build on the theory developed by [51] to demonstrate how the chemical Langevin equation formalism can be applied to incorporate Gaussian approximations of transcriptional bursting. This latter approximation will be valid in the limit of large but finite values of $[eqn]$ .

To derive an effective stochastic differential equation encompassing both copy number and bursting noise, we adapt the procedure first used by Gillespie in [29]. More precisely, we discretize time and then require that the time step is sufficiently small such that reaction rates and burst frequencies remain constant during each step. We simultaneously assume that the time step is large enough such that the number of reactions in each time step follows approximately a Gaussian distribution. These two assumptions can only be simultaneously fulfilled in system with large system sizes, $[eqn]$ , and large burst frequencies, $[eqn]$ . Under these assumptions, it is possible to analytically calculate the mean and variance of the Gaussian increments at each timestep, and the definition of a stochastic differential equation emerges. Our derivation is described in full detail in electronic supplementary material, §S1, where we provide formulas relating to model (2.1) as well as for more general gene regulatory networks.

We obtain the following extended chemical Langevin equations:

[eqn]

with Gaussian white noise variables $[eqn]$ and $[eqn]$ , each of mean zero, and with

[eqn]

We note two contributions to $[eqn]$ , one proportional to $[eqn]$ , and representing copy-number noise, and a second proportional to $[eqn]$ , representing noise due to transcriptional bursting. Protein production and decay are not dependent on the promoter state, and hence the noise variable $[eqn]$ originates purely from copy-number noise. Its variance is therefore proportional to $[eqn]$ . The last expression in equation (3.2) indicates that there are no correlations between $[eqn]$ and $[eqn]$ . This is because no reaction in equation (2.1) changes both the number of protein molecules and that of mRNA molecules.

The extended chemical Langevin equation (3.1) provides a possible approximation of the full model in equation (2.1). The dynamics of the chemical Langevin equation are dictated by two coupled stochastic differential equations. This is a drastic simplification of the full reaction system, which is described by an extended chemical master equation that accounts for the presence of a delay [17,55]. Mathematically, this chemical master equation constitutes an infinitely large set of coupled ordinary differential equations [30, chapters 5 and 7]. Nonetheless, the chemical Langevin equation is nonlinear, making it analytically intractable. An analytically tractable equation can be obtained by linearizing. This involves one further step of approximation and results in the so-called linear-noise approximation.

Our formulation of the linear-noise approximation applies to parameters for which the deterministic system in equation (2.12) tends to a stable fixed point $[eqn]$ . We linearize about this fixed point by writing

[eqn]

We note that $[eqn]$ and $[eqn]$ describe fluctuations of particle concentrations (as opposed to fluctuations of particle numbers) about the deterministic fixed point.

Assuming that $[eqn]$ and $[eqn]$ are small quantities, we then obtain the following linear stochastic delay differential equations:

[eqn]

where

[eqn]

A key element of the linear-noise approximation is to replace the second moments of the noise variables in the chemical Langevin equation, equation (3.2), with the values of these expressions at the deterministic fixed point. This leads to Gaussian white noise variables $[eqn]$ and $[eqn]$ with properties

[eqn]

where

[eqn]

Stochastic differential equations accurately describe protein dynamics

3.3.

As an initial test of this approximation, we plot individual trajectories of protein concentration $[eqn]$ for the full model (equation (2.1)), the chemical Langevin equation (3.1) and the linear-noise approximation (3.4) in figure 3A. We choose large values of $[eqn]$ and $[eqn]$ for this test, as that is the regime for which we expect the approximation to work well. Specifically, we use $[eqn]$ and $[eqn]$ min $[eqn]$ . All three models exhibit oscillations, indicating that the chemical Langevin equation can capture the effect of noise-induced cycles. Promisingly, all three trajectories appear to have similar qualitative properties, such as the period and amplitude of the oscillations. The potential benefit of using the chemical Langevin equation becomes immediately obvious when comparing computation times: using our implementation, generating trajectories of the chemical Langevin equation for this parameter combination is 463 times faster than generating trajectories of the full model.

Stochastic differential equations accurately describe protein dynamics. For the following models, we show individual trajectories in (A) and power spectra in (B). The full model is a Gillespie simulation of equation (2.1), ‘CLE’ stands for chemical Langevin equation (see equation (3.1)), and LNA stands for ‘linear-noise approximation’ (equation (3.4)). ‘LNA theory’ refers to the analytically obtained power spectrum in equation (3.8). The parameters used are the same as in figure 2, except that here Ω=100 and λ=10min−1. All three models exhibit noise-induced oscillations and their power spectra match. Note that ω represents angular frequencies. To obtain the power spectra, 100 trajectories of each model are generated. They are simulated for T = 10 000 min, and the first 200 min are discarded to account for equilibration. Further details of the computational implementation are available in electronic supplementary material, §S3.

While this visual comparison of trajectories is promising, our next aim must be to evaluate quantitatively how well the two models compare. An ideal measure for this comparison would be one that can characterize the properties of the dynamics visible in figure 3A. Such a measure is available in the form of the Fourier power spectrum of the protein time series. Fourier analysis is a standard method to decompose a time series (such as those in figure 3A) into contributions from oscillatory signals of varying frequencies. The power spectrum summarizes this analysis and captures, for each angular frequency $[eqn]$ , the square of the amplitude, i.e. the strength, of this contribution. Conveniently, the power spectrum of the linear-noise approximation equation (3.4) can be analytically calculated.

We conduct this calculation in the electronic supplementary material, §S1.6. We find

[eqn]

The denominator in these expressions is

[eqn]

For the full model and the chemical Langevin equation, the power spectrum can instead be obtained numerically from many repeated simulation runs.

Comparing the protein power spectra for all three models for the same parameters as before (i.e. figure 6e from [46] and $[eqn]$ min $[eqn]$ , $[eqn]$ ) reveals that all three models have nearly identical spectra (figure 3B), highlighting the accuracy of our approximation and demonstrating that our analytical derivations are valid. Hence, the oscillations of the full model in figure 3 are well described by the analytically obtained power spectrum equation (3.8).

Approximation by stochastic differential equations breaks down for small burst frequencies

3.4.

Our derivation of the extended chemical Langevin equation (3.1) made the assumption that the bursting parameter, $[eqn]$ , and the system size, $[eqn]$ , are large. But what exactly does large mean? Or, in other words, at what values of $[eqn]$ or $[eqn]$ will the approximation by stochastic differential equations break down? To answer this question, we next seek to computationally estimate this limit by analysing the standard deviations of time series of protein concentration,

[eqn]

where $[eqn]$ stands for a time average in the stationary state, i.e. after long waiting times. This quantity is a measure of the amplitude of noise-induced oscillations about a deterministic fixed point. Where the power spectrum is analytically available, the standard deviations can be calculated from the power spectrum $[eqn]$ via the relation

[eqn]

A derivation of this relationship is provided in the electronic supplementary material, §S1.7. We can use equation (3.8) to analytically predict $[eqn]$ from the linear-noise approximation.

We analyse how the standard deviation in protein time series of each model, i.e. the full model (2.1), the chemical Langevin equation (3.1) and the linear-noise approximation equation (3.4), depends on the bursting parameter $[eqn]$ for a fixed system size of $[eqn]$ (figure 4A). The theory derived within the linear-noise approximation captures the standard deviations $[eqn]$ of the full model quantitatively for sufficiently large burst frequencies $[eqn]$ . When bursting is slow on the other hand, significant deviations between the linear-noise approximation and simulations of the full model are observed.

Approximation by stochastic differential equations breaks down for small burst frequencies. (A) standard deviations of the concentration of protein molecules, ΣP, in the stationary state as a function of the bursting parameter, λ, for multiple models. The labels ‘Full model’, ‘CLE’ and ‘LNA’ are defined in figure 3. Additionally, we include the analytically calculated standard deviation from equation (3.11) as ‘LNA theory’. (B) The relative error rΣ according to equation (3.12) for all approximations considered in (A). Parameters are the same as in figure 2, with Ω=100. For values of λ with λ≤0.1, two trajectories are created to calculate ΣP and rΣ. For values of λ with λ>0.1, 20 trajectories are used. Each trajectory is simulated for T=105 min, and the first 2000 min are discarded. Further details of the computational implementation are available in electronic supplementary material, §S3.

To be able to accurately identify the point of deviation between the models, we define the relative error to the full model

[eqn]

where $[eqn]$ indicates the protein standard deviations of the model in question (the chemical Langevin equation (3.1), the linear-noise approximation in equation (3.4) or the theoretical result using equations (3.11) and (3.8)) and $[eqn]$ refers to the protein standard deviation of the full model (2.1).

The linear-noise approximation becomes accurate to at most a $[eqn]$ error in this example roughly when bursts occur at least once per minute ( $[eqn]$ ; figure 4B). Even for much slower burst frequencies, the approximation produces reasonable ball park estimates for the magnitude of fluctuations (and hence for the amplitude of noise-induced oscillations). The relative error between the linear-noise approximation and the full model remains below one for burst frequencies on a time scale of $[eqn]$ . That is, the linear-noise approximation produces estimates that are within a factor of two of the true standard deviations for burst frequencies with time scales of approximately an hour.

Figure 4A also shows that the standard deviations $[eqn]$ reduces for increased burst frequencies $[eqn]$ . This is because the strength of bursting noise scales as $[eqn]$ (to leading order). Thus, for high values of $[eqn]$ , the promoter’s ON/OFF dynamics average out. Fluctuations induced by bursting then become less pronounced in the dynamics of the gene regulatory network. We would not expect $[eqn]$ to approach zero in the limit of infinitely fast bursting ( $[eqn]$ ). This is because the effects of intrinsic noise remain in this limit, so we expect oscillations purely induced by copy-number noise, as observed for example in figure 2 for the model with only copy-number noise.

The system-size parameter is held fixed at $[eqn]$ in figure 4. For very slow bursting dynamics (small values of $[eqn]$ ), we then expect bursting noise to be much stronger than copy-number noise. Oscillations are then predominantly induced by bursting noise. Within the linear-noise approximation, their amplitude can be expected to scale as $[eqn]$ , and the contributions containing $[eqn]$ in equations (3.7) and (3.8) become irrelevant. This is indeed what we see in figure 4A.

Suitability of our approximation is application dependent

3.5.

Our analysis of the breakdown of the approximation in figure 4 assumed that $[eqn]$ , i.e. we only identified a critical value of $[eqn]$ . However, our derivation of the chemical Langevin equation (3.1) required us to assume that both $[eqn]$ and $[eqn]$ are large. Hence, we would expect the accuracy of the extended chemical Langevin equation to depend on both $[eqn]$ and $[eqn]$ , with the possibility that, for each $[eqn]$ , different values of $[eqn]$ are required for accuracy of the approximation.

We analyse the joint dependence of the accuracy of the chemical Langevin equation (3.1) on $[eqn]$ and $[eqn]$ in figure 5, again focusing on our measure of relative error equation (3.12). We find that the predictions from the chemical Langevin equation reproduce the standard deviations of protein fluctuations up to an error of at most $[eqn]$ when $[eqn]$ and $[eqn]$ (figure 5). In the figure, all other parameters are as in figures 2–4, matching those in [46].

Relative error rΣ of standard deviations of the chemical Langevin equation (3.1) compared to the full model in equation (2.1). The relative error is calculated using equation (3.12). The black cross represents the grid point λ=0.1 min⁣−1 and Ω=1. Parameters are the same as in figure 2. For grid points where λ≤0.1 min⁣−1, two trajectories are created to calculate rΣ. For grid points where λ>0.1 min⁣−1, 20 trajectories are created. Each trajectory is simulated for 105 min, and the first 2000 min are discarded. Details of the computational implementation are available in electronic supplementary material, §S3.

While our choice of $[eqn]$ cu together with a value of $[eqn]$ is consistent with experiments [46], the requirement $[eqn]$ indicates that the bursts would have to occur once every 2 min or faster for the chemical Langevin equation to become accurate to an error of less than $[eqn]$ . That leaves us to ask in which regime experimentally measured values for the burst parameter $[eqn]$ lie. In our model equation (2.1), $[eqn]$ represents the rate at which the promoter switches from the OFF to the ON state. This rate varies widely between individual genes, and between organisms. For example, live-imaging in yeast suggested $[eqn]$ of 1−4 min $[eqn]$ [36], whereas measurements on a range of bacterial genes found 0.3 min $[eqn]$ [37]. Measurements in the fruit fly Drosophila indicate values of around 0.15 min $[eqn]$ by [56] and up to 2 min $[eqn]$ by [57]. Bursting in mammalian cells is considered to be slower than in other organisms, with values of around 0.01 min $[eqn]$ reported for a range of genes by [38]. Yet, for HIV-1 RNA in HeLa cells, bursting frequencies of up to 0.6 min $[eqn]$ have been observed [58].

Considering these experimental measurements, we investigate the representative value $[eqn]$ , $[eqn]$ in figure 5. At this parameter combination, protein fluctuations in the extended chemical Langevin model deviate from that in the full model by approximately $[eqn]$ (figure 5). For larger $[eqn]$ , the approximation becomes more accurate; for smaller $[eqn]$ , the approximation eventually breaks down. This illustrates that the applicability of our method will depend on the biological scenario: in some cases, an approximation using our extended chemical Langevin equation will work well. In other situations, it will be more suitable to use a piecewise continuous chemical Langevin equation, i.e. a model following Gillespie’s derivation to approximate copy-number noise [29], but simulating promoter switching explicitly, such as in equation (2.13). For small system sizes $[eqn]$ and small $[eqn]$ , it will be necessary to use the full model, equation (2.1). This is a workable approach, as stochastic simulations using the Gillespie algorithm will be fast in this scenario.

The breaking point of the approximation in our model is at around $[eqn]$ min $[eqn]$ . This breaking point may be different for other transcription factor networks: the timescale of fluctuations of gene regulatory networks is typically set by degradation rates [26, chapters 1−3], and one may expect that the applicability of our Gaussian approximation depends on how the bursting parameter $[eqn]$ relates to degradation rate of the mRNA that is being produced. For more long-lived mRNAs, such as those involved in the circadian clock, slower bursting frequencies may be permissive in our approximation. In a given application, it will hence be beneficial to test the applicability of the chemical Langevin approximation through comparison to test simulations of the corresponding full, unapproximated model.

We expect that the requirement of sufficiently large $[eqn]$ is met for a large number of genes. Specifically, we expect $[eqn]$ to be sufficiently large if the number of protein molecules per nucleus exceeds a few hundred to a few thousand. Our figure 5 illustrates this, as the relative error falls below 5% as $[eqn]$ , which would correspond to around 20 000 protein molecules per nucleus. More than half of all quantified genes in a global screen conducted by [59] fall within that range.

The chemical Langevin equation we derived assumes Gaussian increments in mRNA and protein copy number between consecutive time intervals. In cases where the approximation breaks down when the bursting parameter $[eqn]$ is low, we expect increments to be non-Gaussian, suggesting that slow transcriptional bursting is an inherently non-Gaussian effect. This is consistent with experimental observations that mRNA distributions can be non-Gaussian and non-Poissonian [32].

Stochastic differential equation can approximate other gene regulatory networks

3.6.

In equation (2.1), we chose the rate $[eqn]$ to depend on the current protein concentration $[eqn]$ . This follows a model for repressive transcription factors initially proposed by [26] (see appendix A of the reference) and is designed to generate the typical Hill function (2.10) in the stationary limit. However, alternative functional implementations of the action of a transcription factor are possible. For example, multiple recent studies suggest that transcriptional repressors affect the rate of $[eqn]$ transitions, thus modulating the waiting times in the OFF period (e.g. [60]). Our framework of the chemical Langevin equations to approximate the effect of promoter switching is suitable for such alternative parametrizations, and we illustrate this in electronic supplementary material, §S2 and figures S1 and S2.

Similarly, our method is applicable to other gene regulatory networks. A seminal example of such a gene network is the bistable toggle switch, which has been successfully engineered into Escherichia coli bacteria [61]. The standard model for this switch considers two protein concentrations without representing mRNAs or delays. The two protein concentrations mutually repress each other and can each be degraded. This model can be extended to account for multiple promoter states (figure 6); see [62]. Choosing a promoter parametrization for both components of the switch inspired by equation (2.1), we arrive at the chemical reaction system

Schematic of the toggle-switch model. The model describes two protein concentrations A and B and the effects of separate promoters for each. These can be in an active or an inactive state (on/off). Both proteins can be degraded. The promoter of each protein can be inactivated by the presence of the respective other protein.

[eqn]

where $[eqn]$ represents promoter ON states for proteins A or B, respectively, and $[eqn]$ represents the promoter OFF states. The bursting parameter $[eqn]$ regulates the burst frequency and duration in the ON states of both promoters. Each promoter experiences the repression by the respective other protein, with the same repression threshold $[eqn]$ and the same Hill coefficient $[eqn]$ . Both proteins are degraded with degradation rate $[eqn]$ , and in the respective ON states of the promoter, the proteins are produced by the transcription rate $[eqn]$ . The auxiliary variable $[eqn]$ is equal to one if the A promoter is in the ON state $[eqn]$ and equal to zero if the promoter is in the OFF state $[eqn]$ . Similarly, $[eqn]$ takes the values 1 and 0 depending on the state of the B promoter. We have chosen both proteins to share the same parameters for production, degradation and repression for simplicity. The concentrations A and B, are given by $[eqn]$ and $[eqn]$ , respectively, where $[eqn]$ and $[eqn]$ are the number of $[eqn]$ and $[eqn]$ molecules and $[eqn]$ refers to the nuclear volume. For convenience, we again write $[eqn]$ , with $[eqn]$ determining the volume of a reference nucleus. Concentrations are again measured in cu, referring to one molecule per volume $[eqn]$ .

Intriguingly, this system can exhibit transitions between two steady states that are enabled by bursting noise. Specifically, for the parameter combination of $[eqn]$ cu, $[eqn]$ , $[eqn]$ min $[eqn]$ , $[eqn]$ cu min^−1^ and $[eqn]$ , we see bistability for $[eqn]$ min $[eqn]$ . Starting from an initial condition of the concentrations $[eqn]$ cu and $[eqn]$ cu, we find that the concentration of $[eqn]$ molecules approaches a high steady state, fluctuating around the value of approximately 9 cu (figure 7A), whereas the concentration of A fluctuations around a lower value of approximately 1 cu. Swapped stationary concentrations for $[eqn]$ and $[eqn]$ occur in different realizations of the process. Using this same parameter combination but with $[eqn]$ min $[eqn]$ , stochastic transitions can be observed, where each of the concentrations $[eqn]$ and $[eqn]$ is low during some periods and high during others (figure 7B). The parameter combinations used in figure 7A,B thus identify a regime in which promoter bursting is functionally important. Therefore we test our approximation by chemical Langevin equations for this parameter combination.

Noise-induced transitions in the toggle-switch model. (A) A representative simulation of the toggle-switch model in equation (3.13) using the parameter combination P0=3 cu, h=2, μ=0.1 min⁣−1, α=1 cu min−1, λ=100 min⁣−1 and Ω=100, simulated using the Gillespie stochastic simulation algorithm [28]. (B) A representative simulation of the toggle-switch model equation (3.13) using the same parameters as in (A), except that now λ=1 min⁣−1. (C) A representative simulation of the chemical Langevin equation for the toggle-switch model, equation (3.14), using the same parameters as in (A), simulated using an Euler–Maruyama scheme and a timestep of 0.01 min. The dynamics are similar to (A), except that the stationary concentration profiles of A and B are swapped. (D) A representative simulation of the chemical Langevin equation for the toggle-switch model, equation (3.14), using the same parameters as in (B).

The chemical Langevin approximation of this system emerges as

[eqn]

with Gaussian white noise variables $[eqn]$ and $[eqn]$ , each of mean zero, and with

[eqn]

Indeed, we find that the chemical Langevin equations approximate the dynamics of the toggle-switch well. Protein concentrations simulated by the chemical Langevin equation (3.14) fluctuate on a timescale comparable to those simulated by the Gillespie algorithm, and have comparable amplitude (compare figure 7C,D with 7A,B, respectively). These simulations by the chemical Langevin equation also capture the noise-induced transitions between the two states of the model when comparing the cases $[eqn]$ min $[eqn]$ and $[eqn]$ min $[eqn]$ (figure 7C,D).

Importantly, the stationary distribution obtained from Gillespie simulations of the full model equation (3.13) is well approximated by the chemical Langevin equation (3.14) (figure 8A). The means and standard deviations of both distributions are within <0.5% of each other. Similarly, the distribution of waiting times in each of the stable states is well approximated by the chemical Langevin equation (figure 8B). The means and standard deviations of the waiting time distribution sampled from the chemical Langevin equation differ by less than 10% from the distribution generated using the Gillespie stochastic simulation algorithm (for approx. 11 000 sampled waiting times).

Chemical Langevin equations approximate the toggle-switch dynamics well. (A) Stationary distribution of the concentration of A molecules for simulations of the Gillespie stochastic simulation algorithm in equation (3.13), and the chemical Langevin equation (3.14). (B) Distribution of waiting times to switch from the configuration (A high, B low), to (A low, B high), for both models. For both panels, the same parameter combination as in figure 7B was used. Both panels analyse simulation output from one simulation of each model that was run for a duration of T=107 min. The chemical Langevin equation was simulated using an Euler–Maruyama scheme with a timestep of 1 min. To calculate waiting times, each time series was smoothened by a sliding average of 1000 min. Then, the length of any time window was recorded in which the smoothened time series of the A concentration was greater than 4 cu, and where the smoothened concentration of B was less than 4 cu. The histogram is generated from all waiting times recorded in this way.

In summary, our chemical Langevin approximation is applicable and accurate for other gene regulatory networks, following the theory provided in electronic supplementary material, §S1.

Summary and discussion

The importance of noise in gene regulatory networks is widely recognized [9,20,32]. Here, we investigated how noise due to transcriptional bursting affects the dynamics of the broadly studied auto-negative feedback motif and found that bursting can induce oscillations when they are not otherwise present and enhance oscillations that may already be present in the system. We provided computationally efficient approximations of the system dynamics and illustrated the utility of our approximation through analytical derivations of the power spectrum of the observed oscillations. Simulations of the chemical Langevin equation are multiple orders of magnitude faster than those of the full system. Our calculations confirm mathematically that transcriptional bursting alone can induce oscillations and that transcriptional bursting can enhance the amplitude of oscillations if these are already present due to copy-number noise. We investigated the breaking point of our extended chemical Langevin equations and related it to biological measurements of transcriptional burst frequencies and showed how our theory can be extended to other gene regulatory networks.

Our models considered a single active promoter for each gene. Our approach is nonetheless applicable to systems with more active promoters, i.e. genes with more than one active allele in haploid or even multiploid cells [33,63,64]. This would be implemented by assuming more transcriptional states, representing the different combinations of promotor activity on each allele. We further highlight that the observed oscillations in our model are only visible in the stationary regime, i.e. when a gene is constitutively expressed, and cannot be used to describe genes that recently turned on, for example.

Varying definitions of burstiness exist in the literature. The concept was originally introduced to describe mRNA production that is discontinuous in time [20,65]. This is the phenomenon we seek to study here. Other authors exclusively consider the gene to undergo transcriptional bursting if the OFF periods of the promoter are much longer than the ON periods [21,66,67]. Our approximations are applicable under both of these definitions of burstiness, i.e. they can describe systems where OFF periods are much longer than ON periods, but this is not a necessary restriction for the validity of our methods. We are instead focused on the consequences of the random switching between the OFF and the ON states of the promoter, rather than their relative duration.

In the language of transcriptional bursting, our bursting parameter $[eqn]$ affects the duration of ON periods and of OFF periods of the promoter and, therefore, the bursting initiation rate as well as the burst size, i.e. the number of transcripts per transcriptional burst. This dependency is chosen such that, when varying the bursting parameter $[eqn]$ , we vary the timescale of bursting, but not the expected relative durations of ON and OFF periods. This joint dependency of both rates on $[eqn]$ is necessary to ensure that our changes in promoter dynamics do not affect the average transcription rate of the promoter, which is known to impact the oscillatory properties of the model [45].

Following these design choices, the limits $[eqn]$ and $[eqn]$ allow us to construct situations in which either copy-number noise or promotor-bursting noise are ‘turned off’. Being able to do this is a distinct advantage of the model, as it means we can investigate the consequences of each type of noise individually. Identifying protocols to perform analogous investigations experimentally would be challenging and may not be possible with existing technologies. For example, to change the system-size parameter, the experiment of interest would require (i) an increased production of mRNA, (ii) a decreased sensitivity of the promoter to the number of protein molecules and (iii) maintaining an identical burst frequency while doing so. Increased production of mRNA (i) may be achieved through upstream regulation by a relevant transcription factor. However, that may also affect bursting kinetics. Changing the sensitivity to protein may be achieved through binding of protein molecules via dimerization or modifying concentrations of a transcriptional co-regulator. Both of these would need to be done in a way that does not affect the bursting kinetics. Similarly, changing $[eqn]$ would require changing the bursting kinetics in a way that affects both the burst frequency as well as the burst size, which would possibly require a modification of the promoter or enhancer sequences, as well as a change in upstream regulation. In summary, the model allows us to achieve understanding that would be difficult to obtain by experiments alone.

Our investigation continues a long line of research on the delayed auto-negative feedback motif [8,12,17,26,45–47,55,68–77]. Among this work, [68] pointed out that the model presented in equation (2.12) is insufficient to explain the full amplitude of biologically observed oscillations, and they hence argued that alterations to the model, such as the consideration of protein dimerization or periodic forcing, are necessary to alleviate this discrepancy. Our results suggest that, instead, the presence of transcriptional bursting may be sufficient to explain an increase in oscillation amplitude, a result in line with previous findings on a bursting model with negative feedback [78]. We expect the auto-negative feedback motif to remain relevant as auto-repression is abundant in gene regulatory networks, suggesting the existence of yet unknown oscillators [79].

Our work contributes to existing efforts to efficiently simulate the dynamics of gene regulatory networks, such as the methods presented by [80]. Simulating chemical Langevin equations is generally much less demanding in terms of computing time than the simulation of the full reaction system from which the chemical Langevin equation is derived. Where our approximation is applicable, this means that equations such as equation (3.1) are a viable approach to studying the role of transcriptional bursting in natural gene regulatory networks. For example, high-throughput simulations of the chemical Langevin model can be used to infer model parameters from experimental data [69,81].

When deriving our extended chemical Langevin equation in electronic supplementary material, §S1, we introduce the promoter states as environmental states in a chemical reaction system. The response of chemical reaction systems to rapidly switching environments in this way was previously derived in [51,82]. This work also included levels of approximation for large copy numbers or fast environmental switching rates, respectively. In these studies, the authors used a Kramers–Moyal expansion to arrive at an extended chemical Langevin equation. Our analysis expands this previous work by providing an alternative derivation of the extended chemical Langevin equation in the style of [29]. This derivation is advantageous to us over the Kramers–Moyal expansion, as it allows a natural inclusion of delayed terms. Without this inclusion of delayed terms, our discussion of the delayed auto-negative feedback motif would not have been possible.

Our work opens up possibilities in a wider range of areas. Copy-number fluctuations are known to generate oscillations not only in gene regulatory circuits but also in other biochemical systems, in models of epidemic spread and in evolutionary game theory [16,83,84]. Such stochasticity, sometimes called ‘demographic noise’ in other areas, can also provoke spatial patterns or waves [85–87]. It is conceivable that these phenomena can also be induced by extrinsic noise or be enhanced by it. Further work is required to investigate this possibility.

In the context of gene regulatory networks, we hope that our discussion of the extended chemical Langevin equation, as well as the accompanying approximations for large or finite burst frequencies or system sizes, will facilitate the wider study of gene regulation and cell fate.

Bibliography89

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1Davidson EH. 2010 Emerging properties of animal gene regulatory networks. Nature 468, 911–920. (10.1038/nature 09645)21164479 PMC 3967874 · doi ↗ · pubmed ↗
2Patke A, Young MW, Axelrod S. 2020 Molecular mechanisms and physiological importance of circadian rhythms. Nat. Rev. Mol. Cell Biol. 21, 67–84. (10.1038/s 41580-019-0179-2)31768006 · doi ↗ · pubmed ↗
3Foster RG. 2020 Sleep, circadian rhythms and health. Interface Focus 10, 20190098. (10.1098/rsfs.2019.0098)32382406 PMC 7202392 · doi ↗ · pubmed ↗
4Papagiannakis A, Niebel B, Wit EC, Heinemann M. 2017 Autonomous metabolic oscillations robustly gate the early and late cell cycle. Mol. Cell 65, 285–295. (10.1016/j.molcel.2016.11.018)27989441 · doi ↗ · pubmed ↗
5Kalsbeek A, Fliers E. 2013 Daily regulation of hormone profiles. In Circadian clocks handbook of experimental pharmacology (eds A Kramer, M Merrow), pp. 185–226. Berlin, German: Springer. (10.1007/978-3-642-25950-0_8)23604480 · doi ↗ · pubmed ↗
6Miao Y, Pourquié O. 2024 Cellular and molecular control of vertebrate somitogenesis. Nat. Rev. Mol. Cell Biol. 25, 517–533. (10.1038/s 41580-024-00709-z)38418851 PMC 11694818 · doi ↗ · pubmed ↗
7Kageyama R, Niwa Y, Shimojo H, Kobayashi T, Ohtsuka T. 2010 Ultradian oscillations in Notch signaling regulate dynamic biological events. In Current topics in developmental biology notch signaling, pp. 311–331. San Diego, CA, USA: Academic Publishing. (10.1016/s 0070-2153(10)92010-3)20816400 · doi ↗ · pubmed ↗
8Phillips NE et al. 2016 Stochasticity in the mi R-9/Hes 1 oscillatory network can account for clonal heterogeneity in the timing of differentiation. e Life 5, e 16118. (10.7554/elife.16118)27700985 PMC 5050025 · doi ↗ · pubmed ↗