Optimal information transfer in enzymatic networks: A field theoretic formulation

Himadri S. Samanta; Michael Hinczewski; D. Thirumalai

arXiv:1704.07013·cond-mat.stat-mech·July 19, 2017

TL;DR

This paper develops a field-theoretic approach to quantify and optimize signal transmission accuracy in enzymatic networks, revealing how noise affects information flow and identifying conditions for minimal error.

Contribution

The authors introduce a general field-theoretic framework for analyzing stochastic enzymatic signaling networks, recovering known results and extending to complex cascades with optimal delay conditions.

Findings

01. Exact error calculations for simple push-pull networks.

02. Optimal information transfer occurs with a specific time delay.

03. Second order corrections improve agreement with simulations.

Abstract

Signaling in enzymatic networks is typically triggered by environmental fluctuations, resulting in a series of stochastic chemical reactions, leading to corruption of the signal by noise. For example, information flow is initiated by binding of extracellular ligands to receptors, which is transmitted through a cascade involving kinase-phosphatase stochastic chemical reactions. For a class of such networks, we develop a general field-theoretic approach in order to calculate the error in signal transmission as a function of an appropriate control variable. Application of the theory to a simple push-pull network, a module in the kinase-phosphatase cascade, recovers the exact results for error in signal transmission previously obtained using umbral calculus (Phys. Rev. X 4, 041017 (2014)). We illustrate the generality of the theory by studying the minimal errors in noise reduction…

Equations

$$\frac{dI}{dt}=F-\gamma_{I}I+\eta_{I},\qquad\frac{dO}{dt}=R(I)-\gamma_{O}O+\eta_{O},$$

$$\delta I(t)$$

$$\delta O(t)$$

$$C_{cs}(t)=\int_{-\infty}^{t}dt^{\prime}\,H_{WK}(t-t^{\prime})\,C_{cc}(t^{\prime}),\qquad t>0$$

$$\gamma_{O}=\gamma_{I}\sqrt{1+\Lambda},\qquad G=\frac{R_{1}}{\gamma_{I}(\sqrt{1+\Lambda}-1)},$$

$$E_{WK}=\frac{2}{1+\sqrt{1+\Lambda}},\qquad\Lambda\equiv\frac{R_{1}^{2}}{R_{0}\gamma_{I}}.$$

$$R(I)=\sum_{n=0}^{\infty}\sigma_{n}v_{n}(I),$$

$$E=1-\frac{\bar{I}\gamma_{O}^{2}\sigma_{1}^{2}}{(\gamma_{I}+\gamma_{O})^{2}}\left[\gamma_{O}\sigma_{0}+\sum_{n=1}^{\infty}\sigma_{n}^{2}\frac{n!\,\gamma_{O}\bar{I}^{n}}{\gamma_{O}+n\gamma_{I}}\right]^{-1}.$$

$$E\geq E_{opt}\equiv\frac{2}{1+\sqrt{1+\tilde{\Lambda}}},$$

$$\frac{dP(I_{i},O_{i},t)}{dt}$$

$$|\psi(t)\rangle=\sum_{\{I_{i}\},\{O_{j}\}}P(\{I_{i}\},\{O_{j}\},t)\prod_{\{i,j\}}(a_{i}^{\dagger})^{I_{i}}(b_{i}^{\dagger})^{O_{j}}|0\rangle.$$

$$\frac{d}{dt}|\psi(t)\rangle=-H(\{a\},\{a^{\dagger}\};\{b\},\{b^{\dagger}\})|\psi(t)\rangle.$$

$$|\psi(t)\rangle=\sum_{I_{i},O_{i}}P(I_{i},O_{i},t)\,(a_{i}^{\dagger})^{I_{i}}(b_{i}^{\dagger})^{O_{i}}|0\rangle,$$

$$\gamma_{I}\sum_{I_{i},O_{i}}\left[(I_{i}+1)P(I_{i}+1,O_{i})-I_{i}P(I_{i},O_{i})\right](a_{i}^{\dagger})^{I_{i}}(b_{i}^{\dagger})^{O_{i}}|0\rangle$$

$$=\gamma_{I}\sum_{I_{i},O_{i}}\left[P(I_{i}+1,O_{i})\,a_{i}(a_{i}^{\dagger})^{I_{i}+1}(b_{i}^{\dagger})^{O_{i}}|0\rangle-P(I_{i},O_{i})\,a_{i}^{\dagger}a_{i}(a_{i}^{\dagger})^{I_{i}}(b_{i}^{\dagger})^{O_{i}}|0\rangle\right].$$

$$H$$

$$P(\{I_{i}\},\{O_{i}\};0)=\prod_{i}P_{0}(I_{i})P_{0}(O_{i})=\prod_{i}e^{-\bar{I}_{0}}e^{-\bar{O}_{0}}\,\frac{\bar{O}_{0}^{O_{i}}\,\bar{I}_{0}^{I_{i}}}{I_{i}!\,O_{i}!}.$$

$$|\psi(t)\rangle=e^{-Ht}|\psi(0)\rangle,$$

$$\langle A(t)\rangle=\sum_{\{I_{i}\},\{O_{i}\}}A(\{I_{i}\},\{O_{i}\})\,P(\{I_{i}\},\{O_{i}\};t),$$

$$\langle A(t)\rangle$$

$$\langle A(t)\rangle\propto\int\prod_{i}d\alpha_{i}\,d\alpha_{i}^{*}\,d\beta_{i}\,d\beta_{i}^{*}\,A(\{\alpha_{i}\},\{\beta_{i}\})\,e^{-S[\alpha_{i}^{*},\beta_{i}^{*},\alpha_{i},\beta_{i}]}.$$

$$S[\alpha_{i}^{*},\beta_{i}^{*},\alpha_{i},\beta_{i}]=\sum_{i}\int_{0}^{t_{f}}\left[\left\{\alpha_{i}^{*}(t)\frac{\partial\alpha_{i}(t)}{\partial t}+\beta_{i}^{*}(t)\frac{\partial\beta_{i}(t)}{\partial t}\right\}+H(\alpha_{i}^{*},\beta_{i}^{*},\alpha_{i},\beta_{i})\right]dt.$$

$$\langle A(t)\rangle\propto\int\prod_{i}\mathcal{D}[\phi^{*},\phi,\psi^{*},\psi]\,A(\{\phi\},\{\psi\})\,e^{-S[\psi^{*},\phi^{*},\psi,\phi]}$$

$$S[\psi^{*},\phi^{*},\psi,\phi]=\int_{0}^{t_{f}}\left[\left\{\psi^{*}(t)\frac{\partial\psi(t)}{\partial t}+\phi^{*}(t)\frac{\partial\phi(t)}{\partial t}\right\}+H(\psi^{*},\phi^{*},\psi,\phi)\right]dt.$$

$$H=-\gamma_{I}\left(-\bar{\phi}_{I}+\frac{\bar{\phi}_{I}^{2}}{2}\right)\phi_{I}-F\left(\bar{\phi}_{I}+\frac{\bar{\phi}_{I}^{2}}{2}\right)-\gamma_{O}\left(-\bar{\psi}_{O}+\frac{\bar{\psi}_{O}^{2}}{2}\right)\psi_{O}-R(\phi_{I})\left(\bar{\psi}_{O}+\frac{\bar{\psi}_{O}^{2}}{2}\right).$$

$$H$$

$$R(\phi_{I})=\sum_{n=0}^{\infty}\frac{c_{n}}{n!}(\delta\phi_{I})^{n},$$

$$S[\tilde{\Psi},\Psi]$$

$$\langle\Psi\tilde{\Psi}\rangle=\frac{\int\mathcal{D}[i\tilde{\Psi}]\int\mathcal{D}[\Psi]\,\Psi\tilde{\Psi}\,e^{-S[\tilde{\Psi},\Psi]}}{\int\mathcal{D}[i\tilde{\Psi}]\int\mathcal{D}[\Psi]\,e^{-S[\tilde{\Psi},\Psi]}}.$$

$$Z[\tilde{J},J]=\left\langle\exp\int_{t}\sum_{\alpha}\left(\tilde{J}_{\alpha}(t)\tilde{\Psi}_{\alpha}(t)+J_{\alpha}(t)\Psi_{\alpha}(t)\right)\right\rangle$$

$$S_{0}[\tilde{\Psi},\Psi]=\int_{w}\sum_{\alpha}\begin{pmatrix}\tilde{\Psi}_{\alpha}(-w)&\Psi_{\alpha}(-w)\end{pmatrix}\mathcal{M}\begin{pmatrix}\tilde{\Psi}_{\alpha}(w)\\ \Psi_{\alpha}(w)\end{pmatrix}$$

Full text

Optimal information transfer in enzymatic networks: A field theoretic formulation

Himadri S. Samanta

Department of Chemistry, The University of Texas at Austin, TX 78712

Michael Hinczewski

Department of Physics, Case Western Reserve University, OH 44106

D. Thirumalai

Department of Chemistry, The University of Texas at Austin, TX 78712

Abstract

Signaling in enzymatic networks is typically triggered by environmental fluctuations, resulting in a series of stochastic chemical reactions that corrupt the signal with noise. For example, information flow is initiated by binding of extracellular ligands to receptors, and is transmitted through a cascade of stochastic kinase-phosphatase reactions. For a class of such networks, we develop a general field-theoretic approach in order to calculate the error in signal transmission as a function of an appropriate control variable. Application of the theory to a simple push-pull network, a module in the kinase-phosphatase cascade, recovers the exact results for error in signal transmission previously obtained using umbral calculus (Phys. Rev. X 4, 041017 (2014)). We illustrate the generality of the theory by studying the minimal errors in noise reduction in a reaction cascade with two connected push-pull modules. Such a cascade behaves as an effective three-species network with a pseudo intermediate. In this case, optimal information transfer, resulting in the smallest square of the error between the input and output, occurs with a time delay, which is given by the inverse of the decay rate of the pseudo intermediate. Surprisingly, in these examples the minimum error computed using simulations that take non-linearities and the discrete nature of molecules into account coincides with the predictions of a linear theory. In contrast, there are substantial deviations between simulations and predictions of the linear theory for the error in signal propagation in an enzymatic push-pull network for a certain range of parameters. Including second order perturbative corrections reduces the differences between simulations and theoretical predictions. Our study establishes that a field theoretic formulation of stochastic biological signaling offers a systematic way to understand error propagation in networks of arbitrary complexity.

I Introduction:

Cell signaling involves the ability of cells to detect changes in the environment and respond to them Goldbeter81PNAS ; Thattai01PNAS ; Thattai02BJ ; Eldar:2010aa ; Arjun08Cell ; Maheshri:2007aa , a fundamental necessity of living systems. Several signaling networks involve proteins that switch between active and inactive states. By quantitatively describing how different signaling proteins are functionally linked, we can understand the behavior of signaling pathways and the associated bandwidth that determines the fidelity of information transfer Bowsher:2013aa . In typical enzymatic networks, environmental information is transmitted into the cell interior through cascades of stochastic biochemical reactions cai08Nature . Noise inevitably propagates through the cascade, potentially corrupting the signal. Depending on the parameters, small changes in the input can be translated into large (but noise-corrupted) output variations. The amplification is essential, but it must also preserve the signal content to be useful for downstream processes. The signaling circuit, despite operating in a noisy environment, needs to maintain high fidelity between the output and the amplified input Becker15PRL . Over the years, concepts from information theory have been adopted to assess the fidelity of signal transmission in biochemical networks Lestas10Nature ; Bode50PIRE . Several studies have used the mutual information between input and output signals to quantify the reliability of signal transduction deRonde10PRE ; Ziv07PlosOne ; Tkacik08PRE ; Walczak09PNAS ; Mehta09MolCellBiol ; Mugler09PRE . The formalism has been applied to a variety of networks, including cascades and networks with feedback deRonde10PRE . These and other studies have expanded our understanding of the fidelity of information transfer in biological networks in which both noise and copy number fluctuations are important.

In a recent paper MH14PRX , we considered the problem of how to extract information faithfully from noisy signals using mathematical methods of communication theory developed over sixty years ago by Wiener Wiener49 and independently by Kolmogorov Kolmogorov41 . The Wiener-Kolmogorov (WK) approach has since proven a useful tool in a variety of contexts in biological signaling MH16JPCB ; Becker15PRL ; Hathcock16 . The WK theory, reformulated by Bode and Shannon Bode50PIRE , assumes that the input and output are continuous variables that describe stationary stochastic processes. The goal of the approach is to minimize the mean squared error between the input and output signals, but the optimization is restricted to the space of linear noise filters. Recently, we developed an analytic formalism of general validity, based on exact techniques involving umbral calculus Roman05 , to overcome some of the limitations of the WK theory. We illustrated the efficacy of the non-linear theory with applications to the push-pull network and its variants, including instances in which the input is time-dependent.

The use of non-standard mathematics in the form of umbral calculus perhaps obscures the physics of optimal filtering in biological networks in which the effects of non-linearities in signal amplification have to be considered. Here, we develop an alternative general formalism based on the many body formulation of reaction diffusion equations introduced by Doi and Peliti doi76JPA ; peliti85JP . This formulation converts the signal optimization problem to a standard field theory, allowing us to calculate the response and correlation functions by standard methods. The advantage of this formalism is that both discrete and continuum cases can be studied easily. Non-linear contributions can be obtained using a systematic diagrammatic perturbation scheme for an arbitrary network. Networks where temporal dynamics are coupled with spatial gradients in signaling activities, which regulate intracellular processes and signal propagation across the cell, can also be investigated using the present formalism. Application of the theory to a push-pull network and a simplified biochemical network recovers the exact results obtained in our previous study. We also extend the formalism to solve signal transduction in a cascade, which serves as a model for a variety of biological networks. The formalism is general and is applicable to arbitrary networks with feedback, time delay and spatial variations Silva:aa . Our work exploits standard methods in physics, illustrating the usefulness of a field theoretic formulation at the interface of communication theory and biology.

II Theory:

Linear Push-Pull Network:

In order to develop the many body formalism for a general signaling network, we first consider a simple model. The concepts and the general diagrammatic expansion developed in this context lay the foundation for applications to more complicated enzymatic networks as well as signaling cascades. In a typical signaling pathway, for example the mitogen activated protein kinase (MAPK) Schoeberl02Nature ; Hersen08PNAS pathway, external and environmental fluctuations activate a cascade of enzymatic reactions, thus transmitting information across the membrane in a sequential manner. Each step involves activation of kinases by phosphorylation and deactivation by phosphatases Levine:aa ; Stadtman:1977aa ; Detwiler:aa ; Heinrich:aa ; Kolch:2015aa . A truncated version of such a cascade is a single step (Fig.1), which we refer to as a push-pull network MH14PRX . In this signaling network, there are only two chemical species. One is $I(t)$ (the "input") and the other is $O(t)$ (the "output"), whose production depends on $I(t)$ . The upstream pathway, which serves as an external signal, creates the species $I$ by the reaction $\phi\xrightarrow[]{F}I$ with an effective production rate $F$ . The output $O$ is a result of the reaction $I\xrightarrow[]{R(I)}I+O$ , with a rate $R(I(t))$ that depends on the input. The species are deactivated through $I\xrightarrow[]{\gamma_{I}}\phi$ and $O\xrightarrow[]{\gamma_{O}}\phi$ with rates $\gamma_{I}$ and $\gamma_{O}$ respectively, mimicking the role of phosphatases (Fig.1). The input varies over a characteristic time scale $\gamma_{I}^{-1}$ , fluctuating around the mean value $\bar{I}=F/\gamma_{I}$ . The degradation rate $\gamma_{O}$ sets the time scale $\gamma_{O}^{-1}$ over which $O(t)$ responds to changes in the input.
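The four reactions listed above can be sampled directly with a Gillespie-type algorithm. The following is a minimal sketch for illustration only, not the simulation code used in this work; the function name `simulate_push_pull`, the callable `rate_R`, and the zero-molecule initial condition are assumptions introduced here.

```python
import random

def simulate_push_pull(F, gamma_I, gamma_O, rate_R, t_end, seed=0):
    """Gillespie sketch of the push-pull reactions (hypothetical helper).

    Reactions: 0 -> I (rate F), I -> 0 (gamma_I * I),
               I -> I + O (rate_R(I)), O -> 0 (gamma_O * O).
    Returns the time-averaged copy numbers (mean_I, mean_O).
    """
    rng = random.Random(seed)
    t, I, O = 0.0, 0, 0              # assumed start: no molecules present
    sum_I = sum_O = 0.0
    while True:
        rates = (F, gamma_I * I, rate_R(I), gamma_O * O)
        total = sum(rates)           # positive as long as F > 0
        dt = rng.expovariate(total)  # waiting time to the next reaction
        if t + dt >= t_end:
            sum_I += I * (t_end - t)
            sum_O += O * (t_end - t)
            break
        sum_I += I * dt
        sum_O += O * dt
        t += dt
        r = rng.random() * total     # pick a reaction with probability rate/total
        if r < rates[0]:
            I += 1
        elif r < rates[0] + rates[1]:
            I -= 1
        elif r < rates[0] + rates[1] + rates[2]:
            O += 1
        else:
            O -= 1
    return sum_I / t_end, sum_O / t_end
```

For a linear rate `rate_R = lambda I: R0 * I`, the time averages should approach $\bar{I}=F/\gamma_{I}$ and $\bar{O}=R_{0}\bar{I}/\gamma_{O}$ for long runs, up to statistical error.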

The chemical Langevin equations describing the changes in $I$ and $O$ are,

$$\frac{dI}{dt}=F-\gamma_{I}I+\eta_{I},\qquad\frac{dO}{dt}=R(I)-\gamma_{O}O+\eta_{O},\tag{1}$$

where $\eta_{I}$ and $\eta_{O}$ are Gaussian white noise terms with zero mean ( $\langle\eta_{\alpha}\rangle=0$ ) and correlation $\langle\eta_{\alpha}(t)\eta_{\alpha^{\prime}}(t^{\prime})\rangle=2\gamma_{\alpha}\bar{\alpha}\,\delta_{\alpha\alpha^{\prime}}\delta(t-t^{\prime})$ , where $\alpha=I,O$ and $\bar{\alpha}$ is the mean population of species $\alpha$ . For small fluctuations, $\delta\alpha(t)=\alpha(t)-\bar{\alpha}$ , Eq.(1) can be solved using a linear approximation for the rate function, $R(I(t))\approx R_{0}\bar{I}+R_{1}\delta I(t)$ , with coefficients $R_{0},R_{1}>0$ . The result is

$$\begin{aligned}\delta I(t)&=\int_{-\infty}^{t}dt^{\prime}\,e^{-\gamma_{I}(t-t^{\prime})}\eta_{I}(t^{\prime}),\\ \delta O(t)&=\int_{-\infty}^{t}dt^{\prime}\,e^{-\gamma_{O}(t-t^{\prime})}\left[R_{1}\delta I(t^{\prime})+\eta_{O}(t^{\prime})\right]\\ &=\int_{-\infty}^{t}dt^{\prime}\,\frac{R_{1}}{G}e^{-\gamma_{O}(t-t^{\prime})}\left[G\delta I(t^{\prime})+\frac{G}{R_{1}}\eta_{O}(t^{\prime})\right],\end{aligned}\tag{2}$$

where in the last line an arbitrary scaling factor $G$ has been introduced. The solution for $\delta O(t)$ has the structure of a linear noise filter equation, $\tilde{s}(t)=\int_{-\infty}^{t}dt^{\prime}H(t-t^{\prime})c(t^{\prime})$ , with $c(t)=s(t)+n(t)$ . The signal $s(t)=G\delta I(t)$ together with the noise term $n(t)\equiv GR_{1}^{-1}\eta_{O}(t)$ constitute the corrupted signal, $c(t)$ . The output $\tilde{s}(t)\equiv\delta O(t)$ is produced by convolving $c(t)$ with a linear kernel $H(t)\equiv R_{1}G^{-1}\exp(-\gamma_{O}t)$ , which filters the noise. As a consequence of causality, the filtered output $\tilde{s}$ at time $t$ depends only on $c(t^{\prime})$ from the past.

The primary goal in transmitting a signal with high fidelity is to devise an optimal causal filter, $H_{opt}(t)$ , which renders $\tilde{s}(t)$ as close to $s(t)$ as possible. In a remarkable development, Wiener Wiener49 and Kolmogorov Kolmogorov41 independently discovered a solution to this problem in the context of communication theory, which launched the modern era in signal decoding from time series. In particular, WK proposed a solution that minimizes the square of the difference between $\tilde{s}(t)$ and $s(t)$ by seeking an optimal filter $H_{WK}(t)$ among all possible linear filters. In the push-pull network, this means having $\delta O(t)$ reproduce as accurately as possible the scaled input signal $G\delta I(t)$ . For a particular $\delta I(t)$ and $\delta O(t)$ , the value of the relative mean squared error $E=\langle(\tilde{s}-s)^{2}\rangle/\langle s^{2}\rangle$ is smallest when $G=\langle(\delta O)^{2}\rangle/\langle\delta O\delta I\rangle$ , which we identify as a gain factor. In this case, $E=1-\langle\delta O\delta I\rangle^{2}/(\langle(\delta O)^{2}\rangle\langle(\delta I)^{2}\rangle)$ .
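The gain and error formulas in the paragraph above can be estimated from sampled fluctuations. The sketch below is a hedged illustration: the helper name `gain_and_error` and the use of plain sample moments over equally weighted samples are assumptions, not part of the original analysis.

```python
def gain_and_error(I_samples, O_samples):
    """Estimate G = <dO^2>/<dO dI> and E = 1 - <dO dI>^2 / (<dO^2><dI^2>)
    from paired samples of I and O (hypothetical helper)."""
    n = len(I_samples)
    mean_I = sum(I_samples) / n
    mean_O = sum(O_samples) / n
    dI = [x - mean_I for x in I_samples]   # fluctuations about the mean
    dO = [y - mean_O for y in O_samples]
    var_I = sum(x * x for x in dI) / n
    var_O = sum(y * y for y in dO) / n
    cov = sum(x * y for x, y in zip(dI, dO)) / n
    gain = var_O / cov
    error = 1.0 - cov ** 2 / (var_I * var_O)
    return gain, error
```

When the output is an exact rescaling of the input, the estimated error vanishes and the gain equals the scaling factor, which provides a quick sanity check.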

The optimal causal filter $H_{WK}$ satisfies the following Wiener-Hopf equation MH14PRX ; Becker15PRL ,

$$C_{cs}(t)=\int_{-\infty}^{t}dt^{\prime}\,H_{WK}(t-t^{\prime})\,C_{cc}(t^{\prime}),\qquad t>0,\tag{3}$$

where $C_{xy}(t)\equiv\langle x(t^{\prime})y(t^{\prime}+t)\rangle$ is the correlation between points in the time series $x$ and $y$ , assumed to depend only on the time difference $t$ . We can evaluate the correlation functions $C_{cs}$ and $C_{cc}$ using Eq. (2). Substituting these into Eq. (3), the optimal filter function can be found by assuming a generic ansatz, $H_{WK}(t)=\sum_{i=1}^{N}A_{i}\exp(-\lambda_{i}t)$ . The unknown coefficients, $A_{i}$ , and the associated rate constants $\lambda_{i}$ are found by comparing the left and right hand sides of Eq. (3). Elsewhere MH14PRX , we showed that $H_{WK}(t)=\gamma_{I}(\sqrt{1+\Lambda}-1)\exp(-\gamma_{I}\sqrt{1+\Lambda}\,t)$ . The conditions for achieving WK optimality, $H(t)=H_{WK}(t)$ , are MH14PRX ,

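$$\gamma_{O}=\gamma_{I}\sqrt{1+\Lambda},\qquad G=\frac{R_{1}}{\gamma_{I}(\sqrt{1+\Lambda}-1)},\tag{4}$$

leading to the minimum relative error,

$$E_{WK}=\frac{2}{1+\sqrt{1+\Lambda}},\qquad\Lambda\equiv\frac{R_{1}^{2}}{R_{0}\gamma_{I}}.\tag{5}$$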

The fidelity between the output and input is described through a single dimensionless optimality control parameter, $\Lambda$ , which can be written as $\Lambda\equiv(R_{0}/\gamma_{I})(R_{1}/R_{0})^{2}$ . The first term, $R_{0}/\gamma_{I}$ , is a burst factor, measuring the mean number of output molecules produced per input molecule during the active lifetime of the input molecule. The second term, $(R_{1}/R_{0})^{2}$ , is a sensitivity factor, reflecting the local response of the production function $R(I)$ near $\bar{I}$ (controlled by the slope $R_{1}=R^{\prime}(\bar{I})$ ) relative to the production rate per input molecule $R_{0}=R(\bar{I})/\bar{I}$ .
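As a small worked example of how the control parameter sets the optimum, the sketch below evaluates $\Lambda$ , the WK-optimal $\gamma_{O}$ and $G$ , and $E_{WK}$ from $R_{0}$ , $R_{1}$ and $\gamma_{I}$ . The function name `wk_optimum` and the dictionary keys are assumptions made for illustration.

```python
import math

def wk_optimum(R0, R1, gamma_I):
    """WK-optimal parameters of the linear push-pull network (illustrative helper)."""
    Lam = (R0 / gamma_I) * (R1 / R0) ** 2      # burst factor times sensitivity factor
    root = math.sqrt(1.0 + Lam)
    return {
        "Lambda": Lam,
        "gamma_O": gamma_I * root,             # optimal output degradation rate
        "G": R1 / (gamma_I * (root - 1.0)),    # optimal gain
        "E_WK": 2.0 / (1.0 + root),            # minimum relative error
    }
```

For instance, `wk_optimum(1.0, 3.0, 3.0)` corresponds to $\Lambda=3$ , for which $\sqrt{1+\Lambda}=2$ and the minimum relative error is $2/3$ . Since $E_{WK}$ decreases monotonically with $\Lambda$ , a larger burst or sensitivity factor always lowers the attainable error.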

In our recent work MH14PRX , we extended the WK approach to include non-linearity and the discrete nature of the input and output molecules $I$ and $O$ . Both these considerations are relevant in biological circuits where $R(I)$ is non-linear and the copy numbers of $I$ and $O$ are likely to be small. Starting from the exact master equation, valid for discrete populations and arbitrary $R(I)$ , we rigorously solved the original optimization problem for the error $E$ between output and input using the principles of umbral calculus Roman05 . The main results are as follows. For an arbitrary rate function expanded as,

$$R(I)=\sum_{n=0}^{\infty}\sigma_{n}v_{n}(I),\tag{6}$$

with $v_{n}(I)=\sum_{m=0}^{n}(n-m)!(-\bar{I})^{m}\begin{pmatrix}n\\ m\end{pmatrix}\begin{pmatrix}I\\ n-m\end{pmatrix}$ and $\sigma_{n}=\langle v_{n}(I)R(I)\rangle/(\bar{I}^{n}n!)$ , the relative error is given by the exact expression,

$$E=1-\frac{\bar{I}\gamma_{O}^{2}\sigma_{1}^{2}}{(\gamma_{I}+\gamma_{O})^{2}}\left[\gamma_{O}\sigma_{0}+\sum_{n=1}^{\infty}\sigma_{n}^{2}\frac{n!\,\gamma_{O}\bar{I}^{n}}{\gamma_{O}+n\gamma_{I}}\right]^{-1}.\tag{7}$$

The expression above is bounded from below by

$$E\geq E_{opt}\equiv\frac{2}{1+\sqrt{1+\tilde{\Lambda}}},\tag{8}$$

where $\tilde{\Lambda}=\bar{I}\sigma_{1}^{2}/(\sigma_{0}\gamma_{I})$ . The equality is only reached when $\gamma_{O}=\gamma_{I}\sqrt{1+\tilde{\Lambda}}$ and $R(I)$ is an optimal linear filter of the form, $R_{opt}(I)=\sigma_{0}+\sigma_{1}(I-\bar{I})$ , with all $\sigma_{n}=0$ for $n\geq 2$ . Obtaining the lower bound is important for noise reduction in biological networks as it provides insights into energy costs required to reduce the error Lestas10Nature .
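The exact expression for $E$ and its lower bound can be checked numerically. The sketch below evaluates the error for a truncated list of coefficients $\sigma_{n}$ ; the helper names `exact_error` and `error_bound`, and the truncation of the infinite sum at the length of the supplied list, are assumptions made for illustration.

```python
import math

def exact_error(sigma, I_bar, gamma_I, gamma_O):
    """Relative error E for coefficients sigma = [sigma_0, sigma_1, ...].
    The infinite sum is truncated at len(sigma) (an assumption of this sketch)."""
    prefactor = I_bar * gamma_O ** 2 * sigma[1] ** 2 / (gamma_I + gamma_O) ** 2
    bracket = gamma_O * sigma[0] + sum(
        sigma[n] ** 2 * math.factorial(n) * gamma_O * I_bar ** n
        / (gamma_O + n * gamma_I)
        for n in range(1, len(sigma))
    )
    return 1.0 - prefactor / bracket

def error_bound(sigma, I_bar, gamma_I):
    """Lower bound E_opt = 2 / (1 + sqrt(1 + Lambda_tilde))."""
    Lam = I_bar * sigma[1] ** 2 / (sigma[0] * gamma_I)
    return 2.0 / (1.0 + math.sqrt(1.0 + Lam))
```

With $\sigma=(1,1)$ , $\bar{I}=3$ and $\gamma_{I}=1$ , one has $\tilde{\Lambda}=3$ , so the bound is reached at $\gamma_{O}=2$ ; any other $\gamma_{O}$ , or any non-zero $\sigma_{n}$ with $n\geq 2$ , gives a larger error.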

Field theoretic formulation:

In order to generalize the results in our previous study MH14PRX to arbitrary regulatory networks, we adopt a many body approach pioneered by Doi and Peliti doi76JPA ; peliti85JP . Such an approach has been used in the study of a variety of reaction diffusion equations Lee95JSP ; Cardy98JSP . Besides suggesting plausible new ways of examining how signals are transmitted in biochemical reaction networks, the current theory shows how standard field theoretic methods can be adopted for use in control theory. By way of demonstrating its utility, we rederive the exact analytical solution (Eq.(7)) for the relative error in the push-pull network. In the Doi-Peliti formalism, the configurations at time $t$ in a locally interacting many body system are specified by the occupation numbers of each species on a lattice site $i$ . In our case, $I_{i}$ is the input population and $O_{i}$ is the output population. As a consequence of the stochastic dynamics, the on-site occupation numbers are modified. Arbitrarily many particles of either population are allowed to occupy any lattice site. In other words, $I_{i}$ , $O_{i}=0,1,\cdots\infty$ . The master equation for the local reaction scheme that governs the time evolution of the configurational probability with $I_{i}$ input and $O_{i}$ output molecules at site $i$ at time $t$ is obtained through the balance of gain and loss terms. The result is,

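$$\begin{aligned}\frac{dP(I_{i},O_{i},t)}{dt}={}&\gamma_{I}\left[(I_{i}+1)P(I_{i}+1,O_{i},t)-I_{i}P(I_{i},O_{i},t)\right]\\ &+F\left[P(I_{i}-1,O_{i},t)-P(I_{i},O_{i},t)\right]\\ &+\gamma_{O}\left[(O_{i}+1)P(I_{i},O_{i}+1,t)-O_{i}P(I_{i},O_{i},t)\right]\\ &+R(I_{i})\left[P(I_{i},O_{i}-1,t)-P(I_{i},O_{i},t)\right].\end{aligned}\tag{9}$$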

We use the Fock space representation to account for the changes in the site occupation number by integer values for the chemical reactions describing the network. Following Doi and Peliti, we introduce the bosonic ladder operator algebra with commutation relation $[a_{i},a_{j}]=0$ , $[a_{i},a_{j}^{\dagger}]=\delta_{ij}$ for the input population, allowing us to construct the input particle number eigenstates $|I_{i}\rangle$ obeying $a_{i}|I_{i}\rangle=I_{i}|I_{i}-1\rangle$ , $a_{i}^{\dagger}|I_{i}\rangle=|I_{i}+1\rangle$ , $a_{i}^{\dagger}a_{i}|I_{i}\rangle=I_{i}|I_{i}\rangle$ . A Fock state with $I_{i}$ particles on site $i$ is obtained from the vacuum state $|0\rangle$ , defined by the relation $a_{i}|0\rangle=0$ , and $|I_{i}\rangle={a_{i}^{\dagger}}^{I_{i}}|0\rangle$ . Similarly, we introduce annihilation and creation operators for output particles $b_{i}$ and $b_{i}^{\dagger}$ that commute with the input ladder operators: $[a_{i},b_{j}]=0=[a_{i},b_{j}^{\dagger}]$ .
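The ladder-operator relations above can be made concrete on a truncated Fock space. In the sketch below a state is a list of coefficients `c[n]` of $|n\rangle$ for a single species on one site; the cutoff and the helper names `annihilate`, `create` and `number` are assumptions of this illustration. Note the Doi convention $a|n\rangle=n|n-1\rangle$ , $a^{\dagger}|n\rangle=|n+1\rangle$ , which differs from the normalized quantum-mechanical one.

```python
def annihilate(state):
    """Apply a: the new coefficient of |n> is (n + 1) * c[n + 1]."""
    top = len(state) - 1
    return [(n + 1) * state[n + 1] for n in range(top)] + [0.0]

def create(state):
    """Apply a^dagger: shift every coefficient up by one level.
    Weight that would leave the truncated space is dropped."""
    return [0.0] + list(state[:-1])

def number(state):
    """Apply a^dagger a: multiplies the coefficient of |n> by n."""
    return create(annihilate(state))
```

Applying `annihilate(create(s))` and subtracting `create(annihilate(s))` returns `s` itself for any state with no weight at the cutoff, which is the commutation relation $[a,a^{\dagger}]=1$ restricted to the truncated space.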

Stochastic kinetics for the entire lattice is implemented by considering the master equation for the configurational probability $P(\{I_{i}\},\{O_{j}\},t)$ , given by a sum over all lattice points on the right hand side of Eq.(9), by noting that a general Fock state is constructed by the tensor product $|\{I_{i}\},\{O_{j}\}\rangle=\Pi_{i}|I_{i}\rangle|O_{i}\rangle$ . We define a time dependent formal state vector through a linear combination of all possible Fock states, weighted by their configurational probability at time $t$ ,

$$|\psi(t)\rangle=\sum_{\{I_{i}\},\{O_{j}\}}P(\{I_{i}\},\{O_{j}\},t)\prod_{\{i,j\}}(a_{i}^{\dagger})^{I_{i}}(b_{i}^{\dagger})^{O_{j}}|0\rangle.\tag{10}$$

This superposition state encodes the stochastic temporal evolution. We use standard methods to transform the time dependence from the linear master equation into an imaginary time Schrödinger equation, governed by a time-dependent stochastic evolution operator $H$ ,

$$\frac{d}{dt}|\psi(t)\rangle=-H(\{a\},\{a^{\dagger}\};\{b\},\{b^{\dagger}\})|\psi(t)\rangle.\tag{11}$$

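We may multiply Eq.(9) by ${a_{i}^{\dagger}}^{I_{i}}{b_{i}^{\dagger}}^{O_{i}}|0\rangle$ , and sum over all values of $I_{i},O_{i}$ . With the definition of the state $|\psi(t)\rangle$ ,

$$|\psi(t)\rangle=\sum_{I_{i},O_{i}}P(I_{i},O_{i},t)\,(a_{i}^{\dagger})^{I_{i}}(b_{i}^{\dagger})^{O_{i}}|0\rangle,\tag{12}$$

the $\gamma_{I}$ term, i.e. $\gamma_{I}[(I_{i}+1)P(I_{i}+1,O_{i},t)-I_{i}P(I_{i},O_{i},t)]$ , in Eq.(9) becomes,

$$\begin{aligned}&\gamma_{I}\sum_{I_{i},O_{i}}\left[(I_{i}+1)P(I_{i}+1,O_{i})-I_{i}P(I_{i},O_{i})\right](a_{i}^{\dagger})^{I_{i}}(b_{i}^{\dagger})^{O_{i}}|0\rangle\\ &=\gamma_{I}\sum_{I_{i},O_{i}}\left[P(I_{i}+1,O_{i})\,a_{i}(a_{i}^{\dagger})^{I_{i}+1}(b_{i}^{\dagger})^{O_{i}}|0\rangle-P(I_{i},O_{i})\,a_{i}^{\dagger}a_{i}(a_{i}^{\dagger})^{I_{i}}(b_{i}^{\dagger})^{O_{i}}|0\rangle\right].\end{aligned}\tag{13}$$

By relabeling the indices in the first sum, we arrive at the desired Hamiltonian expressed in second quantized representation as, $H_{\gamma_{I}}=-\gamma_{I}(1-a_{i}^{\dagger})a_{i}.$ Similarly, terms with coefficients $F$ , $\gamma_{O}$ and $R(I_{i})$ in Eq.(9) give the following contributions, $H_{F}=-F(a_{i}^{\dagger}-1),\quad H_{\gamma_{O}}=-\gamma_{O}(1-b_{i}^{\dagger})b_{i},\quad H_{R(I_{i})}=-R(a_{i}^{\dagger}a_{i})(b_{i}^{\dagger}-1).$ The total Hamiltonian $H$ takes the following form,

$$H=\sum_{i}\left[-\gamma_{I}(1-a_{i}^{\dagger})a_{i}-F(a_{i}^{\dagger}-1)-\gamma_{O}(1-b_{i}^{\dagger})b_{i}-R(a_{i}^{\dagger}a_{i})(b_{i}^{\dagger}-1)\right].\tag{14}$$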
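To see that the second-quantized operators reproduce the master equation, the sketch below applies $-(H_{\gamma_{I}}+H_{F})=\gamma_{I}(1-a^{\dagger})a+F(a^{\dagger}-1)$ to a probability vector for the input species alone, on a single site and in a truncated Fock space. The cutoff, the restriction to one species, and the helper names are assumptions of this illustration. A Poisson distribution with mean $F/\gamma_{I}$ should be stationary, so the result should vanish up to truncation error.

```python
import math

def annihilate(state):
    """a|n> = n|n-1> (Doi convention) on a coefficient list."""
    top = len(state) - 1
    return [(n + 1) * state[n + 1] for n in range(top)] + [0.0]

def create(state):
    """a^dagger|n> = |n+1>; weight leaving the truncated space is dropped."""
    return [0.0] + list(state[:-1])

def minus_H_input(state, F, gamma_I):
    """Apply gamma_I (1 - a^dagger) a + F (a^dagger - 1), i.e. d|psi>/dt."""
    a_state = annihilate(state)
    decay = [x - y for x, y in zip(a_state, create(a_state))]   # (1 - a^dagger) a
    birth = [x - y for x, y in zip(create(state), state)]       # (a^dagger - 1)
    return [gamma_I * d + F * b for d, b in zip(decay, birth)]

def poisson(mean, cutoff):
    """Poisson probabilities p[n] for n = 0 .. cutoff - 1."""
    return [math.exp(-mean) * mean ** n / math.factorial(n) for n in range(cutoff)]
```

Component $n$ of the output is $\gamma_{I}[(n+1)p_{n+1}-np_{n}]+F[p_{n-1}-p_{n}]$ , which is the input part of the master equation written for a single site.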

A convenient choice for the initial configuration for the master equation describing the stochastic particle reactions is an independent Poisson distribution at each site,

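$$P(\{I_{i}\},\{O_{i}\};0)=\prod_{i}P_{0}(I_{i})P_{0}(O_{i})=\prod_{i}e^{-\bar{I}_{0}}e^{-\bar{O}_{0}}\,\frac{\bar{O}_{0}^{O_{i}}\,\bar{I}_{0}^{I_{i}}}{I_{i}!\,O_{i}!},\tag{15}$$

with mean initial input and output concentrations $\bar{I}_{0}$ and $\bar{O}_{0}$ . Just as in quantum mechanics, Eq.(11) can be formally solved, leading to,

$$|\psi(t)\rangle=e^{-Ht}|\psi(0)\rangle,\tag{16}$$

with the initial state $|\psi(0)\rangle=e^{\bar{I}_{0}\sum_{i}(a_{i}^{\dagger}-1)+\bar{O}_{0}\sum_{i}(b_{i}^{\dagger}-1)}|0\rangle$ .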

Our goal is to compute averages and correlation functions with respect to the configurational probability $P(\{I_{i}\},\{O_{i}\};t)$ , which is accomplished by means of the projection state $\langle\mathcal{P}|=\langle 0|\Pi_{i}e^{a_{i}+b_{i}}$ , for which $\langle\mathcal{P}|0\rangle=1$ and $\langle\mathcal{P}|a_{i}^{\dagger}=\langle\mathcal{P}|=\langle\mathcal{P}|b_{i}^{\dagger}$ , since $[e^{a_{i}},a_{j}^{\dagger}]=e^{a_{i}}\delta_{ij}$ . The average value of an observable $A(\{I_{i}\},\{O_{i}\})$ is,

$$\langle A(t)\rangle=\sum_{\{I_{i}\},\{O_{i}\}}A(\{I_{i}\},\{O_{i}\})\,P(\{I_{i}\},\{O_{i}\};t),\tag{17}$$

from which the statistical average of an observable can be calculated using,

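$$\langle A(t)\rangle=\langle\mathcal{P}|A(\{a_{i}^{\dagger}a_{i}\},\{b_{i}^{\dagger}b_{i}\})|\psi(t)\rangle.\tag{18}$$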

We follow a well-established route in quantum many particle theory Negele88 , and proceed towards a field theory representation by constructing a path integral equivalent of the time dependent Schrödinger equation (Eq.(11)) based on coherent states Tauber14 . These are defined as right eigenstates of the annihilation operators, $a_{i}|\alpha_{i}\rangle=\alpha_{i}|\alpha_{i}\rangle$ and $b_{i}|\beta_{i}\rangle=\beta_{i}|\beta_{i}\rangle$ , with complex eigenvalues $\alpha_{i}$ and $\beta_{i}$ . The coherent states satisfy $|\alpha_{i}\rangle=\exp(-\frac{1}{2}|\alpha_{i}|^{2}+\alpha_{i}a_{i}^{\dagger})|0\rangle$ , the overlap integral $\langle\alpha_{j}|\alpha_{i}\rangle=\exp(-\frac{1}{2}|\alpha_{i}|^{2}-\frac{1}{2}|\alpha_{j}|^{2}+\alpha_{j}^{*}\alpha_{i})$ , and the completeness relation $\int\Pi_{i}\frac{d^{2}\alpha_{i}}{\pi}|\{\alpha_{i}\}\rangle\langle\{\alpha_{i}\}|=1$ . Splitting the temporal evolution (Eq.(11)) into infinitesimal increments, inserting the completeness relation at each time step, and carrying out additional manipulations leads to an expression for the configurational average,

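$$\langle A(t)\rangle\propto\int\prod_{i}d\alpha_{i}\,d\alpha_{i}^{*}\,d\beta_{i}\,d\beta_{i}^{*}\,A(\{\alpha_{i}\},\{\beta_{i}\})\,e^{-\mathcal{S}[\alpha_{i}^{*},\beta_{i}^{*},\alpha_{i},\beta_{i}]}.\tag{19}$$

The exponential statistical weight is determined by the action,

$$\mathcal{S}[\alpha_{i}^{*},\beta_{i}^{*},\alpha_{i},\beta_{i}]=\sum_{i}\int_{0}^{t_{f}}\left[\left\{\alpha_{i}^{*}(t)\frac{\partial\alpha_{i}(t)}{\partial t}+\beta_{i}^{*}(t)\frac{\partial\beta_{i}(t)}{\partial t}\right\}+H(\alpha_{i}^{*},\beta_{i}^{*},\alpha_{i},\beta_{i})\right]dt.\tag{20}$$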

Finally, by taking the continuum limit using $\sum_{i}\rightarrow a_{0}^{-d}\int d^{d}x$ , where $a_{0}$ is the lattice constant, $\alpha_{i}(t)\rightarrow\phi(x,t)$ , $\beta_{i}(t)\rightarrow\psi(x,t)$ and $\alpha_{i}^{*}(t)\rightarrow a_{0}^{d}\phi^{*}(x,t)$ , $\beta_{i}^{*}(t)\rightarrow a_{0}^{d}\psi^{*}(x,t)$ , the expectation value is represented by a functional integral,

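$$\langle A(t)\rangle\propto\int\prod_{i}\mathcal{D}[\phi^{*},\phi,\psi^{*},\psi]\,A(\{\phi\},\{\psi\})\,e^{-\mathcal{S}[\psi^{*},\phi^{*},\psi,\phi]}\tag{21}$$

with an effective action

$$\mathcal{S}[\psi^{*},\phi^{*},\psi,\phi]=\int_{0}^{t_{f}}\left[\left\{\psi^{*}(t)\frac{\partial\psi(t)}{\partial t}+\phi^{*}(t)\frac{\partial\phi(t)}{\partial t}\right\}+H(\psi^{*},\phi^{*},\psi,\phi)\right]dt.\tag{22}$$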

In the Hamiltonian (Eq.(14)), $a^{\dagger}$ and $b^{\dagger}$ are replaced by the field variables $\phi^{*}$ and $\psi^{*}$ , respectively. Similarly, $a$ and $b$ operators become $\phi$ and $\psi$ respectively.

The action in Eq.(22) encodes the stochastic master equation kinetics through four independent fields ( $\psi^{*},\phi^{*},\psi,\phi$ ). With this formulation, an immediate connection can be made to the response functional formulation using the Janssen - De Dominicis formalism for Langevin equations Dominicis76JPC ; Janssen76ZPB . In this approach, the response field enters at most quadratically in the pseudo-Hamiltonian, which may be interpreted as averaging over Gaussian white noise. With this in mind, we apply the non-linear Cole-Hopf transformation Cole51QAM ; Hopf50CPAM , in order to obtain quadratic terms in auxiliary fields, $\phi^{*}=e^{\bar{\phi}_{I}},\ \phi=e^{-\bar{\phi}_{I}}\phi_{I},\ \psi^{*}=e^{\bar{\psi}_{O}},\ \psi=e^{-\bar{\psi}_{O}}\psi_{O}$ , to the action in Eq.(22). The Jacobian for this variable transformation is unity, and the local particle densities are $\phi^{*}\phi=\phi_{I}$ and $\psi^{*}\psi=\psi_{O}$ . We obtain the following Hamiltonian,

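$$H=-\gamma_{I}\left(-\bar{\phi}_{I}+\frac{\bar{\phi}_{I}^{2}}{2}\right)\phi_{I}-F\left(\bar{\phi}_{I}+\frac{\bar{\phi}_{I}^{2}}{2}\right)-\gamma_{O}\left(-\bar{\psi}_{O}+\frac{\bar{\psi}_{O}^{2}}{2}\right)\psi_{O}-R(\phi_{I})\left(\bar{\psi}_{O}+\frac{\bar{\psi}_{O}^{2}}{2}\right).\tag{23}$$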

In the above equation, the exponential term has been expanded to second order. The rate equations are obtained through $\delta\mathcal{S}/\delta\bar{\psi}\mid_{\bar{\psi}=0}=0$ and $\delta\mathcal{S}/\delta\bar{\phi}\mid_{\bar{\phi}=0}=0$ . The terms quadratic in the auxiliary fields ( $\bar{\psi}$ and $\bar{\phi}$ ) encapsulate the second moment of the Gaussian white noise with zero mean.
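The second-order expansion and the resulting rate equations can be spelled out for the input species as follows. This is a sketch under two assumptions that are not stated explicitly in the text: that the kinetic term $\phi^{*}\partial_{t}\phi$ becomes $\bar{\phi}_{I}\partial_{t}\phi_{I}$ after the Cole-Hopf transformation and an integration by parts, and that boundary terms are dropped. The output species follows in the same way with $F\rightarrow R(\phi_{I})$ and $\gamma_{I}\rightarrow\gamma_{O}$ .

```latex
\begin{align*}
-\gamma_{I}\,(1-\phi^{*})\,\phi
  &= -\gamma_{I}\left(e^{-\bar{\phi}_{I}}-1\right)\phi_{I}
   \approx -\gamma_{I}\left(-\bar{\phi}_{I}+\tfrac{1}{2}\bar{\phi}_{I}^{2}\right)\phi_{I},\\
-F\,(\phi^{*}-1)
  &= -F\left(e^{\bar{\phi}_{I}}-1\right)
   \approx -F\left(\bar{\phi}_{I}+\tfrac{1}{2}\bar{\phi}_{I}^{2}\right),\\
\left.\frac{\delta\mathcal{S}}{\delta\bar{\phi}_{I}}\right|_{\bar{\phi}_{I}=0}
  = \frac{\partial\phi_{I}}{\partial t}+\gamma_{I}\phi_{I}-F = 0
  &\;\Longrightarrow\;
  \frac{d\phi_{I}}{dt}=F-\gamma_{I}\phi_{I},\\
\left.\frac{\delta\mathcal{S}}{\delta\bar{\psi}_{O}}\right|_{\bar{\psi}_{O}=0}=0
  &\;\Longrightarrow\;
  \frac{d\psi_{O}}{dt}=R(\phi_{I})-\gamma_{O}\psi_{O}.
\end{align*}
```

These are the deterministic parts of the chemical Langevin equations, Eq.(1), as expected.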

In order to obtain fluctuation corrections needed to calculate minimum error in signal transduction, we write the action in terms of fluctuating fields, $\delta\phi_{I}=\phi_{I}-\langle\phi_{I}\rangle$ and $\delta\psi_{O}=\psi_{O}-\langle\psi_{O}\rangle$ as,

[TABLE]

where we have expanded $R(\phi_{I})$ in a Taylor series,

$$R(\phi_{I})=\sum_{n=0}^{\infty}\frac{c_{n}}{n!}(\delta\phi_{I})^{n},\tag{25}$$

with constant coefficients $c_{n}$ . Note that this expansion differs from the one used in Eq.(6). The coefficients of $\bar{\phi}_{I}^{2}$ and $\bar{\psi}_{O}^{2}$ reflect the noise correlations in the Langevin description.

In Fourier space the action becomes

[TABLE]

where $\tilde{\Psi}$ represents the set $\{\bar{\phi}_{I},\bar{\psi}_{O}\}$ and ${\Psi}$ denotes $\{\phi_{I},\psi_{O}\}$ . The non-linear contribution to the action is $\mathcal{S}_{int}[\tilde{\Psi},\Psi]=\int_{w}\bar{\psi}_{O}[\frac{c_{2}}{2}\delta\phi_{I}(w_{1})\delta\phi_{I}(w-w_{1})]+\cdots$ . Physical quantities can be expressed in terms of correlation functions of fields $\Psi$ and $\tilde{\Psi}$ , taken with the statistical weight $e^{-\mathcal{S}[\tilde{\Psi},\Psi]}$ ,

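$$\langle\Psi\tilde{\Psi}\rangle=\frac{\int\mathcal{D}[i\tilde{\Psi}]\int\mathcal{D}[\Psi]\,\Psi\tilde{\Psi}\,e^{-\mathcal{S}[\tilde{\Psi},\Psi]}}{\int\mathcal{D}[i\tilde{\Psi}]\int\mathcal{D}[\Psi]\,e^{-\mathcal{S}[\tilde{\Psi},\Psi]}}.\tag{27}$$

In order to compute the correlation function involving response fields, it is useful to introduce the generating functional,

$$\mathcal{Z}[\tilde{J},J]=\left\langle\exp\int_{t}\sum_{\alpha}\left(\tilde{J}_{\alpha}(t)\tilde{\Psi}_{\alpha}(t)+J_{\alpha}(t)\Psi_{\alpha}(t)\right)\right\rangle\tag{28}$$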

where $\alpha$ represents the set $\{\phi_{I},\psi_{O}\}$ , for which the required correlation functions are obtained via functional derivatives of $\mathcal{Z}$ with respect to the appropriate source fields.

The procedure is readily implemented for the Gaussian theory with statistical weight $e^{-\mathcal{S}_{0}[\tilde{\Psi},\Psi]}$ . In Fourier space, we can write the harmonic function as,

[TABLE]

with the Hermitian coupling, a $4\times 4$ matrix $\mathcal{M}(w)$ . With the aid of Gaussian integrals, we obtain,

[TABLE]

From Eq.(30), we now directly infer the matrix of two point correlation functions in the Gaussian ensemble with the inverse of harmonic coupling matrix $\mathcal{M}$ .

III Applications:

As a first application we apply the field-theoretic formalism to the push-pull network, which can be exactly solved for the error (Eq.(5)). In the process we illustrate the way the diagrammatic expansion works in the context of signaling networks, making it possible to apply the theory to more complicated systems.

A. Push-Pull network:

The calculation of the error (Eq.(5)) in terms of the control variable, the average number of phosphatase molecules per cell ( $\bar{P}$ ), requires the correlation functions $\langle\delta O\delta I\rangle$ , $\langle\delta O^{2}\rangle$ and $\langle\delta I^{2}\rangle$ . These can be expressed in terms of the matrix elements $\left(\mathcal{M}^{-1}\right)_{mn}$ (Eq.(29)), where the subscripts $m$ and $n$ label the $m^{th}$ row and $n^{th}$ column, respectively. For example, $\left(\mathcal{M}^{-1}\right)_{33}$ is the correlation function $\langle\delta\phi_{I}(-w)\delta\phi_{I}(w)\rangle$ . The other correlation functions are obtained similarly. We can now compute the power spectra for the input and output molecules by evaluating the correlation functions of the kinase and substrate populations using Eq.(30). We use perturbation theory for the action corresponding to the push-pull network to compute the non-linear contribution to the correlation function.

We obtain the following expressions for the power spectra,

[TABLE]

The $\langle\cdots\rangle_{0}$ is taken with respect to the non-interacting theory ( $S_{int}[\tilde{\Psi},\Psi]=0$ in Eq.(II)). Using these functions, the error ( $E$ ) and gain ( $G$ ) are given by,

[TABLE]

By inserting the expressions for the correlation functions in Eq.(III) into Eq.(32), and integrating over $w$ , we obtain the minimum relative error for the linear push-pull network,

[TABLE]

Higher order corrections to the power spectra $\langle\delta\psi_{O}(-w)\delta\psi_{O}(w)\rangle$ are calculated using perturbation theory by evaluating the Feynman diagrams (Fig.(2)),

[TABLE]

For example, the second order contribution to the $\langle\delta\psi_{O}(-w)\delta\psi_{O}(w)\rangle$ arising from the loop in Fig.(2) is $\Omega_{2}^{2}\frac{2!\bar{I}^{2}}{\gamma_{O}(\gamma_{O}+2\gamma_{I})}$ (see Appendix A for details). The coefficient $\Omega_{2}^{2}$ is given by $\Omega_{2}^{2}=\frac{c_{2}^{2}}{4}+\frac{c_{3}^{2}}{4}+\frac{\bar{I}}{4}c_{2}c_{4}+\cdots$ . Higher order terms have a similar structure: for example, the third order contribution to the power spectra is $\Omega_{3}^{2}\frac{3!\bar{I}^{2}}{\gamma_{O}(\gamma_{O}+3\gamma_{I})}$ , with $\Omega_{3}^{2}=\frac{c_{3}^{2}}{36}+\frac{c_{4}^{2}}{16}+\frac{\bar{I}}{36}c_{3}c_{5}+\cdots$ . By evaluating all the diagrams in Fig.(2), we obtain the final expression for the relative error,

[TABLE]

The form of the result in Eq.(35) coincides with the exact expression (Eq.(7)) for the relative error previously obtained MH14PRX by using an entirely different approach based on umbral calculus. However, the coefficients $\Omega_{n}$ are expressed in terms of the coefficients $c_{n}$ used in the series for $R(I)$ (Eq.(25)) rather than $\sigma_{n}$ . The two kinds of coefficients are non-trivially related through,

[TABLE]

where $S_{pq}$ are Stirling numbers of the second kind. For all $n$ , the leading order term $\frac{c_{n}^{2}}{n!^{2}}$ of $\Omega_{n}$ is the same as the leading order term of $\sigma_{n}$ .
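
For readers who want to evaluate the $S_{pq}$ numerically, the sketch below generates Stirling numbers of the second kind from their standard recurrence and checks the textbook identity $x^{p}=\sum_{q}S_{pq}\,x(x-1)\cdots(x-q+1)$ , which converts ordinary powers into falling factorials. It does not reproduce the relation between $\sigma_{n}$ and $c_{n}$ given above; the function names are hypothetical and chosen only for illustration. One plausible reason such numbers appear is that, for a Poisson-distributed population, the factorial moments are simply $\langle I(I-1)\cdots(I-q+1)\rangle=\bar{I}^{q}$ .

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(p, q):
    """Stirling number of the second kind S(p, q) via the standard recurrence."""
    if p == q:
        return 1
    if q == 0 or q > p:
        return 0
    return q * stirling2(p - 1, q) + stirling2(p - 1, q - 1)


def falling_factorial(x, q):
    """x (x-1) ... (x-q+1), with the empty product equal to 1."""
    result = 1
    for j in range(q):
        result *= (x - j)
    return result


def power_via_stirling(x, p):
    """Rebuild x**p from falling factorials weighted by S(p, q)."""
    return sum(stirling2(p, q) * falling_factorial(x, q) for q in range(p + 1))


print([stirling2(4, q) for q in range(5)])
print(power_via_stirling(7, 5), 7 ** 5)
```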

The sum within the bracket in Eq.(35) is composed of non-negative terms. The minimal $E$ is therefore obtained by setting $\Omega_{n}=0$ for all $n\geq 2$ . Thus, $E$ is bounded from below by $E\geq 1-\frac{\bar{I}\gamma_{O}^{2}\sigma_{1}^{2}}{(\gamma_{I}+\gamma_{O})^{2}}\left[\gamma_{O}\sigma_{0}+\sigma_{1}^{2}\frac{\gamma_{O}\bar{I}}{\gamma_{O}+\gamma_{I}}\right]^{-1}$ . The term on the right hand side is minimized with respect to $\gamma_{O}$ when $\gamma_{O}=\gamma_{I}\sqrt{1+\tilde{\Lambda}}$ , with $\tilde{\Lambda}=\bar{I}\sigma_{1}^{2}/(\sigma_{0}\gamma_{I})$ . At the optimal $\gamma_{O}$ , the bound becomes $E=2/(1+\sqrt{1+\tilde{\Lambda}})\equiv E_{opt}$ . As $\sigma_{1}$ increases, $\tilde{\Lambda}$ becomes large, which is desirable for high fidelity signal transduction. As long as $R(I)$ is approximately linear in the vicinity of $\bar{I}$ , the corrections $\sigma_{n}$ (or $\Omega_{n}$ ) for $n>2$ are negligible, and $E$ is close to $E_{opt}$ . However, the coefficients $\sigma_{n}$ for $n>2$ must become non-negligible when $\sigma_{1}$ is sufficiently large. Such a highly sigmoidal input-output response, known as ultra-sensitivity Goldbeter81PNAS , is biologically realizable in certain regimes of signaling cascades. In the limit of a nearly step-like response, the non-linearity in $R(I)$ becomes appreciable around $\bar{I}$ , distorting the output signal and leading to an $E$ that is larger than $E_{opt}$ . Because $E$ increases with $\tilde{\Lambda}$ in this limit, the benefits of ultra-sensitivity vanish.
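
The optimality condition can be checked numerically. The sketch below evaluates the lower bound quoted above on a logarithmic grid of $\gamma_{O}$ values and compares the grid minimum with $\gamma_{O}=\gamma_{I}\sqrt{1+\tilde{\Lambda}}$ and $E_{opt}=2/(1+\sqrt{1+\tilde{\Lambda}})$ . The function names, the grid, and the parameter values (chosen so that $\tilde{\Lambda}=100$ ) are assumptions for illustration only.

```python
import math


def error_lower_bound(gamma_O, gamma_I, I_bar, sigma0, sigma1):
    """Right-hand side of the bound on E (all Omega_n = 0 for n >= 2)."""
    prefactor = I_bar * gamma_O**2 * sigma1**2 / (gamma_I + gamma_O) ** 2
    bracket = gamma_O * sigma0 + sigma1**2 * gamma_O * I_bar / (gamma_O + gamma_I)
    return 1.0 - prefactor / bracket


def optimal_gamma_O(gamma_I, lam):
    return gamma_I * math.sqrt(1.0 + lam)


def e_opt(lam):
    return 2.0 / (1.0 + math.sqrt(1.0 + lam))


# Assumed illustrative parameters giving Lambda-tilde = 100.
gamma_I, I_bar, sigma0, sigma1 = 0.01, 100.0, 1.0, 0.1
lam = I_bar * sigma1**2 / (sigma0 * gamma_I)

# Scan gamma_O over three decades above gamma_I on a logarithmic grid.
grid = [gamma_I * 10 ** (k / 1000.0) for k in range(3001)]
best = min(grid, key=lambda g: error_lower_bound(g, gamma_I, I_bar, sigma0, sigma1))

print(best, optimal_gamma_O(gamma_I, lam))
print(error_lower_bound(best, gamma_I, I_bar, sigma0, sigma1), e_opt(lam))
```

A grid scan is used instead of a library optimizer to keep the example dependency-free; the bound is flat near its minimum, so a coarse grid already reproduces $E_{opt}$ closely.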

B. Signaling Cascades:

A natural extension is to consider a cascade created by an array of connected push-pull networks. Indeed, in some biological signaling pathways an external perturbation is transmitted through a cascade of reactions involving successive activation by kinases and deactivation by phosphatases. An example is the stimulation of a receptor tyrosine kinase by epidermal growth factor, which results in downstream responses of the MAPK network Heinrich02MC ; Schoeberl02Nature .

Because sections $B$ , $C$ and $D$ are related, we briefly summarize the results in order to make the relationship between these sections clear. In this section we describe the two-step cascade network using the field theory framework, and the coarse-graining procedure needed to obtain an analytic expression for the optimal error. In section $C$ , we show that the two-step cascade network behaves as a noise filter with a time delay, $\alpha$ . By mapping the cascade to a push-pull network with an intermediate, we show in section $D$ that $\alpha$ can be calculated explicitly. Thus, the results in the three sections provide an analytic theory for optimal signaling in the two-step cascade network.

Consider a two-step series enzymatic cascade (Fig.(3)) modeled as a sequence of two enzymatic push-pull loops stimulated by an upstream enzyme. In the first loop, an upstream enzyme $K$ phosphorylates the substrate $S$ to produce $S^{*}$ , converting it from an inactive to an active state. A phosphatase ( $P$ ) dephosphorylates $S^{*}$ back to the inactive state $S$ . In the second loop, $S^{*}$ acts as the enzyme for the phosphorylation of $T$ , with $P$ as the corresponding phosphatase. The series of chemical reactions involved in this cascade is,

[TABLE]

where $S_{K}$ , $S^{*}_{P}$ , $S^{*}_{T}$ and $T^{*}_{P}$ are the reaction intermediates, and $k_{ib}$ , $k_{iu}$ , $k_{ir}$ , $\rho_{ib}$ , $\rho_{iu}$ and $\rho_{ir}$ , $i=1,2$ , are the rate constants of the stochastic biochemical reactions in the cascade. The input signal $K+S_{K}$ is transduced into the active substrate output $T^{*}+T_{P}^{*}$ . In an insightful article Heinrich02MC , a deterministic approach was used to analyze the system of chemical reactions in Eq.(III). Here we assume that the reactions are stochastic. In order to develop analytical results we only consider fluctuations of all species that deviate linearly from their mean values. The validity of this assumption is established by comparing the results with kinetic Monte Carlo (KMC) simulations.

For the network in Fig.(3), the procedure outlined earlier leads to a Schrödinger-like equation with the following Hamiltonian,

[TABLE]

We can approximately map the two-step cascade into a two-species coarse-grained network, which acts like a noise filter, as described in detail in Ref. MH14PRX . Consider a signaling pathway (Fig.(1)) with time varying input $I(t)$ and time varying output $O(t)$ . These are the total populations (free and bound) of the input and output active kinases, with $I=K+S_{K}$ and $O=T^{*}+T_{P}^{*}$ . The upstream pathway provides an effective production rate $F$ of input $I$ , while the output $O$ results from the reaction $I\xrightarrow[]{R(I)}I+O$ . As before, $\gamma_{I}$ and $\gamma_{O}$ are the degradation rates for the input and output respectively, mimicking the role of phosphatase. The input and output correlation functions, evaluated using the field theory formalism, have the approximate structure,

[TABLE]

where we have used the linear approximation $R(I)\approx R_{0}\bar{I}+R_{1}(I-\bar{I})$ with $R_{0},\ R_{1}>0$ . Optimality is achieved when $\gamma_{O}=\gamma_{I}\sqrt{1+\Lambda}$ with gain $G=R_{1}/(\gamma_{I}(\sqrt{1+\Lambda}-1))$ . The relative error then attains its minimum, $E_{WK}=2/(1+\sqrt{1+\Lambda})$ . As before, the fidelity between output and input is controlled by a single dimensionless control parameter $\Lambda=(R_{1}/\gamma_{I})(R_{1}/R_{0})^{2}$ . This mapping allows us to use the general WK result for the gain ( $G$ ) and the minimum relative error ( $E_{WK}$ ) to predict the optimality condition, and hence the minimum possible value of $E$ . The results for the error in terms of the mean number of phosphatase molecules are given by the red lines in Fig.(4).

In order to test the accuracy of our theory we simulated the dynamics of the enzymatic cascade using the KMC method. The relative error $E$ shown in Fig.(4) is in excellent agreement with the theoretical predictions. Interestingly, $E$ achieves a minimum at $\bar{P}=10^{5}$ molecules/cell, which is ten times larger than the phosphatase concentration in the one step enzymatic push-pull loop using similar parameters. Fig.(4) shows that there is a well defined narrow range of phosphatase population in which the error is minimum. The range decreases as $\Lambda$ decreases (Fig.(4)). The minimum value for the relative error does not reach the value predicted by the WK limit (Eq.(5)). As we show below, the additional error arises from an effective time delay as the signal passes from one cascade to another. We also demonstrate that the time delay can alternatively be mimicked by reducing the two cascade system to a coarse-grained pathway with an intermediate (Fig.(3b)).
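
To give a flavor of how a KMC estimate of the error is obtained, the sketch below runs a Gillespie simulation of the coarse-grained two-species network ( $\emptyset\to I$ at rate $F$ , $I\to\emptyset$ at rate $\gamma_{I}I$ , $I\to I+O$ at rate $R(I)$ , $O\to\emptyset$ at rate $\gamma_{O}O$ ), not of the full cascade. Several ingredients are assumptions made for this illustration: the purely linear production $R(I)=\sigma_{1}I$ , the parameter values, the function names, and the estimator $E=1-\langle\delta O\delta I\rangle^{2}/(\langle\delta I^{2}\rangle\langle\delta O^{2}\rangle)$ , taken here as the relative error at optimal gain. The closed form used for comparison follows from the second-moment equations of this linear network and was derived for this sketch.

```python
import math
import random


def simulate_push_pull(F, gamma_I, sigma1, gamma_O, n_events, seed=0):
    """Gillespie run of the linear two-species network; returns time-averaged statistics."""
    rng = random.Random(seed)
    I = round(F / gamma_I)                  # start near the mean populations
    O = round(sigma1 * I / gamma_O)
    t = 0.0
    sI = sO = sII = sOO = sIO = 0.0         # time-weighted accumulators
    for _ in range(n_events):
        r_in, r_deg_I = F, gamma_I * I
        r_out, r_deg_O = sigma1 * I, gamma_O * O
        total = r_in + r_deg_I + r_out + r_deg_O
        dt = rng.expovariate(total)         # waiting time until the next reaction
        sI += I * dt
        sO += O * dt
        sII += I * I * dt
        sOO += O * O * dt
        sIO += I * O * dt
        t += dt
        u = rng.random() * total            # choose which reaction fires
        if u < r_in:
            I += 1
        elif u < r_in + r_deg_I:
            I -= 1
        elif u < r_in + r_deg_I + r_out:
            O += 1
        else:
            O -= 1
    mI, mO = sI / t, sO / t
    var_I = sII / t - mI * mI
    var_O = sOO / t - mO * mO
    cov = sIO / t - mI * mO
    return {"mean_I": mI, "mean_O": mO, "E": 1.0 - cov * cov / (var_I * var_O)}


def exact_error(gamma_I, sigma1, gamma_O):
    """Closed form for R(I) = sigma1 * I from the linear second-moment equations."""
    s = gamma_I + gamma_O
    return 1.0 - sigma1 * gamma_O / (s * (s + sigma1))


# Assumed parameters; gamma_O is set to gamma_I * sqrt(1 + Lambda) with Lambda = sigma1 / gamma_I.
F, gamma_I, sigma1 = 10.0, 1.0, 5.0
gamma_O = gamma_I * math.sqrt(1.0 + sigma1 / gamma_I)
result = simulate_push_pull(F, gamma_I, sigma1, gamma_O, n_events=400_000)
print(result["E"], exact_error(gamma_I, sigma1, gamma_O))
```

Averages are weighted by the waiting time between events, which is what a stationary time average requires; averaging over events instead would bias the moments.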

C. Noise filtering with time delay:

In order to prove that the two-cascade loop effectively acts like a noise filter with time delay, we derive the condition for minimum error for the latter following the Bode-Shannon formulation of the WK theory Bode50PIRE . In this scenario, the transmitted signal can only be recovered after a constant delay, $\alpha$ . The output $O(t)$ is produced by convolving the corrupted signal (input $GI(t)$ + noise $n(t)$ ) with a causal filter $H(t)$ . In Fourier space, we obtain,

[TABLE]

where $x(w)=\int_{-\infty}^{\infty}dt\,x(t)e^{-iwt}$ for the time series $x(t)$ . The relative error is given by Bode50PIRE ,

[TABLE]

where $P_{I}(w)$ and $P_{n}(w)$ are the power spectral densities (PSDs) of $GI(t)$ and $n(t)$ respectively. We need to minimize $E$ in Eq.(41) over all possible $H(w)$ , with the condition that $H(t)=0$ for $t<\alpha$ . The optimal causal filter has the following form Bode50PIRE ; Becker15PRL ; Hathcock16 ,

[TABLE]

The $y$ super and subscript refer to two different decompositions in the frequency domain. Causality can be enforced by noting the following conditions: (i) Any physical PSD, in this case $P_{c}(w)$ corresponding to the corrupted signal $c(t)=GI(t)+n(t)$ , can be written as $P_{c}(w)=|P_{c}^{y}(w)|^{2}$ . The factor $P_{c}^{y}(w)$ , if treated as a function in the complex $w$ plane, does not have zeros and poles in the upper half-plane ( $\text{Im}\,w>0$ ). (ii) We also define an additive decomposition denoted by $\{F(w)\}_{y}$ for any function $F(w)$ , which consists of all terms in the partial fraction expansion of $F(w)$ with no poles in the upper half-plane. By using the PSDs, $P_{I}(w)=\frac{2G^{2}\gamma_{I}\bar{I}}{w^{2}+\gamma_{I}^{2}}$ and $P_{c}(w)=\frac{2G^{2}\gamma_{I}\bar{I}}{w^{2}+\gamma_{I}^{2}}+\frac{2G^{2}}{\gamma_{I}\Lambda}$ , we obtain the following optimal filter $H_{WK}(w)$ ,

[TABLE]

In the limit $\alpha\ll\gamma_{I}^{-1}$ , the optimal error $E_{WK}$ takes the following form Hathcock16 ,

[TABLE]

where the second term in the above equation is the correction, due to the time delay, to the WK minimum value of the relative error for an instantaneous filter ( $\alpha\rightarrow 0$ ). The correction is positive for all values of $\alpha$ and $\Lambda$ , which implies that a time delay must increase the error in signal transmission. If we add this correction to the WK minimum result for the relative error of the instantaneous filter (Eq.(5)), for specific values of $\alpha$ calculated explicitly in the following section, we recover the minimum relative error in the signaling cascade. Thus, the two-step enzymatic cascade minimizes the noise but behaves like a single-step network with a time delayed filter.

D. Deriving the time delay $\alpha$ by mapping onto a three-species pathway with an intermediate:

Alternatively, we can derive an explicit expression for the delay parameter $\alpha$ by using a different mapping for the original cascade. Instead of mapping onto a two-species network of $I$ and $O$ with a time delay, we map onto a three-species network (Fig.(3b)) with $I$ , $M$ , and $O$ . Here there is no explicit time delay, but an additional species $M$ that plays the role of a “pseudo” intermediate mimicking the effect of the time delay. This network is governed by the reactions: $\phi\xrightarrow[]{F}I$ , $I\xrightarrow[]{R_{a}(I)}I+M$ , $M\xrightarrow[]{R_{b}(M)}M+O$ , $I\xrightarrow[]{\gamma_{I}}\phi$ , $M\xrightarrow[]{\gamma_{M}}\phi$ and $O\xrightarrow[]{\gamma_{O}}\phi$ . The production functions have the linear form: $R_{a}(I)=\sigma_{a0}+\sigma_{a1}(I-\bar{I})$ and $R_{b}(M)=\sigma_{b0}+\sigma_{b1}(M-\bar{M})$ . Earlier analysis of this network Hathcock16 has shown that it behaves like a time delayed filter, with the minimal error in the same form as Eq.(44), with $\alpha=\gamma_{M}^{-1}$ and effective $\Lambda=\Lambda_{b}\sqrt{1+\Lambda_{a}}$ , where $\Lambda_{a}=\bar{I}\sigma_{a1}^{2}/(\sigma_{a0}\gamma_{I})$ and $\Lambda_{b}=\bar{M}\sigma_{b1}^{2}/(\sigma_{b0}\gamma_{M})$ .
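
The effective filter parameters of the three-species pathway are simple to evaluate once the rates are known. The sketch below computes $\Lambda_{a}$ , $\Lambda_{b}$ , the effective $\Lambda=\Lambda_{b}\sqrt{1+\Lambda_{a}}$ and the delay $\alpha=\gamma_{M}^{-1}$ from the expressions above. The function name and all numerical values are hypothetical; the mean $\bar{M}=\sigma_{a0}/\gamma_{M}$ used in the example is the steady state implied by the listed reactions for linear $R_{a}$ .

```python
import math


def effective_parameters(I_bar, M_bar, sigma_a0, sigma_a1,
                         sigma_b0, sigma_b1, gamma_I, gamma_M):
    """Return (Lambda_a, Lambda_b, effective Lambda, delay alpha)."""
    lam_a = I_bar * sigma_a1**2 / (sigma_a0 * gamma_I)
    lam_b = M_bar * sigma_b1**2 / (sigma_b0 * gamma_M)
    lam_eff = lam_b * math.sqrt(1.0 + lam_a)
    alpha = 1.0 / gamma_M
    return lam_a, lam_b, lam_eff, alpha


# Assumed illustrative rates (arbitrary units).
gamma_I, gamma_M = 1.0, 10.0
I_bar = 100.0
sigma_a0, sigma_a1 = 300.0, 3.0
sigma_b0, sigma_b1 = 15.0, 5.0
M_bar = sigma_a0 / gamma_M      # steady-state mean of the pseudo intermediate

lam_a, lam_b, lam_eff, alpha = effective_parameters(
    I_bar, M_bar, sigma_a0, sigma_a1, sigma_b0, sigma_b1, gamma_I, gamma_M)
print(lam_a, lam_b, lam_eff, alpha)
```

In this example $\alpha=0.1$ is small compared with $\gamma_{I}^{-1}=1$ , which is the regime in which the small-delay form of the optimal error applies.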

The original signaling cascade (Fig.(3a)) can be mapped onto the three-species pathway (Fig.(3b)). This involves identifying the population $S^{*}+S_{P}^{*}=M$ as a “pseudo” intermediate, with an effective degradation $\gamma_{M}$ . The mapping can be carried out by comparing PSDs between the two models. For the three-species network these are given by,

[TABLE]

Now, the PSDs for signaling cascade calculated from Doi-Peliti formalism are given by

[TABLE]

where the $w$ -independent parameters $n_{\delta I,i}$ , $n_{\delta O,i}$ , $d_{\delta I,i}$ and $d_{\delta O,i}$ are related to the rate coefficients in the cascade reactions (Eq.(III)). Here, $N=7$ corresponds to the number of independent dynamical variables ( $K,S_{K},S^{*},S_{P}^{*},T,T_{P}^{*}$ and $T^{*}$ ). By mapping Eq.(46) into Eq.(45), we can extract the degradation rate of the intermediate species ( $S^{*}+S_{P}^{*}$ ), $\gamma_{M}$ , in terms of the coefficients in Eq.(46),

[TABLE]

with $A=\frac{d_{\delta O,2}}{d_{\delta O,3}}-\gamma_{I}^{2}$ and $B=A\gamma_{I}^{2}-\frac{d_{\delta O,1}}{d_{\delta O,3}}$ . The time delay parameter $\alpha=\gamma_{M}^{-1}$ in the signaling cascade. With this identification for $\alpha$ we have a complete theory for $E$ , with no adjustable parameter, as a function of the control parameter, the mean phosphatase levels. It is tempting to speculate that a multiple ( $>2$ ) step cascade might also be mathematically equivalent to a network with a single pseudo intermediate.

E. Enzymatic Push-Pull Loop:

In considering the cascade model, we focused on the case where fluctuations around the mean population levels were small enough that the linear approximation is valid. To study the effects of non-linearity, we will look at a simpler system (one stage of the cascade) but without any constraints on the size of the fluctuations. A microscopic model for the enzymatic push-pull network is shown in Fig.(5). The upstream enzyme $K$ phosphorylates a substrate $S$ to $S^{*}$ , thereby converting it from an inactive to an active state. The effective production rate in the upstream pathway for enzyme $K$ is $F$ . The degradation rate for $K$ is $\gamma_{K}$ . The enzyme is either free ( $K$ ) or bound to substrate ( $S_{K}$ ). The input $I$ is the total enzyme population $I=K+S_{K}$ . The phosphatase $P$ , on the other hand, dephosphorylates the active substrate $S^{*}$ to the inactive state $S$ . The output of the phosphorylation cycle is $O=S^{*}+S_{P}^{*}$ .

The biochemical reactions for the enzymatic network with the corresponding rate constants are,

[TABLE]

In the stochastic chemical reactions that govern the phosphorylation/dephosphorylation steps, the input signal $I=K+S_{K}$ is transduced into the active substrate output $S^{*}+S_{P}^{*}$ . To derive the conditions for optimality, we follow the procedure outlined in the previous section. Starting from the master equation, we can derive a Schrödinger-like equation with the following Hamiltonian,

[TABLE]

The field variables $\bar{\phi}$ are associated with the creation operators of the corresponding populations. Similarly, the $\phi$ correspond to annihilation operators. Using the coherent-state path integral formalism, we arrive at the expression for the action corresponding to the enzymatic push-pull loop, from which we calculate the power spectra for the input and output.

As in the signaling cascade network described in the previous section, we approximately map the complete enzymatic network into a noise filter MH14PRX . The input and output correlation functions, evaluated using the field theory formalism, have the approximate structure given in Eq.(III). Starting from the full dynamical equations (Eq.(III)), we compute the correlation functions using field theory and solve the Wiener-Hopf relation in Eq.(3) for the optimal function $H_{WK}(t)$ .

The correlation functions of input and output calculated for the enzymatic push-pull loop have the approximate form of Eq.(III), with effective values of the parameters $\gamma_{I}$ , $\gamma_{O}$ , $R_{1}$ and $\Lambda$ expressed in terms of the loop reaction rate parameters. This mapping allows us to use the WK result for the gain ( $G$ ) and the minimum relative error ( $E_{WK}$ ) to predict the optimality condition and the minimum possible value of $E$ . The results for the error in terms of the mean number of phosphatase molecules are given by the solid lines in Fig.(6).

In order to illustrate the accuracy of the theory we performed KMC simulations with the following forward and backward reaction rates in Eq.(III) for the enzymatic push-pull loop network (all units are in $s^{-1}$ ): $k_{b}=\rho_{b}=10^{-5}$ , $k_{u}=0.02$ , $\rho_{u}=0.5$ , $k_{r}=3$ , $\rho_{r}=0.3$ , $F=1$ . The deactivation rate of enzyme $K$ , $\gamma_{K}=0.01s^{-1}$ , controls the characteristic time scale over which the input signal varies, mimicking the role of phosphatase. Mean free substrate and phosphatase populations are in the ranges $\bar{S}=\bar{P}\sim 10^{3}-10^{5}$ molecules/cell. Fig.(6) shows that $E$ is a minimum at a particular value of the phosphatase concentration $\bar{P}$ , where the optimality condition $\gamma_{O}=\gamma_{I}\sqrt{1+\Lambda}$ is satisfied. For $\Lambda=100$ , we find a minimum error $E=0.18$ for the enzymatic push-pull loop. The results of the KMC simulations (purple circles) are in excellent agreement with the analytical calculation (blue line) for all $\Lambda$ values.

In the parameter space used for the results in Fig.(6), a linear theory reproduces the simulation results well. However, deviations from the predictions of the linear theory are expected if the input parameters are varied. In order to investigate these deviations we first obtained the error from KMC simulations using the parameter values $k_{b}=\rho_{b}=10^{-3}$ , $k_{u}=0.02$ , $\rho_{u}=0.5$ , $k_{r}=3$ , $\rho_{r}=0.3$ . The relative error for $\Lambda=100$ is shown as the purple line in Fig.(7). The blue line, calculated from the linear theory, deviates substantially from the simulations (purple line in Fig.(7)). To improve the predictions of the theory we calculated second order corrections to $E$ . The result, displayed as the green curve in Fig.(7), shows that there is improved agreement between theory and simulations. The non-linear corrections, which are substantial, bring the theoretical predictions closer to the simulation results, especially near the values of $\bar{S}$ for which the error is a minimum (Fig.(7)). We suspect that higher order perturbative corrections will further improve the results, based on the following observation. We fit the dependence of the error for $\bar{S}>1600$ using the function $E(\bar{S})=a+b(\bar{S}-\bar{S}_{\text{min}})^{1.4}+c(\bar{S}-\bar{S}_{\text{min}})^{2}$ , where $a$ , $b$ and $c$ are constants and $\bar{S}_{\text{min}}$ is the value of $\bar{S}$ at which the error attains its minimum, $E(\bar{S}_{\text{min}})=a$ . The functional form of $E(\bar{S})$ is the same for the exact simulation results and for the predictions of the linear and non-linear theory, except that the coefficients $a$ , $b$ and $c$ are different. We therefore surmise that higher order terms merely renormalize the coefficients, leaving the form of the relative error unaltered. Consequently, we conclude that improved estimates of $a$ , $b$ and $c$ from third and higher order contributions should produce predictions in better agreement with simulation.

IV Concluding Remarks:

In order to assess the accuracy of signal transmission, using the mean square of the error between input and output as a fidelity measure, we have developed a field theoretic formulation that allows us to predict conditions for optimal information transfer for an arbitrary stochastic chemical reaction network. The starting point is the classical master equation for interacting particle systems, which is mapped to a non-Hermitian ’quantum’ many-body Hamiltonian dynamics. Finally, the coherent-state path integral representation is utilized to arrive at a continuum field theory description that faithfully incorporates the intrinsic reaction noise and discreteness of the original stochastic processes. The formulation allows us to use standard field theory methods to compute the relative error in the information transfer using perturbation theory to all orders in non-linearity. This approach leads to an analytical expression for the minimum relative error in signal transduction. The usefulness of the general field theory formulation is illustrated through signaling networks of increasing complexity.

Detailed study of an enzymatic push-pull loop, the basic unit involved in complex signaling pathways, shows that it behaves like an optimal linear WK noise filter, as previously established using entirely different methods MH14PRX . In this particular case, the joint probability $P(\delta I,\delta O)$ is approximately bivariate Gaussian, which means the error $E$ is also directly related to the mutual information $M$ in bits between $\delta I$ and $\delta O$ as $E=2^{-2M}$ Hathcock16 .
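
The relation $E=2^{-2M}$ can be inverted to read off the mutual information from a given error, $M=-\frac{1}{2}\log_{2}E$ . The short sketch below does this conversion in both directions; the function names are hypothetical, and the example value $E=0.18$ is the minimum error quoted in Section E for $\Lambda=100$ . The conversion is only meaningful in the approximately Gaussian regime described above.

```python
import math


def mutual_information_bits(E):
    """Invert E = 2**(-2 M) for the mutual information M in bits (Gaussian case)."""
    if not 0.0 < E <= 1.0:
        raise ValueError("relative error must lie in (0, 1]")
    return -0.5 * math.log2(E)


def error_from_bits(M):
    """Relative error corresponding to M bits of mutual information."""
    return 2.0 ** (-2.0 * M)


M_example = mutual_information_bits(0.18)
print(M_example, error_from_bits(M_example))
```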

The two-stage enzymatic cascade behaves as an optimal filter without achieving the minimum predicted by the WK theory. We attribute the deviation to the time delayed response of the cascade. By mapping the cascade signaling network to a three-species push-pull like model with a pseudo intermediate state we derived an explicit expression for the time delay. We show that the time delay is associated with the degradation rate of the pseudo intermediate state in the coarse-grained representation of the two-step cascade. We also demonstrate that in those cases where the linear approximation breaks down, systematic perturbative corrections can be calculated using our theory, which minimize the difference between the findings in the simulations and theoretical predictions. The success in this example illustrates the power of the formalism. Analyzing experimental data using the framework introduced here will help decipher the design principles governing signaling networks in biology, and allow us to understand the constraints imposed by noise in information transfer.

Acknowledgements: We are grateful to the National Science Foundation (CHE 16-61946) for supporting our work. Much of this work was carried out while the authors were in the Institute for Physical Sciences and Technology in the University of Maryland, College Park.

Appendix A: Second order loop correction to the signaling error for the push-pull network:

Here, we illustrate the calculation of $E$ arising from perturbation expansion of the field theory for the push-pull network with non-linearity explained in the text. To second order the diagram needed to compute $E$ is,

[TABLE]

In the first line of the above equation, we perform the complex integration in the upper half plane by evaluating the residues at poles $w_{1}=i\gamma_{I}$ and $w_{1}=w+i\gamma_{I}$ , respectively. Similarly, in the second line we calculate the residues at poles $w=i\gamma_{O}$ and $w=2i\gamma_{I}$ .

The coefficient, $\Omega_{2}^{2}$ (Eq.(35)), is diagrammatically represented as,

where the expression for the loop in the first bracket is $2\gamma_{I}\bar{I}\int\frac{dw_{1}}{2\pi}\frac{1}{(w_{1}^{2}+\gamma_{I}^{2})}=\bar{I}$ . The coefficients $\Omega_{n}$ in Eq.(35) are functions of $c_{n}$ . In turn, the $\Omega_{n}$ and the $\sigma_{n}$ are connected through the relation between $\sigma_{n}$ and $c_{n}$ (see main text). For all $n$ , the leading order term ( $\frac{c_{n}^{2}}{n!^{2}}$ ) of $\Omega_{n}$ and $\sigma_{n}$ is identical.
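
The contour integrals in this appendix can be cross-checked by direct quadrature. The sketch below evaluates the single-loop integral $2\gamma_{I}\bar{I}\int\frac{dw_{1}}{2\pi}\frac{1}{w_{1}^{2}+\gamma_{I}^{2}}$ and an integral with poles at $w=i\gamma_{O}$ and $w=2i\gamma_{I}$ , and compares them with $\bar{I}$ and $2!\,\bar{I}^{2}/(\gamma_{O}(\gamma_{O}+2\gamma_{I}))$ . The second integrand, including its prefactor $8\gamma_{I}\bar{I}^{2}$ , is an assumption chosen to match the pole structure and the result quoted in the text; the function names and parameter values are likewise illustrative.

```python
import math


def integrate_real_line(f, n=20000):
    """Midpoint rule for the integral of f over the real line, using w = tan(theta)."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        theta = -0.5 * math.pi + (k + 0.5) * h
        w = math.tan(theta)
        total += f(w) * (1.0 + w * w)     # dw = (1 + w^2) dtheta
    return total * h


# Assumed illustrative parameters.
gamma_I, gamma_O, I_bar = 0.5, 2.0, 10.0

# Single loop: 2 gamma_I I_bar * int dw/(2 pi) 1/(w^2 + gamma_I^2).
loop = 2.0 * gamma_I * I_bar / (2.0 * math.pi) * integrate_real_line(
    lambda w: 1.0 / (w * w + gamma_I**2))

# Second-order term: Lorentzians with poles at i*gamma_O and 2i*gamma_I.
second = 8.0 * gamma_I * I_bar**2 / (2.0 * math.pi) * integrate_real_line(
    lambda w: 1.0 / ((w * w + gamma_O**2) * (w * w + 4.0 * gamma_I**2)))

expected_second = 2.0 * I_bar**2 / (gamma_O * (gamma_O + 2.0 * gamma_I))
print(loop, I_bar)
print(second, expected_second)
```

The substitution $w=\tan\theta$ maps the real line onto a finite interval and turns both integrands into smooth periodic functions of $\theta$ , for which the midpoint rule converges very quickly.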

Appendix B: Action for enzymatic push-pull network:

We give the form of the action here for the enzymatic push-pull network for which the chemical reaction scheme is given in Eq.(III). Despite the complexity, the action can be manipulated using Mathematica in order to obtain general expression for the error.

[TABLE]

Bibliography

(1) A. Goldbeter and D. E. Koshland. An amplified sensitivity arising from covalent modification in biological systems. Proc. Natl. Acad. Sci., 78:6840, 1981.
(2) M. Thattai and A. van Oudenaarden. Intrinsic noise in gene regulatory networks. Proc. Natl. Acad. Sci., 98(15):8614–8619, 2001.
(3) M. Thattai and A. van Oudenaarden. Attenuation of noise in ultra sensitive signaling cascades. Biophys. J., 82:2943–2950, 2002.
(4) A. Eldar and M. B. Elowitz. Functional roles for noise in genetic circuits. Nature, 467:167–173, 2010.
(5) A. Raj and A. van Oudenaarden. Nature, nurture, or chance: Stochastic gene expression and its consequences. Cell, 135:216–226, 2008.
(6) N. Maheshri and E. K. O’Shea. Living with noisy genes: How cells function reliably with inherent variability in gene expression. Ann. Rev. Biophys. Biomol. Struct., 36:413–434, 2007.
(7) C. G. Bowsher, M. Voliotis, and P. S. Swain. The fidelity of dynamic signaling by noisy biomolecular networks. PLOS Computational Biology, 9:e1002965, 2013.
(8) L. Cai, C. K. Dalal, and M. B. Elowitz. Frequency modulated nuclear localization bursts coordinate gene regulation. Nature, 455:485, 2008.
