Dynamic metabolic resource allocation based on the maximum entropy principle

David S. Tourigny

arXiv:1906.03919·q-bio.MN·July 8, 2020

TL;DR

This paper develops a dynamic metabolic model based on maximum entropy principles, unifying existing approaches and explaining cellular strategies under environmental uncertainty, with practical implications for metabolic engineering.

Contribution

It introduces a novel framework combining maximum entropy with optimal control for dynamic metabolic modeling, extending previous models to account for uncertainty and heterogeneity.

Findings

01. Describes bet-hedging strategies in cell populations.

02. Models resource allocation and reserve accumulation.

03. Aligns with observed yeast growth behaviors.

Abstract

Organisms have evolved a variety of mechanisms to cope with the unpredictability of environmental conditions, and yet mainstream models of metabolic regulation are typically based on strict optimality principles that do not account for uncertainty. This paper introduces a dynamic metabolic modelling framework that is a synthesis of recent ideas on resource allocation and the powerful optimal control formulation of Ramkrishna and colleagues. In particular, their work is extended based on the hypothesis that cellular resources are allocated among elementary flux modes according to the principle of maximum entropy. These concepts both generalise and unify prior approaches to dynamic metabolic modelling by establishing a smooth interpolation between dynamic flux balance analysis and dynamic metabolic models without regulation. The resulting theory is successful in describing ‘bet-hedging’…

Tables

Table 1: Values for parameters used in all simulations. Initial conditions and values for $D$, $k_{L}a$, $G_{0}$, and $\sigma$ are reported in the legend of Figure 4.

Parameter	Value
$V_{1}^{max}, V_{2}^{max}, V_{3}^{max}, V_{4}^{max}, V_{5}^{max}$	$1.0$ $\mbox{h}^{-1}$
$V_{6}^{max}$	$2.5$ $\mbox{h}^{-1}$
$c_{1}$	$0.02$ $\mbox{g}\cdot\mbox{g}^{-1}$
$c_{3}$	$0.34$ $\mbox{g}\cdot\mbox{g}^{-1}$
$K_{1}, K_{2}, K_{3}, K_{6}$	$0.01$ $\mbox{g}\cdot\mbox{L}^{-1}$
$K_{4}, K_{5}$	$0.01$ $\mbox{g}\cdot\mbox{L}^{-1}\cdot\mbox{g}^{-1}\cdot\mbox{L}$
$K_{O,2}, K_{O,3}, K_{O,5}$	$0.001$ $\mbox{g}\cdot\mbox{L}^{-1}$
$O^{*}$	$0.015$ $\mbox{g}\cdot\mbox{L}^{-1}$

Equations

$$\begin{aligned}
\frac{d}{dt}\mathbf{m}_{ex} &= \mathbf{S}_{ex}\mathbf{v}x,\\
\frac{d}{dt}\mathbf{m}_{in} &= \mathbf{S}_{in}\mathbf{v}-\mu\mathbf{m}_{in},\\
\frac{d}{dt}x &= \mu x,\qquad \mu=\mathbf{c}^{T}\mathbf{v}.
\end{aligned}\tag{1}$$

$$\sum_{i=1}^{N}e_{i}\leq 1,\tag{2}$$

$$\max\; J=\Phi_{t=t_{f}}(\mathbf{m}_{ex},\mathbf{m}_{in},x)+\int_{t_{0}}^{t_{f}}L(\mathbf{m}_{ex},\mathbf{m}_{in},x,\mathbf{e})\,dt\quad\mbox{s.t. (1) and (2)}\tag{3}$$

$$\begin{aligned}
\frac{d}{dt}\mathbf{m}_{ex} &= \mathbf{S}_{ex}\mathbf{v}x,\\
\mathbf{0} &= \mathbf{S}_{in}\mathbf{v},\\
\frac{d}{dt}x &= \mu x,\qquad \mu=\mathbf{c}^{T}\mathbf{v}.
\end{aligned}\tag{4}$$

$$\max\; J^{red}\quad\mbox{s.t. (4) and (2)},\tag{5}$$

$$\mathbf{v}=\sum_{k=1}^{K}\lambda_{k}\mathbf{Z}^{k},\qquad \lambda_{k}\geq 0,\quad k=1,2,...,K.\tag{6}$$

$$e_{i}=\sum_{k=1}^{K}\lambda_{k}\frac{Z^{k}_{i}}{f_{i}(\mathbf{m})},\qquad i=1,2,...,N$$

$$1\geq\sum_{k=1}^{K}\lambda_{k}\sum_{i=1}^{N}\frac{Z^{k}_{i}}{f_{i}(\mathbf{m})}\equiv\sum_{k=1}^{K}u_{k},\qquad u_{k}\geq 0,\quad k=1,2,...,K\tag{7}$$

$$r_{k}(\mathbf{m})=\left(\sum_{i=1}^{N}\frac{Z^{k}_{i}}{f_{i}(\mathbf{m})}\right)^{-1}\tag{8}$$

$$\begin{aligned}
\frac{d}{dt}\mathbf{m}_{ex} &= x\sum_{k=1}^{K}r_{k}(\mathbf{m}_{ex})\mathbf{S}_{ex}\mathbf{Z}^{k}u_{k},\\
\frac{d}{dt}x &= x\sum_{k=1}^{K}r_{k}(\mathbf{m}_{ex})\mathbf{c}^{T}\mathbf{Z}^{k}u_{k}
\end{aligned}\tag{9}$$

$$\max\; J^{red}\quad\mbox{s.t. (9) and (7)}.\tag{10}$$

$$\frac{dG_{ex}}{dt},\quad \frac{dO}{dt},\quad \frac{dP_{1}}{dt},\quad 0,\quad 0,\quad \frac{dx}{dt}$$

$$\mathbf{Z}^{1}=(1,1,2,0,0)^{T},\quad \mathbf{Z}^{2}=(0,0,0,1,1)^{T},\quad \mathbf{Z}^{3}=(1,1,0,2,0)^{T},$$

$$r_{1}(G_{ex})=\left(\frac{1}{f_{0}(G_{ex})}+\frac{1}{f_{1}(G_{in}^{*})}+\frac{2}{f_{2}(P^{*})}\right)^{-1}$$

$$r_{2}(O,P_{1})=\left(\frac{1}{f_{3}(O,P^{*})}+\frac{1}{f_{4}(P_{1})}\right)^{-1}$$

$$r_{3}(G_{ex},O)=\left(\frac{1}{f_{0}(G_{ex})}+\frac{1}{f_{1}(G_{in}^{*})}+\frac{2}{f_{3}(O,P^{*})}\right)^{-1}$$

$$\frac{dG_{ex}}{dt},\quad \frac{dO}{dt},\quad \frac{dP_{1}}{dt},\quad \frac{dx}{dt}$$

$$\frac{d}{d\tau}\Delta\mathbf{X}=\mathbf{F}(\mathbf{X}(t),\mathbf{u}^{0})+\mathbf{A}\Delta\mathbf{X}+\mathbf{B}\Delta\mathbf{u}$$

$$\mathbf{A}=\frac{\partial\mathbf{F}}{\partial\mathbf{X}}(\mathbf{X}(t),\mathbf{u}^{0}),\qquad \mathbf{B}=\frac{\partial\mathbf{F}}{\partial\mathbf{u}}(\mathbf{X}(t),\mathbf{u}^{0})$$

$$\Delta J=\mathbf{q}^{T}\Delta\mathbf{X}(t+\Delta t)+\sigma\int_{t}^{t+\Delta t}H(\mathbf{u})\,d\tau,$$

$$\Delta J=J^{red}(t+\Delta t)-J^{red}(t),\qquad \mathbf{q}=\frac{\partial\phi}{\partial\mathbf{X}}(\mathbf{X}(t)),$$

$$H(\mathbf{u})=-\sum_{k=1}^{K}u_{k}\log(u_{k}).$$

$$\max\;\Delta J\quad\mbox{s.t. the linearised system and}\quad\sum_{k=1}^{K}u_{k}=1,\quad u_{k}\geq 0,\quad k=1,2,...,K$$

$$u_{k}(t)=\frac{1}{Q}\exp\left(\frac{1}{\sigma}\mathbf{q}^{T}\mathbf{e}^{\mathbf{A}\Delta t}\mathbf{B}^{k}\right).\tag{15}$$

$$Q=\sum_{k=1}^{K}\exp\left(\frac{1}{\sigma}\mathbf{q}^{T}\mathbf{e}^{\mathbf{A}\Delta t}\mathbf{B}^{k}\right).$$

$$R^{k}_{\Delta t}=\mathbf{q}^{T}\mathbf{e}^{\mathbf{A}\Delta t}\mathbf{B}^{k}\tag{16}$$

Full text

Dynamic metabolic resource allocation based on the maximum entropy principle

David S. Tourigny, Columbia University Irving Medical Center

630 West 168th Street, New York, NY 10032 USA

Abstract

Organisms have evolved a variety of mechanisms to cope with the unpredictability of environmental conditions, and yet mainstream models of metabolic regulation are typically based on strict optimality principles that do not account for uncertainty. This paper introduces a dynamic metabolic modelling framework that is a synthesis of recent ideas on resource allocation and the powerful optimal control formulation of Ramkrishna and colleagues. In particular, their work is extended based on the hypothesis that cellular resources are allocated among elementary flux modes according to the principle of maximum entropy. These concepts both generalise and unify prior approaches to dynamic metabolic modelling by establishing a smooth interpolation between dynamic flux balance analysis and dynamic metabolic models without regulation. The resulting theory is successful in describing ‘bet-hedging’ strategies employed by cell populations dealing with uncertainty in a fluctuating environment, including heterogeneous resource investment, accumulation of reserves in growth-limiting conditions, and the observed behaviour of yeast growing in batch and continuous cultures. The maximum entropy principle is also shown to yield an optimal control law consistent with partitioning resources between elementary flux mode families, which has important practical implications for model reduction, selection, and simulation.

1 Introduction

Dynamic models of metabolism have been introduced as extensions to static, steady state modelling techniques such as flux balance analysis (FBA) [1, 2] and elementary flux mode analysis [3] in order to describe adaptation of cellular activity to changes in the environment. Established examples include dynamic FBA (DFBA) [4], macroscopic bioreaction models [5, 6], and cybernetic theory based on the optimal control framework by Young and Ramkrishna [7, 8]. Both DFBA and cybernetic theory incorporate regulation of flux across a metabolic reaction network based on some optimality criteria, whereas macroscopic bioreaction models do not and are therefore considered unregulated. Young and Ramkrishna [7, 8] posit that regulatory decisions take the form of a constrained optimisation problem, which must be solved to optimally distribute limited resources among pathways in the network. More recently, various related extensions of DFBA based on resource allocation have been introduced to accommodate the limited capacity for gene expression into dynamic models of metabolism (e.g. [9, 10, 11]). The concept of resource allocation has also been considered in the static case [12, 13, 14, 15, 16], where it is suggested that resource constraints arise due to finiteness of the total cellular proteome and the fraction that corresponds to metabolic enzymes. In [13, 14], it was shown that the FBA solution to the resource allocation problem is to allocate the entirety of resource exclusively to the metabolic pathway maximising the cellular objective.

From a strategic perspective, cell populations may instead prefer to spread resource among multiple metabolic pathways in order to deal with uncertainty in a fluctuating environment, which could explain the heterogeneity in metabolic pathway use observed experimentally [17, 18, 19, 20, 21]. Such ‘bet-hedging’ arguments are akin to various economic theories [22, 23, 24] that posit multiple investments are beneficial to individuals subjected to uncertainty, or that individuals make exclusive investments, but in receipt of slightly different information. These theories are related to the principle of maximum entropy [25, 26], because from an information-theoretic standpoint the resource distribution that best represents the current state of knowledge is the one with largest entropy: entropy uniquely satisfies the accepted axioms for an uncertainty measure (up to a constant factor) [27], and therefore the maximum entropy distribution consistent with known constraints is uniquely determined as the one that expresses maximum uncertainty with respect to everything else. In biology, this mathematical justification for maximum entropy as an investment strategy that best-accommodates uncertainty forms the basis of various ecological theories (see [28] for a review), and also an interpretation of stem cell multi-potency [29]. Analogously, in [30] it was demonstrated that phenotype-switching strategies that are adjusted to the entropy of environmental fluctuations can outperform those that are not. The maximum entropy principle has also recently been applied to static metabolic modelling in various scenarios, including: extensions of FBA to include population heterogeneity [31, 32], experimental decomposition of fluxes using elementary mode analysis [33, 34], and to put forward the suggestion that organisms evolve toward a state of maximum physical entropy [35, 36]. 
Hitherto, there has been no attempt to incorporate the maximum entropy principle into dynamic models of metabolism with resource allocation.

This paper builds upon the work of Young and Ramkrishna [7, 8] with the purpose of introducing a dynamic model of metabolic resource allocation based on the maximum entropy principle. Although a similar optimal control framework is employed and metabolic network decomposition is also performed using elementary flux modes (EFMs) [3], optimality criteria for resource allocation are instead stated in terms of maximum entropy so as to accommodate environmental uncertainty, which produces an original control law. Moreover, the resulting theory is not cybernetic in the sense that there is no reliance on multiple control laws, nor are cybernetic enzymes introduced as auxiliary dynamical variables. The maximum entropy framework unifies DFBA [4] and unregulated macroscopic bioreaction models [5, 6] as two limiting extremes of the general theory. A further consequence for dynamic resource allocation is that no assumption beyond maximisation of total catalytic biomass is necessary to describe accumulation of cellular reserve compounds in growth-limiting environments [37, 38]. The maximum entropy control also turns out to be consistent with model reduction using EFM families [39, 40, 41], which is extremely useful from a modelling point of view because EFM enumeration can result in a combinatorial explosion as metabolic networks grow in size [42].

The remainder of this paper is organised as follows: Sections 2 and 3 introduce the dynamic metabolic model and maximum entropy control, which are then extended to include metabolite yields in Section 4. Section 5 describes the dynamic maximum entropy framework as applied to model reduction using EFM families, and Section 6 presents a specific application of the theory to yeast metabolism. This application is the culmination of a number of working examples found at the end of each section. Additional mathematical details expanding on some parts of the main text can be found in the Appendix.

2 Dynamic model of metabolism

The following dynamical system is considered as a model for metabolism in batch culture

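$$\begin{aligned}
\frac{d}{dt}\mathbf{m}_{ex} &= \mathbf{S}_{ex}\mathbf{v}x,\\
\frac{d}{dt}\mathbf{m}_{in} &= \mathbf{S}_{in}\mathbf{v}-\mu\mathbf{m}_{in},\\
\frac{d}{dt}x &= \mu x,\qquad \mu=\mathbf{c}^{T}\mathbf{v}.
\end{aligned}\tag{1}$$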

Here $\mathbf{m}_{ex}$ ( $\mbox{g}\cdot\mbox{L}^{-1}$ ), $\mathbf{m}_{in}$ ( $\mbox{g}\cdot\mbox{L}^{-1}\cdot\mbox{gDW}^{-1}\cdot\mbox{L}$ ; gDW, grams dry weight) are vectors of extra- and intracellular metabolites, respectively, and $\mathbf{S}_{ex}$ , $\mathbf{S}_{in}$ the corresponding portions of the stoichiometric reaction matrix $\mathbf{S}$ [1, 2, 3]. The scalar variable $x$ ( $\mbox{gDW}\cdot\mbox{L}^{-1}$ ) represents the concentration of total catalytic biomass responsible for catalysing reactions involved in its own production and interconversion of metabolites, and $\mu$ ( $\mbox{h}^{-1}$ ) is the rate of its accumulation (i.e., growth rate) formed as the inner product of the non-negative, $N$ -dimensional flux vector $\mathbf{v}=(v_{1},v_{2},...,v_{N})^{T}$ ( $\mbox{g}\cdot\mbox{gDW}^{-1}\cdot\mbox{h}^{-1}$ ) with the constant coefficient vector $\mathbf{c}=(c_{1},c_{2},...,c_{N})^{T}$ ( $\mbox{gDW}\cdot\mbox{g}^{-1}$ ). When the reactions are irreversible (which can be assumed after splitting each reversible reaction into two irreversible ones), the reaction fluxes $v_{i}\geq 0$ can be decomposed as $v_{i}=e_{i}f_{i}(\mathbf{m})$ with $\mathbf{m}=(\mathbf{m}_{ex},\mathbf{m}_{in})^{T}$ , where $e_{i}$ is the relative concentration of the enzyme catalysing the $i$ th reaction and $f_{i}(\mathbf{m})$ is the (non-negative) ‘saturation function’ of that enzyme, which includes the thermodynamic driving force, (allosteric) activation or inhibition, and other enzyme-specific effects [16]. In formulation of this model the relative enzyme concentrations $e_{i}$ are understood to be control variables, whose values are determined according to control laws in order to satisfy some objective, as are additional arguments of $f_{i}$ (omitted for notational simplicity) responsible for regulatory effects not directly attributable to the relative level of enzyme $i$ . 
More precisely, while $e_{i}$ corresponds to relative levels of the $i$th enzyme, its activity is dependent on substrate availability and additional regulatory features, e.g., covalent modification, all encapsulated in the form of a single function $f_{i}(\mathbf{m})$. It is assumed these two types of control are enacted on distinct time scales so that the $e_{i}$ are considered slow control variables while the remaining control variables contained within $f_{i}(\mathbf{m})$ are considered fast. For the vast majority of biological models this is a realistic assumption, i.e. the process of transcription and translation of an enzyme takes considerably longer than its post-translational regulation (e.g., via phosphorylation).

Following [43, 16], the control variables corresponding to enzyme levels satisfy the constraint

$$\sum_{i=1}^{N}e_{i}\leq 1,\tag{2}$$

which in this case could correspond to a limited capacity for protein synthesis on ribosomes. In [16], expressions like (2) come with a set of weights or ‘costs’, one for each $e_{i}$ , but here these are absorbed into the $f_{i}(\mathbf{m})$ (although this is only possible for the case of a single constraint). Regulation of the $e_{i}$ and remaining fast control variables appearing in (1) is assumed to occur such that some metabolic performance index $J$ is maximised, which combined with constraint (2) introduces the general optimal control problem for resource allocation over the interval $[t_{0},t_{f}]$ :

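$$\max\; J=\Phi_{t=t_{f}}(\mathbf{m}_{ex},\mathbf{m}_{in},x)+\int_{t_{0}}^{t_{f}}L(\mathbf{m}_{ex},\mathbf{m}_{in},x,\mathbf{e})\,dt\quad\mbox{s.t. (1) and (2)},\tag{3}$$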

where $\Phi_{t=t_{f}}$ is a terminal objective function and $L$ an intermediate objective function evaluated at $\mathbf{e}=(e_{1},e_{2},...,e_{N})^{T}$ . Initial conditions for the dynamic variables in (1) may also be given. From a modelling perspective however, it is conventionally not the case that the full system (1) is considered due to the immense number of unmeasurable parameters necessary to provide an accurate dynamical description of intracellular metabolism. Common practice is therefore to invoke the quasi-steady state assumption (QSSA) on intracellular metabolism [1, 3], which amounts to the assumption that metabolic transients are typically rapid compared to cellular growth rates and changes in the environment. Validity for the QSSA is obtained by comparing the time scale of metabolic processes (fast) to those of transcriptional and translational regulation (slow) [44]. Assuming the dilution term $\mu\mathbf{m}_{in}$ is negligible for intracellular metabolites and invoking the QSSA reduces (1) to a lower-dimensional system of the form

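$$\begin{aligned}
\frac{d}{dt}\mathbf{m}_{ex} &= \mathbf{S}_{ex}\mathbf{v}x,\\
\mathbf{0} &= \mathbf{S}_{in}\mathbf{v},\\
\frac{d}{dt}x &= \mu x,\qquad \mu=\mathbf{c}^{T}\mathbf{v}.
\end{aligned}\tag{4}$$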

There are two critical issues that should be called into question at this stage. First, the reduction of (1) to (4) based on the QSSA is formal, but it can be rigorously proven that, for fixed relative enzyme concentrations $e_{i}$ , trajectories of the ordinary differential equation (4) are a good approximation for those of (1) provided the conditions of Tikhonov’s theorem are met [45]. These conditions are almost always impossible to validate however, and so typically one needs to assume existence and stability of a quasi-steady state based on biophysical insight. See [10] for a discussion of this point. Secondly, and this is not discussed in [10], it is natural to approximate solutions to the optimal control problem (3) using solutions to the reduced problem

$$\max\; J^{red}\quad\mbox{s.t. (4) and (2)},\tag{5}$$

where it is understood that $J^{red}$ is the metabolic performance index evaluated on trajectories of the reduced system (4). However, to establish validity of this approximation one must appeal to the theory of singularly perturbed optimal control problems [46] and prove that Pontryagin’s maximum conditions for the reduced problem (5) are equivalent to those obtained by invoking the QSSA on Pontryagin’s maximum conditions for the full control problem (3). Unfortunately, establishing equivalence of these two reduction methods remains an open problem for most nonlinear systems. That this equivalence is approximately satisfied should therefore be highlighted as an additional biological assumption for optimal control problems such as those considered here and in [10]. The assumption of this equivalence will be referred to as the quasi-reduction equivalent assumption (QREA).

Proceeding under the condition that both the QSSA and QREA are valid, a complete set of vectors $\{\mathbf{Z}^{k}\}_{k=1,2,...,K}$ representing EFMs [3] or extremal rays for the flux cone $FC=\{\mathbf{v}:\mathbf{S}_{in}\mathbf{v}=0,v_{i}\geq 0\quad\forall i\}$ can be used to express any $\mathbf{v}\in FC$ as a conical combination

$$\mathbf{v}=\sum_{k=1}^{K}\lambda_{k}\mathbf{Z}^{k},\qquad \lambda_{k}\geq 0,\quad k=1,2,...,K.\tag{6}$$

The $\mathbf{Z}^{k}$ are defined up to some multiplicative constant and the decomposition (6) represents any $\mathbf{v}$ satisfying constraints imposed by the intracellular component of the stoichiometric matrix [3]. As reviewed in [47], there can also be thermodynamic constraints on $\mathbf{v}$ , and restricting the set of EFMs to those that satisfy these additional constraints has recently been achieved in [48]. The summation in (6) may therefore be restricted to a subset of thermodynamically-feasible EFMs, because any thermodynamically-feasible $\mathbf{v}$ can be expressed solely in terms of thermodynamically-feasible EFMs [49]. The converse statement however, that any $\mathbf{v}$ expressed as a linear combination of thermodynamically-feasible EFMs also satisfies the thermodynamic constraints, is not necessarily true, and so this puts a restriction on the interpretation of EFM-based dynamic modelling approaches [5, 6, 39, 40, 41, 50]. From the decomposition $v_{i}=e_{i}f_{i}(\mathbf{m})$ one obtains (provided $f_{i}(\mathbf{m})\neq 0$ )

$$e_{i}=\sum_{k=1}^{K}\lambda_{k}\frac{Z^{k}_{i}}{f_{i}(\mathbf{m})},\qquad i=1,2,...,N,$$

where $Z^{k}_{i}$ is the $i$ th element of the vector representing the $k$ th EFM. Therefore constraint (2) becomes

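$$1\geq\sum_{k=1}^{K}\lambda_{k}\sum_{i=1}^{N}\frac{Z^{k}_{i}}{f_{i}(\mathbf{m})}\equiv\sum_{k=1}^{K}u_{k},\qquad u_{k}\geq 0,\quad k=1,2,...,K,\tag{7}$$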

where the new slow control variables $u_{k}=\lambda_{k}/r_{k}$ have been introduced along with

$$r_{k}(\mathbf{m})=\left(\sum_{i=1}^{N}\frac{Z^{k}_{i}}{f_{i}(\mathbf{m})}\right)^{-1}\tag{8}$$

as the ‘composite’ flux through the $k$ th EFM (compare with [8] in the dynamic and [13, 14] in the static case).

At this point a choice needs to be made for the way that the composite fluxes are to be represented in the reduced system. This is because the QSSA applied to the optimal control problem (3) using the decomposition (6) to express $\mathbf{v}$ in terms of EFMs does not follow Tikhonov’s theorem for ordinary differential equations, which is rather based on determining the slow manifold for $\mathbf{m}_{in}$ in terms of $\mathbf{m}_{ex}$. Common practice is to approximate the functional form of composite fluxes using (e.g., Michaelis-Menten) kinetic rate laws that depend on slow dynamic variables $\mathbf{m}_{ex}$ alone [5, 6, 39, 40, 41, 50]. This choice limits the total number of parameters in the reduced system, but comes with a requirement to select a common normalisation for all EFMs because otherwise the system will not remain invariant to EFM scaling. Such an approximation is made in Section 5 and the application to yeast metabolism presented in Section 6, while for the general discussion it will be assumed that composite fluxes can be well-defined using expression (8) with the fast variables fixed at some constant value, $\mathbf{m}_{in}^{*}$, independent of the $\mathbf{m}_{ex}$. Substitution for $\mathbf{v}$ in the reduced system (4) yields

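$$\begin{aligned}
\frac{d}{dt}\mathbf{m}_{ex} &= x\sum_{k=1}^{K}r_{k}(\mathbf{m}_{ex})\mathbf{S}_{ex}\mathbf{Z}^{k}u_{k},\\
\frac{d}{dt}x &= x\sum_{k=1}^{K}r_{k}(\mathbf{m}_{ex})\mathbf{c}^{T}\mathbf{Z}^{k}u_{k},
\end{aligned}\tag{9}$$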

where only explicit dependence of the $r_{k}$ on $\mathbf{m}_{ex}$ has been included because the $\mathbf{m}_{in}$ are now assumed constant by the QSSA as stated above. Although the vectors representing EFMs are specified only up to a multiplicative constant, the system (9) remains invariant to their re-scaling and is therefore well-defined. Under the QREA one arrives at the reduced optimal control problem

$$\max\; J^{red}\quad\mbox{s.t. (9) and (7)}.\tag{10}$$

The above form of the dynamic resource allocation problem provides a natural interpretation for each control variable $u_{k}$ as the fraction of total catalytic biomass concentration $x$ that is allocated to the $k$ th EFM. The next section introduces the control law for determining the optimal fraction of this resource.
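As a minimal sketch of how expression (8) and the reduced system (9) fit together, the Python functions below evaluate a composite flux from an EFM vector and a list of saturation-function values, and then the right-hand side of (9). The function names, the argument layout, and the convention that a composite flux is zero when a required saturation function vanishes are illustrative assumptions, not part of the original formulation.

```python
from typing import List, Sequence, Tuple


def composite_flux(Z_k: Sequence[float], f_values: Sequence[float]) -> float:
    """Composite flux r_k = (sum_i Z_i^k / f_i)^(-1), as in expression (8).

    ASSUMPTION: reactions with Z_i^k = 0 are skipped, and r_k is taken to be 0
    when a reaction used by the EFM has f_i = 0 (the limit of infinite cost).
    """
    total = 0.0
    for z, f in zip(Z_k, f_values):
        if z == 0:
            continue
        if f <= 0:
            return 0.0
        total += z / f
    return 1.0 / total


def reduced_rhs(
    x: float,
    u: Sequence[float],
    r: Sequence[float],
    S_ex: Sequence[Sequence[float]],
    Z: Sequence[Sequence[float]],
    c: Sequence[float],
) -> Tuple[List[float], float]:
    """Right-hand side of system (9): returns (d m_ex / dt, d x / dt)."""
    dm_ex = [0.0] * len(S_ex)
    dx = 0.0
    for u_k, r_k, Z_k in zip(u, r, Z):
        weight = x * r_k * u_k  # biomass fraction devoted to EFM k times its flux
        for j, row in enumerate(S_ex):
            dm_ex[j] += weight * sum(s * z for s, z in zip(row, Z_k))
        dx += weight * sum(c_i * z for c_i, z in zip(c, Z_k))
    return dm_ex, dx
```

Because $r_{k}$ scales inversely with $\mathbf{Z}^{k}$, multiplying an EFM vector by a positive constant leaves the output of `reduced_rhs` unchanged, which is the scaling invariance noted for system (9).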

Example 1.

Consider the simplified metabolic network in Figure 1(a) as a model for central carbon metabolism, also chosen in [51].

Concentrations of extracellular metabolites glucose ( $G_{ex}$ ), oxygen ( $O$ ), product 1 ( $P_{1}$ ), product 2 ( $P_{2}$ ), and total catalytic biomass ( $x$ ) are slow variables, while concentrations of intracellular glucose ( $G_{in}$ ) and pyruvate ( $P$ ) are fast. In this model only reactions with fluxes $v_{1}$ and $v_{3}$ are assumed to contribute directly to the growth rate, such that $c_{0}=c_{2}=c_{4}=0$ with $c_{3}>c_{1}>0$ . Invoking the QSSA on fast intracellular metabolite concentrations, stoichiometric matrices $\mathbf{S}_{ex}$ and $\mathbf{S}_{in}$ give rise to the reduced dynamical system

[TABLE]

A complete set of three EFMs (represented graphically in Figure 1(b)) is provided by the vectors

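$$\mathbf{Z}^{1}=(1,1,2,0,0)^{T},\quad \mathbf{Z}^{2}=(0,0,0,1,1)^{T},\quad \mathbf{Z}^{3}=(1,1,0,2,0)^{T},$$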

and these have respective composite fluxes

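$$\begin{aligned}
r_{1}(G_{ex}) &= \left(\frac{1}{f_{0}(G_{ex})}+\frac{1}{f_{1}(G_{in}^{*})}+\frac{2}{f_{2}(P^{*})}\right)^{-1},\\
r_{2}(O,P_{1}) &= \left(\frac{1}{f_{3}(O,P^{*})}+\frac{1}{f_{4}(P_{1})}\right)^{-1},\\
r_{3}(G_{ex},O) &= \left(\frac{1}{f_{0}(G_{ex})}+\frac{1}{f_{1}(G_{in}^{*})}+\frac{2}{f_{3}(O,P^{*})}\right)^{-1},
\end{aligned}$$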

where $G^{*}_{in}$ and $P^{*}$ denote the fixed steady state concentrations of intracellular glucose and pyruvate, respectively, and $f_{i}$ is the saturation function of the $i$ th enzyme. Expressed in terms of EFMs and slow control variables $u_{1},u_{2},u_{3}$ , the reduced dynamical system takes the form

[TABLE]

∎
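The composite fluxes of Example 1 can be explored numerically with the hedged sketch below. The EFM vectors are those of the example, with entries ordered $v_{0},...,v_{4}$. Everything else is an assumption made for illustration: the intracellular stoichiometry `S_IN` is chosen only so that the three EFMs balance, the saturation functions are taken to be Michaelis-Menten, $f_{3}$ is taken to be a product of pyruvate and oxygen terms, and all parameter values are placeholders.

```python
from typing import Sequence, Tuple

# ASSUMPTION: intracellular stoichiometry (rows: G_in, P; columns: v0..v4),
# chosen so that the three EFM vectors balance. It is not given in the text.
S_IN = [
    [1, -1, 0, 0, 0],   # G_in: produced by v0, consumed by v1
    [0, 2, -1, -1, 1],  # P: two per v1, consumed by v2 and v3, produced by v4
]

# EFM vectors Z^1, Z^2, Z^3 of Example 1 (entries ordered v0..v4).
EFMS = [
    [1, 1, 2, 0, 0],
    [0, 0, 0, 1, 1],
    [1, 1, 0, 2, 0],
]


def in_flux_cone(S_in: Sequence[Sequence[float]], v: Sequence[float]) -> bool:
    """True if v is componentwise non-negative and S_in v = 0."""
    balanced = all(sum(s * x for s, x in zip(row, v)) == 0 for row in S_in)
    return balanced and all(x >= 0 for x in v)


def saturation(vmax: float, K: float, s: float) -> float:
    """ASSUMPTION: Michaelis-Menten form for every saturation function."""
    return vmax * s / (K + s)


def inverse_cost(terms: Sequence[Tuple[float, float]]) -> float:
    """(sum of weight / f)^(-1), taken as 0 when any required f vanishes."""
    if any(f <= 0 for _, f in terms):
        return 0.0
    return 1.0 / sum(w / f for w, f in terms)


def composite_fluxes(G_ex, O, P1, G_in_star=1.0, P_star=1.0,
                     vmax=1.0, K=0.01, K_O=0.001):
    """Return (r1, r2, r3) for Example 1 under the assumptions above."""
    f0 = saturation(vmax, K, G_ex)
    f1 = saturation(vmax, K, G_in_star)
    f2 = saturation(vmax, K, P_star)
    # ASSUMPTION: f3 is a product of pyruvate and oxygen saturation terms.
    f3 = saturation(vmax, K, P_star) * O / (K_O + O)
    f4 = saturation(vmax, K, P1)
    r1 = inverse_cost([(1, f0), (1, f1), (2, f2)])
    r2 = inverse_cost([(1, f3), (1, f4)])
    r3 = inverse_cost([(1, f0), (1, f1), (2, f3)])
    return r1, r2, r3
```

Under these assumptions, setting `G_ex = 0` switches off the two glucose-consuming EFMs while leaving $r_{2}$ positive, and setting `O = 0` leaves only the first EFM active, which mirrors the argument lists $r_{1}(G_{ex})$, $r_{2}(O,P_{1})$, and $r_{3}(G_{ex},O)$.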

3 Maximum entropy control

This section considers the nature of $J^{red}$ in the reduced optimal control problem (10). Related to the separation of timescales for metabolite concentrations in the QSSA arises a similar separation of timescales for control variables. Fast regulatory control variables appearing in $f_{i}$ are encapsulated within the $r_{k}(\mathbf{m}_{ex})$ , whereas the reduced system (9) has a linear dependence on slow control variables $u_{k}$ . In what follows, it will be assumed that fast control variables are selected instantaneously (relative to the QSSA) to yield optimal values of $r_{k}(\mathbf{m}_{ex})$ . On what basis optimality is defined for fast control variables is not of concern, but should derive from biologically reasonable principles. For example, instantaneous maximisation of the composite flux through each EFM amounts to a QSSA-based approximation of the optimal control policy in [8] where no separation of timescales was assumed. In this approach, a local objective for each EFM is used to determine optimal values for fast control variables that maximise the composite flux of each EFM individually, and subsequently the slower control variables are chosen to maximise a global objective combining all EFMs. Regardless of the instantaneous policy for selecting fast control variables, the optimal control problem (10) is stated so as to determine the $u_{k}$ assuming the $f_{i}$ in $r_{k}(\mathbf{m}_{ex})$ are given.

Combining a metabolic performance index $J^{red}$ that is linear in the $u_{k}$ with the constraint (7) would result in an optimal control law that allocates the entire fraction of resource exclusively to the EFM with highest return-on-investment [13, 14]. Such a control is the so-called FBA or Bang-Bang policy, which for a variety of evolutionary reasons does not appear to be the most robust nor economically efficient resource allocation strategy in the face of environmental fluctuations [7, 18, 19, 20, 21, 31, 32]. This motivates the revised concept [7, 11] that regulatory decisions for the control variables $u_{k}$ should be made based on the projected system response over a (short) time interval of length $\Delta t$ . In this sense optimal choices for the $u_{k}$ are anticipatory of the effects that slower regulatory processes such as transcription and translation will have in the immediate future. Collecting dynamical and control variables into vectors $\mathbf{X}=(\mathbf{m}_{ex},x)^{T}$ and $\mathbf{u}=(u_{1},u_{2},...,u_{K})^{T}$ , respectively, and writing $\dot{\mathbf{X}}=\mathbf{F}(\mathbf{X},\mathbf{u})$ , the linearisation of (9) about the state $\mathbf{X}(t)$ and a reference control input $\mathbf{u}^{0}$ may be assumed a good approximation to the system response at time $t+\tau$ for $\tau\in[0,\Delta t]$ [7]. Linearisation yields

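$$\frac{d}{d\tau}\Delta\mathbf{X}=\mathbf{F}(\mathbf{X}(t),\mathbf{u}^{0})+\mathbf{A}\Delta\mathbf{X}+\mathbf{B}\Delta\mathbf{u}$$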

where

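$$\mathbf{A}=\frac{\partial\mathbf{F}}{\partial\mathbf{X}}(\mathbf{X}(t),\mathbf{u}^{0}),\qquad \mathbf{B}=\frac{\partial\mathbf{F}}{\partial\mathbf{u}}(\mathbf{X}(t),\mathbf{u}^{0})$$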

and $\Delta\mathbf{X}(\tau)=\mathbf{X}(t+\tau)-\mathbf{X}(t)$, $\Delta\mathbf{u}(\tau)=\mathbf{u}(t+\tau)-\mathbf{u}^{0}$. When linearising and considering the change in performance index, Young and Ramkrishna [7] augmented the accrued benefit derived during the planning window $[t,t+\Delta t]$ by a term quadratic in the $u_{k}$ representing the cost or penalty associated with resource allocation. This paper takes a different approach, which is to model the change in performance over the time interval as

$$\Delta J=\mathbf{q}^{T}\Delta\mathbf{X}(t+\Delta t)+\sigma\int_{t}^{t+\Delta t}H(\mathbf{u})\,d\tau,$$

where

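$$\Delta J=J^{red}(t+\Delta t)-J^{red}(t),\qquad \mathbf{q}=\frac{\partial\phi}{\partial\mathbf{X}}(\mathbf{X}(t)),$$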

and

$$H(\mathbf{u})=-\sum_{k=1}^{K}u_{k}\log(u_{k}).$$

Here the function $\phi(\mathbf{X})$ represents the metabolic objective of the system and $\sigma$ is a positive parameter that will be interpreted below.
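To make the role of the entropy term concrete, the short Python sketch below evaluates $H(\mathbf{u})=-\sum_{k}u_{k}\log(u_{k})$ for a vector of resource fractions. The function name and the convention $0\log 0=0$ (the usual continuous extension) are assumptions added for illustration.

```python
import math
from typing import Sequence


def entropy(u: Sequence[float]) -> float:
    """Shannon entropy H(u) = -sum_k u_k log(u_k), with 0 log 0 taken as 0."""
    return -sum(p * math.log(p) for p in u if p > 0)
```

For $K$ EFMs, the entropy is zero when all resource goes to a single EFM and attains its largest value, $\log K$, when resource is shared equally. For instance, `entropy([1.0, 0.0, 0.0])` is zero while `entropy([1/3, 1/3, 1/3])` equals `math.log(3)` up to rounding. A positive $\sigma$ therefore rewards allocations that are spread across EFMs.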

The above choice of $H(\mathbf{u})$ is based on using the maximum entropy principle as a guide for selecting control variables $u_{k}$ , which can be rationalised from several different perspectives: first, since the dynamical model (9) is stated in terms of total catalytic biomass of a population, maximum entropy has recently been proposed as an extension of FBA that is intended to capture heterogeneity of different allocation policies adopted by individuals within it [31, 32]. Sources of this heterogeneity include stochasticity in gene expression and phenotype-switching at the single-cell level [52], which can serve a functional purpose rather than simply reflecting noise tolerance, and provide a collective advantage to organisms living in fluctuating environments [17, 18, 19, 20, 21]. Second, by analogy with decision making problems in finance [22], ecology [28], and communication theory [53], distribution of resources according to the principle of maximum entropy is the best choice for maximising expected return-on-investment in the face of uncertainty. As described in the introduction, the maximum entropy principle mathematically captures this notion of bet-hedging because it yields a unique resource allocation strategy consistent with known constraints (e.g., expected return-on-investment given current environmental conditions) while capturing maximum uncertainty in everything else (e.g., future environmental fluctuations) [25, 26]. Indeed, this second point is intimately tied to the first because population heterogeneity is thought to be one way that cell populations have evolved to execute bet-hedging strategies [17, 18, 19, 54], where both the entropy of the environment [30] and gene expression profiles [29] are taken into consideration. 
Finally, flux decomposition using maximum entropy-weighted EFMs has already been suggested for experimental flux derivation [33, 34], or where there is a direct physical interpretation for entropy as that of a chemical reaction [35, 36]. The former approach uses the maximum entropy principle as it directly applies to model inference [25, 26], where uncertainty reflects incompleteness of experimental data and the best statistical model is the one most consistent with those observed. Correspondence of the information-theoretic maximum entropy principle considered here with the physicochemical maximum entropy principle in [35, 36] is beyond the scope of this paper, but forms part of a deeper relationship between information theory, statistical mechanics, and thermodynamics [25].

It is reasonable to assume that biological systems evolve under selection for maximal fitness by exploiting the capability to fully utilise their resource (the same assumption is made in [7]). This implies the total summation constraint in (7) is satisfied as an exact equality and, because of the remaining non-negativity constraints, the vector $\mathbf{u}$ of resource fractions now can be interpreted as a discrete probability distribution across the EFMs. Applying Pontryagin’s maximum principle to the optimal control problem

[TABLE]

and setting $\tau=0$ as explained in Appendix A, results in the following alternative to the optimal control provided in [7]:

[TABLE]

Here $\mathbf{B}^{k}$ denotes the $k$ th column of $\mathbf{B}$ , $\mathbf{e}^{\mathbf{A}\Delta t}$ is the matrix exponential of $\mathbf{A}\Delta t$ , and the normalisation factor $Q$ is the partition function

[TABLE]

As described by Jaynes in [25], the control (15) is the Boltzmann distribution with $\sigma$ taking the place of temperature and effective return-on-investment

[TABLE]

for the $k$ th EFM taking the place of energy. Use of the adjective ‘effective’ will become clear shortly. In the limit $\sigma\to 0$ , the control (15) collapses to the Bang-Bang/FBA policy [13, 14] where all resource is allocated to the EFM with the greatest effective return-on-investment (16) (although this does imply the $u_{k}$ can change rapidly, whereas formally they should be treated as slow control variables). Conversely, $u_{k}\to 1/K$ $(\forall k=1,2,...,K)$ as $\sigma$ grows so that resource is partitioned equally among all EFMs in the limit $\sigma\to\infty$ . This indifferent distribution of resource among EFMs is equivalent to the unregulated macroscopic bioreaction models of Provost and Bastin [5, 6]. Clearly neither extreme is necessarily an ideal representation of the optimal regulatory process, and therefore $\sigma>0$ is taken to be finite so that the resource is allocated amongst EFMs according to their effective return-on-investment (larger getting more). What proportional majority of resource is awarded to the EFM with greatest effective return-on-investment is determined by the precise value of $\sigma$ , which is considered to be a parameter fine-tuned over the course of evolution.
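The two limits described above can be illustrated with a minimal numerical sketch. It assumes that the control (15) takes the Boltzmann form $u_{k}=e^{\mathcal{R}^{k}/\sigma}/Q$; the function name `max_entropy_control` and the return-on-investment values are hypothetical and chosen only for illustration.

```python
import math

def max_entropy_control(returns, sigma):
    """Boltzmann-type weights u_k = exp(R_k / sigma) / Q over the EFMs."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # Shift by the maximum before exponentiating for numerical stability;
    # the shift cancels between the numerator and the partition function Q.
    r_max = max(returns)
    weights = [math.exp((r - r_max) / sigma) for r in returns]
    q = sum(weights)
    return [w / q for w in weights]

# Hypothetical effective return-on-investments for K = 3 EFMs.
returns = [0.2, 0.5, 1.0]

for sigma in (1e-3, 0.5, 1e3):
    u = max_entropy_control(returns, sigma)
    print(sigma, [round(u_k, 3) for u_k in u])
```

For very small $\sigma$ essentially all resource is assigned to the EFM with the largest return-on-investment, whereas for very large $\sigma$ the allocation approaches $1/K$ for every EFM.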

The general definition of effective return-on-investment (16) depends on a specific choice of metabolic objective that throughout the remainder of this paper is assumed to be maximisation of total catalytic biomass, i.e. $\phi(\mathbf{X})=x$ , which results in $\mathbf{q}=(\mathbf{0},1)^{T}$ . The vector $\mathbf{B}^{k}$ is obtained by evaluating the derivative of $\mathbf{F}$ with respect to $u_{k}$ at $\mathbf{X}(t)$ , and since $\mathbf{F}$ is linear in $u_{k}$ this choice of $\phi$ results in

[TABLE]

In the first instance it is assumed that only immediate consequences of the injected control actions need to be considered when evaluating $\mathbf{u}$ , and therefore $\Delta t=0$ (the next section will consider non-zero choices of $\Delta t$ that involve additional complexity due to the matrix exponential of $\mathbf{A}\Delta t$ ). By analogy with [7], when $\Delta t=0$ , the control law (15) will be termed the greedy maximum entropy control. This simplifying assumption, that the Jacobian matrix $\mathbf{A}$ does not appear in the effective return-on-investment (16), is mathematically equivalent to the biological statement that future changes in the environment are not taken into consideration when making regulatory decisions. As described in Section 4, higher order corrections to the effective return-on-investment could be accounted for by a biological mechanism that has evolved to anticipate such environmental changes, but with $\Delta t=0$ the greedy maximum entropy control serves to maximise expected return-on-investment given the current state of the environment but complete uncertainty about the future. Using the greedy maximum entropy control, the effective return-on-investment for the $k$ th EFM reduces to

[TABLE]

where notation $\mbox{R}_{0}^{k}(\mathbf{m}_{ex})$ has been introduced for the return-on-investment evaluated at zeroth order ( $\Delta t=0$ ). Multiplication of $\mbox{R}_{0}^{k}(\mathbf{m}_{ex})$ by $x$ gives the greedy effective return-on-investment $\mathcal{R}^{k}_{0}(\mathbf{m}_{ex})$ . Just as in the case of system (9), zeroth-order return-on-investment $\mbox{R}_{0}^{k}(\mathbf{m}_{ex})$ and the corresponding optimal control remain invariant to re-scaling of $\mathbf{Z}^{k}$ because this is cancelled by the same factor appearing in the composite flux $r_{k}$ (8). The greedy effective return-on-investment (18) for the $k$ th EFM is therefore proportional to a weighted harmonic mean of the $f_{i}(\mathbf{m}_{ex})$ multiplied by a weighted arithmetic mean of the $c_{i}$ . The weighting for the $k$ th EFM is provided by the $N$ components $Z^{k}_{i}$ and the conclusion is that the greatest proportion of resource is allocated to the EFM for which the product of these two means is the largest.

In contrast to the resource allocation rules obtained by Young and Ramkrishna [7], observe that the greedy maximum entropy control law (15) implies all EFMs, including those with zero or negative zeroth-order return-on-investment, will be allocated a non-zero fraction of resource provided $x/\sigma$ remains finite. Spreading of resource between multiple pathways is known to be optimal for dealing with uncertainty in a non-deterministic environment [18, 19, 20, 21, 30, 31, 32], and investing in each EFM is a bet-hedging strategy analogous to those in behavioural economics [23, 24] that captures remaining uncertainty when it is not possible to anticipate future environmental conditions. Equipped only with knowledge about the current environment, allocating a small fraction of resource (e.g. fraction of the proteome) to EFMs not contributing directly to growth is not considered wasteful because there is always a small probability that one of these pathways will have a benefit in the future [55]. Higher order corrections to return-on-investment will be described in Section 4, but without additional information the greedy maximum entropy control law spreads the remaining fraction of resource indiscriminately between EFMs with zero zeroth-order return-on-investment. The remaining resource fraction will tend to be very small when a majority of resource is heavily concentrated on EFMs having large return-on-investment (that are relatively more likely to be of benefit), combined with the appearance of total catalytic biomass $x$ as an overall scaling factor in (18). As $x$ increases, it plays an opposing role to $\sigma$ in the control law (15), meaning the distribution of resources amongst EFMs will become more heavily concentrated on those yielding the greatest return-on-investment (i.e., optimal resource allocation approaches the Bang-Bang/FBA policy as $x\to\infty$ with $\sigma$ fixed). 
This observation aligns well with the suggestion that the maximum entropy distribution represents the cumulative behaviour of individuals within a (finite) population [31, 32], since the spread of the population distribution will tend to decrease as the number of individuals within it increases.

Example 2.

The greedy effective return-on-investments for the three EFMs in Example 1 are given by

[TABLE]

For illustrative purposes, assume that $r_{k}(\mathbf{m}_{ex})\to 1$ ( $k=1,2,3$ ) as all extracellular metabolite concentrations become saturating. This implies that when $G_{ex}$ , $O$ , and $P_{1}$ are very large the greedy maximum entropy control law gives

[TABLE]

where $Q=e^{xc_{1}/\sigma}+e^{xc_{3}/\sigma}+e^{x(c_{1}+2c_{3})/\sigma}$ . Since $c_{3}>c_{1}$ , in this case the EFM represented by $\mathbf{Z}^{3}$ receives the greatest fraction of resource, followed by that represented by $\mathbf{Z}^{2}$ , and finally the EFM represented by $\mathbf{Z}^{1}$ receives the smallest fraction. On the other hand, if $O$ becomes very small while $G_{ex}$ and $P_{1}$ remain saturating, i.e., oxygen concentrations become limiting, instead $\mathcal{R}_{0}^{1}\gg\mathcal{R}_{0}^{2}\approx\mathcal{R}_{0}^{3}$ and in this case the majority of resource is allocated to the EFM represented by $\mathbf{Z}^{1}$ . Conversely, if glucose concentrations $G_{ex}$ become limiting while $O$ and $P_{1}$ remain saturating, this results in $\mathcal{R}_{0}^{2}\gg\mathcal{R}_{0}^{1}\approx\mathcal{R}_{0}^{3}$ and the majority of resource is allocated to the EFM represented by $\mathbf{Z}^{2}$ .

∎
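The saturating case of Example 2 can be evaluated numerically as a rough check on the stated ordering, and on the opposing roles of $x$ and $\sigma$ noted in Section 3. The sketch below assumes that the three terms of $Q$ correspond, in the order written, to the EFMs represented by $\mathbf{Z}^{1}$, $\mathbf{Z}^{2}$, and $\mathbf{Z}^{3}$ (an assignment inferred from the ordering described in the example), and the values chosen for $c_{1}$, $c_{3}$, $x$, and $\sigma$ are hypothetical.

```python
import math

def example2_controls(x, sigma, c1, c3):
    """Greedy maximum entropy controls in the saturating limit r_k -> 1.

    Assumption: the exponents below are matched to EFMs 1, 2, 3 in the
    order in which the terms of the partition function Q are written.
    """
    exponents = [x * c1 / sigma, x * c3 / sigma, x * (c1 + 2 * c3) / sigma]
    shift = max(exponents)  # shift for numerical stability; cancels in the ratio
    weights = [math.exp(e - shift) for e in exponents]
    q = sum(weights)
    return [w / q for w in weights]

c1, c3 = 0.1, 0.3   # hypothetical growth contributions with c3 > c1 > 0
sigma = 1.0

for x in (1.0, 10.0, 50.0):
    u = example2_controls(x, sigma, c1, c3)
    print(x, [round(u_k, 4) for u_k in u])
```

With $\sigma$ fixed, increasing $x$ concentrates the allocation on the third EFM, in the same way as decreasing $\sigma$ with $x$ fixed.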

4 Metabolite yields and anticipatory regulation

In previous sections, $\mathbf{m}_{ex}$ was used to denote the concentrations of extracellular metabolites, assuming that all intracellular metabolites are considered fast and therefore approximately constant at any instantaneous moment in time by the QSSA. This neglected the possibility that certain intracellular metabolites may not satisfy the QSSA criteria and instead vary on the slow timescale associated with $\mathbf{m}_{ex}$ and $x$ . Examples of slowly varying intracellular metabolites include storage compounds (see recent work [9, 37, 38]), which are suggested to increase growth rate across a time interval that includes several diverse environmental extremes, e.g., a 24h day-night epoch or feast-famine cycle. Such rationale may explain the regulation of storage pathways in organisms found in environments with predictable dynamics, but fails to describe the general patterns of accumulation and utilisation outside of this regime. As a relevant example, consider the case of intracellular carbohydrate reserves [56, 57, 58, 59]. The observed accumulation of intracellular carbohydrates in response to nutrient limitation is not intuitively rationalised based on choosing a metabolic objective of maximising total catalytic biomass alone, because investing resources in any process not contributing directly to growth would be considered a sub-optimal control policy. For this reason, among others, authors have considered alternative metabolic objectives, such as maximising total carbon uptake, or have explicitly included intracellular reserves as an integral component of biomass [60, 61, 62]. However, here it is demonstrated that no further assumption beyond $\phi(\mathbf{X})=x$ (the metabolic objective of maximising total catalytic biomass described in Section 3) is necessary for explaining the accumulation of storage compounds in response to nutrient limitation. 
The discussion also involves evaluating the optimal control law (15) with $\Delta t>0$ , which by analogy with [7] is called the temporal maximum entropy control.

To make the exposition more concrete, it will be useful to distinguish between two types of EFMs as suggested in Section 3: those that contribute directly to growth, such that $\mathbf{c}^{T}\mathbf{Z}^{k}>0$ ; and those that do not, such that $\mathbf{c}^{T}\mathbf{Z}^{k}=0$ . The case $\mathbf{c}^{T}\mathbf{Z}^{k}<0$ is excluded from consideration, but this is not a particularly restrictive assumption because in the vast majority of models all $c_{i}$ will be non-negative as are, necessarily, all vector components $Z^{k}_{i}$ . Typical control laws based on the choice $\phi(\mathbf{X})=x$ , like the FBA/Bang-Bang policy and the greedy control law of Young and Ramkrishna [7], preclude the allocation of resources to EFMs with $\mathbf{c}^{T}\mathbf{Z}^{k}=0$ since then $\mathcal{R}_{0}^{k}(\mathbf{m}_{ex})=0$ also. These control policies therefore neglect possible benefits of allocating resources to EFMs contributing to processes other than growth directly, such as accumulation of storage compounds that may be utilised for growth should environmental conditions become unfavourable. In Section 3, it was shown that the greedy maximum entropy control allocates a fraction of resource to every EFM, including those with $\mathbf{c}^{T}\mathbf{Z}^{k}=0$ , which accounts for maximal uncertainty when only information about the current environment is available. Correspondingly, the fraction of resource allocated to EFMs with $\mathbf{c}^{T}\mathbf{Z}^{k}=0$ will tend to increase as the average effective return-on-investment of EFMs with $\mathbf{c}^{T}\mathbf{Z}^{k}>0$ decreases. On the other hand, the Jacobian matrix $\mathbf{A}$ appears in the effective return-on-investment (16) of the temporal maximum entropy control, which is equivalent to the biological statement that regulatory decisions also take into consideration effects that the control action will have on the environment in the immediate future. 
If, for example, a system has evolved to anticipate that formation of storage compounds provides a future opportunity to increase total catalytic biomass, this further reduction in uncertainty is accommodated into the temporal maximum entropy control. One consequence is that individual EFMs not contributing directly to growth can receive greater (or less) investment should they involve consumption or production of metabolites that make the future environment more (or less) favourable for growth. This is most clearly demonstrated by understanding the higher order corrections to return-on-investment that arise for small $\Delta t>0$ .

When $\Delta t$ is small, the matrix exponential $\mathbf{e}^{\mathbf{A}\Delta t}$ can be approximated to first order so that the effective return-on-investment (16) becomes $\mathcal{R}^{k}_{\Delta t}(\mathbf{m}_{s})\approx x[R_{0}^{k}(\mathbf{m}_{s})+\Delta tR_{1}^{k}(\mathbf{m}_{s})]$ , where the first-order correction to return-on-investment derived in Appendix B is

[TABLE]

Here $\mathbf{m}_{s}$ is used to denote all slow metabolite concentrations with corresponding stoichiometric matrix $\mathbf{S}_{s}$ and

[TABLE]

is an average of the zeroth-order return-on-investment (i.e., the average contribution to growth rate) provided by components of the reference control $\mathbf{u}^{0}$ at time $t$ . Observe that when the reference control is taken to be the uniform one (as suggested by Young and Ramkrishna [7]), corresponding to the $\sigma\to\infty$ limit of the maximum entropy control, then $\bar{R}_{0}(\mathbf{m}_{s})$ is simply the arithmetic mean of the $R^{k}_{0}(\mathbf{m}_{s})$ . There are two terms in the first-order correction to return-on-investment (19): the first is the product $\bar{R}_{0}(\mathbf{m}_{s})R^{k}_{0}(\mathbf{m}_{s})$ , which is always non-negative and obviously large when both $\bar{R}_{0}(\mathbf{m}_{s})$ and $R^{k}_{0}(\mathbf{m}_{s})$ are large; the second term is

[TABLE]

To understand the newly defined quantity $Y_{k}(\mathbf{m}_{s})$ , note that for any slow metabolite concentration $m$ one has

[TABLE]

Vector components of the form (21), one for each slow metabolite, provide a measure of how dependent the average contribution to growth rate is on concentration $m$ at time $t$ . If $r_{k}$ is monotonically increasing (which is true if all $f_{i}$ are monotonically increasing) then each component is non-negative and a large value of (21) indicates that a change in $m$ leads to a relatively large increase of $\bar{R}_{0}(\mathbf{m}_{s})$ , i.e., $m$ is growth-limiting at time $t$ ; conversely, a value close to zero indicates a change in the concentration of that metabolite has a negligible effect, i.e., $m$ is not growth-limiting at time $t$ . In general, it will not always be true that the $f_{i}$ are monotonically increasing, in which case some of the components (21) can be negative indicating certain slow metabolite concentrations may be growth-prohibiting. In either case, values (21) serve to weight components of the vector $\mathbf{S}_{s}\mathbf{Z}^{k}$ , which can be either positive or negative because they provide the yield of each metabolite for the $k$ th EFM. A positive yield indicates the $k$ th EFM will contribute to the production of a metabolite, while a negative yield means the EFM will contribute to its consumption. $Y_{k}(\mathbf{m}_{s})$ as defined in (20) is therefore interpreted as the total metabolite yield for the $k$ th EFM, with weighting of each individual metabolite yield proportional to the relative ability of the corresponding metabolite to increase $\bar{R}_{0}(\mathbf{m}_{s})$ . The relative sizes of $\bar{R}_{0}(\mathbf{m}_{s})R^{k}_{0}(\mathbf{m}_{s})$ and $xY_{k}(\mathbf{m}_{s})$ determine whether the first-order correction $R^{k}_{1}(\mathbf{m}_{s})$ is positive or negative. 
A positive first-order correction to the return-on-investment implies $\mathcal{R}^{k}_{\Delta t}(\mathbf{m}_{s})>\mathcal{R}^{k}_{0}(\mathbf{m}_{s})$ whereas a negative correction means that $\mathcal{R}^{k}_{\Delta t}(\mathbf{m}_{s})<\mathcal{R}^{k}_{0}(\mathbf{m}_{s})$ .
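The interpretation of $Y_{k}(\mathbf{m}_{s})$ as a weighted sum of individual metabolite yields can be made concrete with a small sketch. It assumes that the total metabolite yield is the inner product of the sensitivities (21) with the yield vector $\mathbf{S}_{s}\mathbf{Z}^{k}$, without any further normalisation; the toy stoichiometry (two slow metabolites, three reactions), the sensitivity values, and the function names are hypothetical.

```python
def metabolite_yields(S_s, Z_k):
    """Yield of each slow metabolite for one EFM, i.e. the vector S_s Z^k."""
    return [sum(s * z for s, z in zip(row, Z_k)) for row in S_s]

def total_metabolite_yield(sensitivities, S_s, Z_k):
    """Weighted sum of metabolite yields (assumed form of Y_k)."""
    yields = metabolite_yields(S_s, Z_k)
    return sum(w * y for w, y in zip(sensitivities, yields))

# Hypothetical slow stoichiometry: rows are metabolites (substrate, storage),
# columns are reactions (growth on substrate, storage formation, storage use).
S_s = [
    [-1.0, -1.0, 0.0],   # substrate is consumed by growth and by storage formation
    [0.0, 1.0, -1.0],    # storage is produced by formation, consumed by its use
]

# Hypothetical sensitivities of the average growth contribution to each
# slow metabolite: here storage is far more growth-limiting than substrate.
sensitivities = [0.1, 0.8]

efms = {
    "growth on substrate": [1.0, 0.0, 0.0],
    "storage formation": [0.0, 1.0, 0.0],
    "storage utilisation": [0.0, 0.0, 1.0],
}

for name, Z_k in efms.items():
    print(name, round(total_metabolite_yield(sensitivities, S_s, Z_k), 3))
```

In this toy setting only the EFM that produces the growth-limiting metabolite has a positive total metabolite yield, while the EFMs that consume slow metabolites have negative yields.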

Consequences of using the temporal maximum entropy control can then be summarised as follows: when no slow metabolites are growth-limiting or growth-prohibiting, the average contribution to growth rate $\bar{R}_{0}(\mathbf{m}_{s})$ is large relative to magnitudes of the $xY_{k}(\mathbf{m}_{s})$ , and therefore resource becomes further concentrated on EFMs with $\mathbf{c}^{T}\mathbf{Z}^{k}>0$ . However, when one or more slow metabolites are growth-limiting, and consequently the average contribution to growth rate $\bar{R}_{0}(\mathbf{m}_{s})$ is low, EFMs with non-negative total metabolite yield $Y_{k}(\mathbf{m}_{s})$ (such as those with $\mathbf{c}^{T}\mathbf{Z}^{k}=0$ ) can be allocated a larger fraction of resource than in cases where $\bar{R}_{0}(\mathbf{m}_{s})$ is high. This type of behaviour has been observed in most microbial populations [56, 57, 58, 59]. For example, the storage carbohydrate glycogen is produced by yeast upon limitations in extracellular carbon or nitrogen, and in bacteria glycogen accumulates under conditions of limiting growth when carbon is in excess but other nutrients are deficient (see [63] for a review). Also in yeast, up-regulation of trehalose synthesis is known to serve as an indicator for cell populations with lower growth rates [17], which has been rationalised using a bet-hedging argument. The greedy maximum entropy control generates such an inverse correlation between average contribution to growth rate and the levels of activation of EFMs with $\mathbf{c}^{T}\mathbf{Z}^{k}=0$ , but only the temporal maximum entropy control distinguishes between them based on total metabolite yields and their ability to shape environmental conditions. In conclusion, both maximum entropy control laws account for accumulation of intracellular reserves under growth-limiting conditions without imposing any assumption on the objective other than maximisation of total catalytic biomass. 
However, the temporal maximum entropy control law describes some form of anticipatory regulation, whereas the greedy maximum entropy control law accommodates maximal uncertainty if only current environmental conditions are known.
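The difference between the two control laws can be sketched using the first-order expression $\mathcal{R}^{k}_{\Delta t}\approx x[R^{k}_{0}+\Delta tR^{k}_{1}]$ together with an assumed Boltzmann form for the control (15). The values of $R^{k}_{0}$ and $R^{k}_{1}$ below are hypothetical: one EFM contributes directly to growth, and two do not but have first-order corrections of opposite sign.

```python
import math

def boltzmann(returns, sigma):
    """Maximum entropy weights exp(R_k / sigma) / Q (assumed form of the control)."""
    shift = max(returns)
    weights = [math.exp((r - shift) / sigma) for r in returns]
    q = sum(weights)
    return [w / q for w in weights]

def effective_returns(x, R0, R1, dt):
    """First-order effective return-on-investment x * (R0 + dt * R1)."""
    return [x * (r0 + dt * r1) for r0, r1 in zip(R0, R1)]

x, sigma = 1.0, 0.5
R0 = [0.6, 0.0, 0.0]    # hypothetical zeroth-order returns; EFMs 2 and 3 give no growth
R1 = [0.1, 0.4, -0.4]   # hypothetical first-order corrections

greedy = boltzmann(effective_returns(x, R0, R1, dt=0.0), sigma)
temporal = boltzmann(effective_returns(x, R0, R1, dt=0.5), sigma)

print("greedy:  ", [round(u_k, 3) for u_k in greedy])
print("temporal:", [round(u_k, 3) for u_k in temporal])
```

The greedy control assigns identical fractions to the two EFMs with zero zeroth-order return-on-investment, whereas the temporal control favours the one with the positive first-order correction.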

Example 3.

Consider the simplified metabolic network in Figure 2(a) as an extension of the one introduced in Example 1.

The storage compound with concentration $C$ is introduced as a slow intracellular metabolite. Its consumption and production imply the addition of two reactions to the network with fluxes $v_{5}$ and $v_{6}$ , respectively, which do not contribute directly to growth (i.e., $c_{5}=c_{6}=0$ ). The reduced system from Example 1 is extended to include an additional dynamical term for the new slow variable

[TABLE]

and the algebraic equations arising from the QSSA are modified to

[TABLE]

A complete set of six EFMs (represented graphically in Figures 2(b) and 2(c)) is provided by the vectors

[TABLE]

The composite EFM fluxes $r_{1},r_{2},r_{3}$ and greedy effective return-on-investments $\mathcal{R}_{0}^{1},\mathcal{R}_{0}^{2},\mathcal{R}_{0}^{3}$ are those given in Examples 1 and 2, respectively, while

[TABLE]

and

[TABLE]

Omitting explicit dependencies of the $r_{k}$ on slow metabolites to ease notation, the reduced system expressed in terms of EFMs and control variables $u_{k}$ is

[TABLE]

The metabolite yields for each EFM are supplied by the vectors

[TABLE]

and, assuming for simplicity that the oxygen concentration $O$ is saturating at time $t$ so that $\partial r_{k}(t)/\partial O=0$ ( $k=1,2,...,6$ ), the total metabolite yields evaluated using the uniform reference control $u^{0}_{k}=1/6$ ( $k=1,2,...,6$ ) are

[TABLE]

where $[\cdot]_{t}$ indicates the expression inside square parentheses is to be evaluated at $t$ . Observe that in a regime where metabolite concentrations are such that

[TABLE]

then $Y_{k}(t)<0$ for $k=1,2,...,5$ whereas $Y_{6}(t)>0$ . In fact, when oxygen is not saturating it can be shown that the contributions from non-zero derivatives $\partial r_{k}(t)/\partial O$ ( $k=2,3,5$ ) decrease $Y_{2},Y_{3},Y_{5}$ further while leaving $Y_{1},Y_{4},Y_{6}$ unchanged.

∎

5 Model reduction using EFM families

This section explores the practical aspects of model design and simulation. Enumeration of EFMs for large stoichiometry matrices $\mathbf{S}$ can lead to a combinatorial explosion as their number grows with increasing network size and connectivity [42]. This necessitates inclusion of many undetermined parameters and control variables in the reduced system (9), which are difficult to model accurately should sufficient experimental data not be available. Consequently, previous attempts to reduce the complexity of dynamic models like (9) have introduced rules for selecting a subset of relevant EFMs, or grouping EFMs into families to be considered together (e.g. [39, 40, 41]). An additional simplification is to approximate the composite EFM fluxes $r_{k}(\mathbf{m}_{s})$ by Michaelis-Menten kinetics, such that

[TABLE]

where $V^{max}_{k}$ , $\kappa_{a,k}$ are constants and the product includes all slow metabolite concentrations $m_{a}$ whose uptake fluxes are in the support of the $k$ th EFM. In what follows it is assumed that such an approximation (although not necessarily the Michaelis-Menten one) has been provided for the functional form of the $r_{k}(\mathbf{m}_{s})$ and that vectors representing EFMs have therefore been normalised to a common scale, such as total uptake carbon content. As described in Section 2, choosing a common normalisation for the EFM representative vectors is essential when the $r_{k}$ are approximated in this way because then (9) is no longer invariant to $\mathbf{Z}^{k}$ re-scaling. The focus of this section is to understand the effect of further model reduction, by grouping EFMs into families, on resource allocation from the perspective of the maximum entropy control. Rules for composing EFM families are not the object of consideration here, but could involve, for example [39, 40, 41], grouping together all EFMs whose supports contain the same uptake flux.
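A Michaelis-Menten approximation of a composite EFM flux might be evaluated as in the following sketch, which assumes the standard product form $r_{k}=V^{max}_{k}\prod_{a}m_{a}/(\kappa_{a,k}+m_{a})$ . The function name, metabolite labels, and parameter values are hypothetical.

```python
def composite_flux(v_max, kappa, conc):
    """Michaelis-Menten composite flux for one EFM (assumed product form).

    kappa maps each metabolite in the EFM's uptake support to its
    half-saturation constant; conc maps metabolite names to concentrations.
    """
    rate = v_max
    for name, k in kappa.items():
        m = conc[name]
        rate *= m / (k + m)   # saturating factor between 0 and 1
    return rate

# Hypothetical EFM that takes up glucose (G) and oxygen (O).
kappa_k = {"G": 0.5, "O": 0.01}

print(composite_flux(2.0, kappa_k, {"G": 0.5, "O": 0.01}))    # both at half-saturation
print(composite_flux(2.0, kappa_k, {"G": 100.0, "O": 10.0}))  # near-saturating
print(composite_flux(2.0, kappa_k, {"G": 100.0, "O": 0.0}))   # oxygen exhausted
```

Because every factor lies between zero and one, the flux vanishes as soon as any metabolite in the support is exhausted and approaches $V^{max}_{k}$ only when all of them are saturating.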

Partitioning of EFMs into $M$ families means partitioning indices $k=1,2,...,K$ into $M$ mutually disjoint subsets $F_{J}$ (of size $N_{J}$ ) $J=1,2,...,M$ , and partitioning total resource into $M$ fractions $U_{J}$ such that

[TABLE]

Consequently, the EFM with index $j\in F_{J}$ is allocated a fraction $\tilde{u}_{j}=u_{j}/U_{J}$ of the resource $U_{J}$ available to the $J$ th family. These values are collected in vectors $\mathbf{U}=(U_{1},U_{2},...,U_{M})^{T}$ and $\mathbf{\tilde{u}}_{J}=(\tilde{u}_{F_{J,1}},\tilde{u}_{F_{J,2}},...,\tilde{u}_{F_{J,N_{J}}})^{T}$ , where $F_{J,i}$ denotes the $i$ th element of $F_{J}$ . Representative vectors $\mathbf{\tilde{Z}}^{J}$ ( $J=1,2,...,M$ ) are formed as weighted combinations of EFMs and used to express the dynamical system (9) in terms of EFM families. Several different weightings have previously been considered [39, 40, 41], but the entropy constraint identifies a particularly natural one to be

[TABLE]

where $\mathbf{\tilde{Z}}^{J}$ is defined given $\tilde{r}_{J}(\mathbf{m}_{s})$ , $\mathbf{\tilde{u}}_{J}$ , and the $r_{j}(\mathbf{m}_{s})$ ( $j\in F_{J}$ ). To understand resource allocation in terms of this partitioning, observe the optimal control (15) is obtained by maximisation of the objective functional

[TABLE]

where the effective return-on-investment $\mathcal{R}^{k}_{\Delta t}(\mathbf{m}_{s})$ is defined in (16). A classical result [25, 26] is that entropy satisfies the composition property

[TABLE]

where $H$ is defined as in (13) on components of each respective vector. As shown in Appendix C, combined with the weighting (22) this implies $\mathcal{F}(\mathbf{u})$ can be expressed as

[TABLE]

where $\mathcal{F}_{J}$ is the restriction of $\mathcal{F}$ to EFMs in the $J$ th family.
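The composition property of entropy can be verified numerically. The sketch below assumes that $H$ in (13) is the Shannon entropy $H(\mathbf{u})=-\sum_{k}u_{k}\ln u_{k}$ and checks that $H(\mathbf{u})=H(\mathbf{U})+\sum_{J}U_{J}H(\mathbf{\tilde{u}}_{J})$ for a hypothetical allocation over five EFMs partitioned into three families.

```python
import math

def entropy(p):
    """Shannon entropy -sum p_i ln p_i (terms with p_i = 0 contribute zero)."""
    return -sum(p_i * math.log(p_i) for p_i in p if p_i > 0)

# Hypothetical resource fractions over K = 5 EFMs (summing to one).
u = [0.10, 0.20, 0.30, 0.25, 0.15]

# Hypothetical partition of the (zero-based) EFM indices into M = 3 families.
families = [[0, 3], [1], [2, 4]]

# Family fractions U_J and within-family fractions u_tilde_j = u_j / U_J.
U = [sum(u[j] for j in family) for family in families]
u_tilde = [[u[j] / U_J for j in family] for family, U_J in zip(families, U)]

lhs = entropy(u)
rhs = entropy(U) + sum(U_J * entropy(ut) for U_J, ut in zip(U, u_tilde))
print(lhs, rhs)
```

The two quantities agree up to floating-point error for any valid partition, which is what permits the maximisation to be split into a within-family step and an across-family step.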

The full resource allocation problem can be viewed as a two-stage process involving an initial distribution of resource across EFM families, followed by further partitioning of the $U_{J}$ among their constituent EFMs [39, 40]. This procedure is best captured by maximising $\mathcal{F}(\mathbf{u})$ in two steps: first, the $\mathcal{F}_{J}(\mathbf{\tilde{u}}_{J})$ are maximised for each family, providing maximum entropy controls for EFMs in terms of $\mathbf{\tilde{u}}_{J}$ . Next, the resulting $\mathbf{\tilde{u}}_{J}$ are used for the EFM weightings (22) and $\mathcal{F}(\mathbf{u})$ is maximised with respect to the $U_{J}$ , yielding a maximum entropy control for EFM families. From a modelling perspective it is informative to consider the related objective functional

[TABLE]

where $\mathcal{\tilde{R}}^{J}_{\Delta t}(\mathbf{m}_{s})$ is the $J$ th family’s effective return-on-investment derived, as for individual EFMs in Section 3, directly from system (9) expressed in terms of EFM families. Maximisation of $\mathcal{F}(\mathbf{U})$ provides the maximal entropy control to be used if the constituents of EFM families are not known, and in this case $\mathcal{\tilde{R}}^{J}_{\Delta t}(\mathbf{m}_{s})$ should be a suitable approximation of $\mathcal{F}_{J}(\mathbf{\tilde{u}}_{J})$ . For reasons described previously however, even when an enumeration of EFMs exists one might want to simplify the calculation of optimal controls by approximating $\mathcal{F}_{J}(\mathbf{\tilde{u}}_{J})$ using EFMs in a manner consistent with maximisation. One way to do this is to set $r_{k}=x=1$ in the greedy effective return-on-investment (18) and use the resulting (fixed) greedy maximum entropy controls

[TABLE]

with $\eta_{j}\equiv\mathbf{c}^{T}\mathbf{Z}^{j}$ to express the reduced dynamical system (9) in terms of EFM family vectors $\mathbf{\tilde{Z}}^{J}=\sum_{j\in F_{J}}\tilde{u}_{j}\mathbf{Z}^{j}$ . The effective return-on-investment $\mathcal{\tilde{R}}^{J}_{\Delta t}(\mathbf{m}_{s})$ derived for the $J$ th family then takes the same form as (16), but with $\mathbf{B}^{k}$ replaced by

[TABLE]

Maximisation of the objective functional $\mathcal{F}(\mathbf{U})$ in (26) therefore provides the maximum entropy control for optimal allocation of resource among EFM families represented by these $\mathbf{\tilde{Z}}^{J}$ . This approach is analogous to the EFM lumping proposed by Song and Ramkrishna [39, 40], where it is of relevance to note that their choice of EFM weighting also depends on a parameter $n_{v}$ that modulates spread across EFMs within a family, thus playing the equivalent of $\sigma$ in the maximal entropy control. In fact, their choice of fixed weighting $u_{j}=\eta_{j}^{n_{v}}$ is related to the fixed maximum entropy weighting $u_{j}\propto e^{\eta_{j}/\sigma}$ in the same way that maximum entropy relates to fuzzy clustering [64]. An alternative method for approximating $\mathbf{\tilde{Z}}^{J}$ , related to [41] and the maximum entropy control in the $\sigma\to 0$ limit, is to select the vector $\mathbf{Z}^{j}$ ( $j\in F_{J}$ ) representing the EFM with the largest return-on-investment (i.e., the FBA/Bang-Bang solution) in the $J$ th family.
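Construction of a family representative vector from fixed weights can be sketched as follows. The exponential weighting assumes the normalised form $\tilde{u}_{j}=e^{\eta_{j}/\sigma}/\sum_{i\in F_{J}}e^{\eta_{i}/\sigma}$ ; the power-law weighting $\eta_{j}^{n_{v}}$ is normalised here in the same way purely for comparison, which is an assumption rather than a statement of how it is applied in [39, 40]. The EFM vectors and $\eta_{j}$ values are hypothetical.

```python
import math

def exponential_weights(eta, sigma):
    """Fixed maximum entropy weights proportional to exp(eta_j / sigma)."""
    shift = max(eta)
    w = [math.exp((e - shift) / sigma) for e in eta]
    total = sum(w)
    return [w_j / total for w_j in w]

def power_weights(eta, n_v):
    """Weights proportional to eta_j ** n_v (normalisation is an assumption)."""
    w = [e ** n_v for e in eta]
    total = sum(w)
    return [w_j / total for w_j in w]

def family_vector(weights, efms):
    """Representative vector: weighted combination of the family's EFM vectors."""
    n = len(efms[0])
    return [sum(w * z[i] for w, z in zip(weights, efms)) for i in range(n)]

# Hypothetical family of two EFMs over three reactions, with growth yields eta_j.
efms = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]
eta = [0.4, 0.2]

for sigma in (0.05, 0.5, 50.0):
    w = exponential_weights(eta, sigma)
    print("sigma", sigma, [round(v, 3) for v in family_vector(w, efms)])

for n_v in (0.0, 1.0, 10.0):
    w = power_weights(eta, n_v)
    print("n_v", n_v, [round(v, 3) for v in family_vector(w, efms)])
```

Small $\sigma$ and large $n_{v}$ both drive the representative towards the EFM with the largest $\eta_{j}$ , while large $\sigma$ and $n_{v}=0$ both give the uniform average.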

Although EFM families and various EFM weightings have been described previously [39, 40, 41], the above discussion points to their unification under the maximum entropy control framework. Stated in terms of a modelling endeavour to best approximate $\mathcal{F}_{J}(\mathbf{\tilde{u}}_{J})$ , choices for individual EFM return-on-investments and EFM family representatives can be derived from first principles. These approximations may involve any form of dynamic model reduction or simplification that respects the basic maximum entropy control laws for partitioning resources among EFMs within a family. Moreover, subsequent partitioning of resource among EFM families has always previously involved the original cybernetic control laws of Young and Ramkrishna [7], whereas for consistency the maximum entropy control should once more be used to determine the $U_{J}$ . Doing so makes it possible to recursively apply this form of model reduction when establishing the appropriate number of EFM families; the recursive nature of the maximum entropy control framework is succinctly captured by the objective functional (25).

Example 4.

Consider the metabolic network (Figure 2(a)) from Example 3, but now with $C$ interpreted as the concentration of a fast intracellular metabolite. The QSSA on $C$ imposes the additional algebraic constraint $v_{5}-v_{6}=0$ , which results in the collapse of one EFM (represented by vector $\mathbf{Z}^{6}$ in Example 3). A complete set of five EFMs for this network is therefore provided by the representative vectors

[TABLE]

each normalised by their uptake carbon content. Partitioning of these EFMs into families on the basis of their shared products and substrates means $F_{1}=\{1,4\}$ , $F_{2}=\{2\}$ , and $F_{3}=\{3,5\}$ , which gives rise to the three EFM families represented in Figure 3.

Setting $r_{k}=x=1$ ( $k=1,2,3,4,5$ ) in (18) yields the fixed return-on-investments

[TABLE]

and in this case the fixed greedy maximum entropy controls (27) for each family are independent of $\sigma$ :

[TABLE]

This results in the three EFM family representative vectors

[TABLE]

and using a Michaelis-Menten approximation for the $\tilde{r}_{J}$ one has

[TABLE]

Substituting for these expressions in (28) provides $\tilde{B}^{J}$ to be used for calculating the effective return-on-investment of the $J$ th EFM family and maximum entropy controls $U_{1},U_{2},U_{3}$ . Note that in this example, independence of $\sigma$ in the fixed greedy maximum entropy controls $\mathbf{\tilde{u}}_{1}$ , $\mathbf{\tilde{u}}_{2}$ , $\mathbf{\tilde{u}}_{3}$ implies they are equivalent to a set of FBA/Bang-Bang policies. Alternatively, one could assume some cost is associated with one of the reactions $v_{5}$ or $v_{6}$ (e.g., $c_{5}<0$ ), in which case $\eta_{1}>\eta_{4}$ and $\eta_{1}>\eta_{5}$ so that this equivalence would only hold in the $\sigma\to 0$ limit where $\mathbf{\tilde{Z}}^{1}=\mathbf{Z}^{1}$ and $\mathbf{\tilde{Z}}^{3}=\mathbf{Z}^{3}$ . In both cases the reduced dynamic model can be mapped on to that described in Example 1.

∎
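The two situations discussed in Example 4 can be illustrated numerically for a two-member family, assuming fixed weights of the form $\tilde{u}_{j}\propto e^{\eta_{j}/\sigma}$ . The $\eta_{j}$ values are hypothetical: in the first case both members have the same $\eta_{j}$ , and in the second a cost lowers the value for one member.

```python
import math

def fixed_controls(eta, sigma):
    """Within-family weights proportional to exp(eta_j / sigma) (assumed form)."""
    shift = max(eta)
    w = [math.exp((e - shift) / sigma) for e in eta]
    total = sum(w)
    return [w_j / total for w_j in w]

equal_eta = [0.5, 0.5]    # hypothetical: no cost, both EFMs yield the same growth
costly_eta = [0.5, 0.3]   # hypothetical: a cost lowers eta for the second EFM

for sigma in (0.01, 1.0, 100.0):
    print(sigma,
          [round(u_j, 3) for u_j in fixed_controls(equal_eta, sigma)],
          [round(u_j, 3) for u_j in fixed_controls(costly_eta, sigma)])
```

With equal $\eta_{j}$ the weights are uniform for every $\sigma$ , whereas with a cost the family representative reduces to the single best EFM only as $\sigma\to 0$ .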

6 Application to yeast metabolism

Here a dynamic model of resource allocation is introduced for central carbon metabolism in a single-celled eukaryotic organism, yeast, which builds upon the concepts and examples presented in previous sections. The metabolic network in Figure 2(a) describes the participating reactions, and product 1 is ethanol (concentration $E$ ), with $C$ the lumped concentration of storage carbohydrates glycogen and trehalose modelled in this case (contrast with Example 3) as an extracellular term that combines with total catalytic biomass $x$ to represent total biomass $x+C$ in the system. Note that without storage carbohydrates, the simplified network (Figure 1(a)) may be viewed as the reduction of a much larger network composed of glycolysis, the pentose-phosphate pathway, citric acid cycle, glyoxylate shunt, and oxidative phosphorylation, using EFM families exactly as was suggested by Song and Ramkrishna [39]. Introduction of the two reactions with fluxes $v_{5},v_{6}$ follows this same principle by grouping all reserve carbohydrate pathways into a single representative family. Precise details of EFM family reduction have been overlooked but, because of its recursive nature as described in the previous section, the relevant EFM families can be represented by the six vectors $\mathbf{Z}^{k}$ from Example 3 normalised by their uptake carbon content. The full dynamical system expressed in terms of these (with the subscript dropped from $G_{ex}$ ) is

[TABLE]

where a dilution rate $D$ and volumetric mass transfer coefficient $k_{L}a$ have been introduced for simulation of cultures with inflow glucose concentration $G_{0}$ and dissolved oxygen solubility limit $O^{*}$ .

The $r_{k}$ are approximated by Michaelis-Menten kinetics according to the metabolites whose uptake fluxes are in the support of each $\mathbf{Z}^{k}$ , such that

[TABLE]
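As a minimal illustration of the Michaelis-Menten form assumed for each $r_{k}$, the following Python sketch evaluates a single-substrate rate $V^{max}S/(K+S)$. The function name and the numerical values are hypothetical (they are not the values of Table 1), and the composite fluxes of the model, including those that depend on $C/x$ or on more than one metabolite, are not reproduced here.

```python
def michaelis_menten(s, v_max, k_m):
    """Single-substrate Michaelis-Menten rate v_max * s / (k_m + s)."""
    if s < 0 or k_m <= 0:
        raise ValueError("substrate must be non-negative and k_m positive")
    return v_max * s / (k_m + s)


# Hypothetical parameter values, chosen only for illustration.
V_MAX = 1.0
K_M = 0.5

rate_low = michaelis_menten(0.01, V_MAX, K_M)   # substrate-limited regime
rate_half = michaelis_menten(K_M, V_MAX, K_M)   # s = k_m gives half of v_max
rate_high = michaelis_menten(50.0, V_MAX, K_M)  # close to saturation
```

The rate is approximately linear in the substrate when $S\ll K$ and saturates at $V^{max}$ when $S\gg K$, which is why each pathway loses return-on-investment as its substrate is depleted.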

Observe that composite fluxes $r_{4},r_{5}$ depend on the fractional concentration $C/x$ of storage carbohydrate. The zeroth-order return-on-investments are

[TABLE]

and the greedy maximum entropy control law for determining each $u_{k}$ is then

[TABLE]

with $\mathbf{m}=(G,O,E,C)^{T}$ and $\sigma>0$ . Numerical simulations of this system were performed using custom-built software based on the SUNDIALS solvers [65] and the fixed parameter values listed in Table 1. Note that parameter values have been chosen generically in the sense that, although biologically realistic values have been used, there has been no attempt to fit these to experimental data or perform bifurcation analysis. The results of simulations should therefore be treated as qualitative, and absolute concentration measurements used for quantitative predictions only after using biological knowledge or data to refine parameter values, which will almost certainly increase overall predictive power of the model. In particular, it is known that modifying parameters $V_{1}^{max},V_{2}^{max},V_{3}^{max}$ or introducing a cost of each pathway (i.e., weights for $u_{1},u_{2},u_{3}$ in the summation constraint from (14)) generates a static model of resource allocation for the network in Example 1 where either pure oxidation, pure fermentation, or a mixture of oxidation and fermentation, are optimal strategies for maximal ATP production (see [51] and references therein). This result is based on the proposal that overflow metabolism is a consequence of lower-yield, higher-rate pathways being preferred when certain environmental conditions are combined with the constraints of resource allocation [66]. By contrast, simulations based on the parameter values in Table 1 are only intended to capture isolated effects of the maximum entropy control law without imposing any additional biological information.
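To make the role of $\sigma$ concrete, the sketch below assumes that the control takes a Boltzmann (softmax) form, $u_{k}\propto\exp(R_{k}/\sigma)$, normalised so that the $u_{k}$ sum to one. This form is an assumption consistent with the interpolation described in the text; the exact control law and return-on-investment expressions of the model are not reproduced here. The function name and the return-on-investment values are hypothetical.

```python
import math


def max_entropy_control(returns, sigma):
    """Boltzmann weights u_k = exp(R_k / sigma) / Q (assumed form)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    r_max = max(returns)  # subtract the maximum for numerical stability
    weights = [math.exp((r - r_max) / sigma) for r in returns]
    q = sum(weights)      # normalisation enforcing sum(u) == 1
    return [w / q for w in weights]


# Hypothetical return-on-investment values for three pathways.
returns = [0.30, 0.10, 0.45]

u_small = max_entropy_control(returns, 0.01)   # nearly all resource on the best pathway
u_mid = max_entropy_control(returns, 0.2)      # hedged allocation
u_large = max_entropy_control(returns, 100.0)  # nearly uniform allocation
```

Under this assumption a small $\sigma$ concentrates resource on the pathway with the largest return-on-investment (the FBA/Bang-Bang limit), whereas a large $\sigma$ spreads resource almost evenly across pathways, including those with lower return-on-investment.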

Figures 4(a)-4(c) show batch culture simulations for increasing values of $\sigma$ .

When $\sigma$ is small (Figure 4(a)), there is relatively little accumulation of ethanol or storage carbohydrate over the course of simulation because, for the given parameter values at the provided oxygen concentrations, the glucose oxidation pathway (EFM represented by $\mathbf{Z}^{3}$ ) continues to receive the majority of resource until glucose is depleted and growth halted. The fraction of resource allocated to glucose fermentation (EFM represented by $\mathbf{Z}^{1}$ ) only becomes comparable towards the end of the simulation. When $\sigma$ is increased to an intermediate value, however (Figure 4(b)), there is a large accumulation of extracellular ethanol, which in turn feeds into the ethanol oxidation pathway (EFM represented by $\mathbf{Z}^{2}$ ) that receives the majority of resource during the latter half of the simulation. This transition from glucose metabolism to ethanol oxidation qualitatively captures the Crabtree effect observed in batch cultures of baker’s or brewer’s yeast Saccharomyces cerevisiae [67], but note that this effect could have been reproduced at lower values of $\sigma$ should $V_{1}^{max}\gg V_{3}^{max}$ , $K_{1}\gg K_{3}$ , or the cost of the oxidative pathways (weights for $u_{2},u_{3}$ in (14)) be increased in analogy with the model of overflow metabolism presented in [51]. Instead, for the parameter values reported here, ethanol accumulation and the subsequent transition between substrates is solely attributable to the maximum entropy control law, which allocates non-zero fractions of resource to EFMs with lower return-on-investment. This behaviour is indicative of a bet-hedging component of the Crabtree effect, possibly related to that observed experimentally in the case of the diauxic shift [18].
When $\sigma$ is increased further (Figure 4(c)), there is a considerable accumulation of storage carbohydrate that leads to the pathways involved in its consumption (EFMs represented by $\mathbf{Z}^{4}$ and $\mathbf{Z}^{5}$ ) receiving the majority of resource for a brief window of time at the point of transition from glucose to ethanol. Accumulation of storage carbohydrate occurs because a greater fraction of resource is allocated to the EFM represented by $\mathbf{Z}^{6}$ , even though this pathway does not contribute directly to growth. Closer inspection reveals a sharp increase in the rate of storage carbohydrate accumulation immediately prior to the onset of the transition from glucose to ethanol and its subsequent utilisation, which agrees very well with the observed dynamics of glycogen and trehalose metabolism in S. cerevisiae batch culture: these metabolites rapidly accumulate near the end of the growth phase on glucose and are quickly consumed when consumption of ethanol begins [58, 59, 68, 69, 70]. Figure 4(d) plots steady state concentrations of extracellular ethanol against dilution rate for simulations of the model in continuous culture with $\sigma=1.0$ . As a general trend, increasing dilution rate leads to an increase in extracellular ethanol concentrations before reaching a maximum value and decreasing prior to wash out (when dilution rate exceeds growth rate). In the future, these continuous culture simulations could be compared with data reported in [71] for further refinement of parameter values.

For comparison with other models of yeast metabolism at this level of complexity, two closely related examples are the model derived using EFM lumping by Song and Ramkrishna in [39] and the model by Jones and Kompala [72]. Both are based on cybernetic laws originating from the greedy control of Young and Ramkrishna [7], but unlike the maximum entropy model presented here have been enlarged to include auxiliary variables (called ‘cybernetic enzymes’) responsible for conferring the regulatory effects of control. Cybernetic enzymes may confer additional robustness to dynamic models, but also imply the existence of extra parameter values that are especially difficult to determine experimentally because they do not correspond to biological reality. One could argue that the EFM control variables $u_{k}$ do not correspond directly to biological entities either, but in this case they come without the overhead of additional uninterpretable parameters. Although based on EFM families, the model of Song and Ramkrishna [39] does not include the dynamics of storage carbohydrates. In any case, their cybernetic control law would negate the allocation of resource to any EFMs involved in storage carbohydrate formation unless additional assumptions were imposed upon the model. Parameters used in their simulations were only partially informed by experimental data. On the other hand, the model of Jones and Kompala [72] does include a term for storage carbohydrates, but the coupling of $C$ to other variables in the model (structurally equivalent to that of Song and Ramkrishna) is based entirely on empirical observations [58, 59, 68, 69, 70] for the dynamics of carbohydrate accumulation and utilisation. Conversely, the description of this phenomenon presented here relies on nothing more than the maximum entropy principle. As a final note, parameter values in the Jones and Kompala model were obtained by direct fit to experimental data. 
Such an approach could be expected to improve the quantitative behaviour of the maximum entropy model and the lumped cybernetic model in [39], both of which are to be viewed as qualitatively predictive in nature.

7 Conclusion

Using the maximum entropy principle to extend the optimal control framework of Young and Ramkrishna [7] provides a dynamic theory of metabolic resource allocation in the face of environmental uncertainty. This concept both generalises and unifies prior approaches by establishing a smooth interpolation between DFBA [4] at one extreme, and dynamic metabolic models without regulation [5, 6] at the other. In contrast to alternative optimal control laws, no assumption other than instantaneous maximisation of total catalytic biomass based on the maximum entropy principle is required to explain activation of pathways not contributing directly to growth rate, such as the formation of reserve compounds that have previously been explicitly included as an integral component of total biomass. From an evolutionary perspective this is a particularly appealing explanation for the observed accumulation of reserve compounds in growth-limiting conditions, because selection for maximal rates of self-replication extends beyond cellular biology to the RNA world [73]. However, it is likely that this form of bet-hedging constitutes just one component of the biological mechanism that governs dynamic regulation of metabolism in fluctuating environmental conditions.

Application of the dynamic maximum entropy framework to a simplified model of yeast metabolism has shown that the theory successfully reproduces some observed behaviour of cell populations in batch and continuous culture. This reduced model can almost surely be improved by including additional biological knowledge on the nature of overflow metabolism in yeast [51, 66, 67], but also serves to illustrate contributions that come from bet-hedging alone. When considering cell populations, there are at least two (not mutually exclusive) theoretical interpretations for such bet-hedging mechanisms [23, 24]: that the maximum entropy distribution of resources is a result of heterogeneity in the regulatory FBA/Bang-Bang strategies of individuals [19, 20, 31, 32], or is chosen by each cell as the optimal strategy for dealing with uncertainty in the environment [18, 21]. It is important to note that, if thermodynamic constraints are imposed, then only the FBA/Bang-Bang policy will be guaranteed to yield a thermodynamically-consistent, single-cell resource allocation strategy provided the set of EFMs is restricted to those that are thermodynamically-feasible [49]. This observation therefore promotes the interpretation that each individual in the population adopts an FBA/Bang-Bang policy [74], and that the relative fraction of total resource allocated to an EFM reflects the fraction of individuals within the population investing exclusively in that metabolic pathway. Due to the overall scaling effect of total catalytic biomass concentration $x$ in the effective return-on-investment, the concentration of resource on the EFM with largest return-on-investment will increase as $x$ does also, and therefore the maximum entropy resource allocation strategy of the entire population also approaches the FBA/Bang-Bang policy as the population grows in size. 
This leads to further concentration of resource on the pathway with greatest contribution to growth, whereas, when the population is growth-limited, resource allocation based on the temporal maximum entropy control is increased to pathways with higher total metabolite yield.

Entropy is the uniquely-defined continuous function of a discrete probability distribution, monotonically increasing in the number of states when each occurs with equal probability, which respects the composition law of fractional partitioning [25, 26]. For applications to dynamic resource allocation this naturally identifies the maximum entropy distribution as the appropriate control law for performing model reduction based on EFM families, where several alternative EFM weightings have been introduced previously [39, 40, 41]. The suggested EFM weighting based on the maximum entropy control law generalises those considered in prior work. Using EFM families becomes particularly important for larger metabolic networks where an explosion in the number of EFMs [42] makes direct model parameterisation infeasible; however, the maximum entropy framework provides a consistent methodology for recursive model reduction. It is worth pointing out that DeVilbiss and Ramkrishna have recently proposed an information theory-based model selection scheme [75], using as a test case metabolic models expressed in terms of EFMs and the control laws of Young and Ramkrishna. Models based on EFM families were shown to provide the most succinct description of steady state fluxes as measured by information-theoretic criteria, and it would therefore be intriguing to explore how these criteria synergise with the information-theoretic concept of the maximum entropy control.

Acknowledgements

This work benefited from advice from JD Young, W Liebermeister, and P Dixit on metabolic modelling, and discussions with JS O’Neill and HC Causton on yeast metabolism. Thanks are extended to JD Young for also sharing their PhD Thesis, and to D Foley and M Dean for highlighting relevance of the maximum entropy principle in behavioural economics. DS Tourigny is a Simons Foundation Fellow of the Life Sciences Research Foundation.

Appendix A Derivation of maximum entropy control

To solve the optimal control problem (14) one introduces the Hamiltonian

[TABLE]

where $\boldsymbol{\lambda}$ is a co-state vector the same dimension as $\mathbf{X}$ . Applying Pontryagin’s maximum principle implies maximisation of $\mathcal{H}$ , or equivalently the functional

[TABLE]

with respect to $\mathbf{u}$ subject to the constraint

[TABLE]

This results in the $K$ first-order conditions

[TABLE]

where $\alpha\geq 0$ is a Lagrange multiplier and $\mathbf{B}^{k}$ is the $k$ th column of $\mathbf{B}$ . The general solution of these equations takes the form

[TABLE]

and $Q$ is determined from the above constraint such that

[TABLE]

The value of the co-state vector $\boldsymbol{\lambda}$ is obtained as in [7] by solving the boundary value problem

[TABLE]

whose solution is

[TABLE]

This expression for $\boldsymbol{\lambda}$ is substituted into the above expression for $u_{k}$ and because, ultimately, only the optimal control input at the current time $t$ is of interest, one can set $\tau=0$ as in [7], which yields the maximum entropy control (15).
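The structure of this derivation can be checked numerically under a simplifying assumption: that the quantity maximised over $\mathbf{u}$ has the form $\sum_{k}u_{k}R_{k}+\sigma H(\mathbf{u})$, with $R_{k}$ standing in for the effective return-on-investment and $H$ the Shannon entropy. The display equations of this appendix are not reproduced here, so the objective, the symbol $R_{k}$ and all numerical values below are hypothetical. The sketch compares the Boltzmann solution with randomly sampled allocations on the simplex.

```python
import math
import random


def entropy(u):
    """Shannon entropy with natural logarithm; zero entries contribute nothing."""
    return -sum(p * math.log(p) for p in u if p > 0)


def objective(u, returns, sigma):
    """Assumed objective: expected return plus sigma times entropy."""
    return sum(p * r for p, r in zip(u, returns)) + sigma * entropy(u)


def boltzmann(returns, sigma):
    """Candidate maximiser u_k = exp(R_k / sigma) / Q."""
    r_max = max(returns)
    weights = [math.exp((r - r_max) / sigma) for r in returns]
    q = sum(weights)
    return [w / q for w in weights]


def random_simplex_point(n, rng):
    """Uniform sample on the simplex via normalised exponential draws."""
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


rng = random.Random(0)
returns = [0.30, 0.10, 0.45]  # hypothetical values
sigma = 0.2

u_star = boltzmann(returns, sigma)
best = objective(u_star, returns, sigma)
log_q = math.log(sum(math.exp(r / sigma) for r in returns))
others = [objective(random_simplex_point(3, rng), returns, sigma)
          for _ in range(1000)]
```

Under this assumption no sampled allocation exceeds the Boltzmann solution, and the maximal value equals $\sigma\ln Q$, which follows from substituting $\ln u_{k}=R_{k}/\sigma-\ln Q$ into the objective.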

Appendix B First-order correction to return-on-investment

Expressed in terms of the average zeroth-order return-on-investment

[TABLE]

the vector $\mathbf{q}^{T}\mathbf{A}$ is

[TABLE]

The first-order expansion of $\mathbf{e}^{\mathbf{A}\Delta t}$ takes the form $\mathbf{I}+\Delta t\mathbf{A}$ , where $\mathbf{I}$ is the identity matrix, and therefore the effective return-on-investment $\mathcal{R}^{k}_{\Delta t}(\mathbf{m}_{s})$ in (16) is approximated to first-order by

[TABLE]

Taking the coefficient of $x\Delta t$ and substituting for $\mathbf{q}^{T}\mathbf{A}$ gives the first-order correction to return-on-investment

[TABLE]

which can be written as

[TABLE]

using the definition of $Y_{k}(\mathbf{m}_{s})$ provided in (20).
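The accuracy of the first-order expansion $\mathbf{e}^{\mathbf{A}\Delta t}\approx\mathbf{I}+\Delta t\mathbf{A}$ used above can be illustrated with a short pure-Python sketch. The $2\times 2$ matrix below is hypothetical and unrelated to the $\mathbf{A}$ of the model; the reference exponential is computed from a truncated Taylor series, which is adequate for a small matrix with small entries.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_exp(a, dt, terms=30):
    """Taylor series for exp(a * dt), truncated after `terms` terms."""
    n = len(a)
    result = identity(n)
    term = identity(n)
    for order in range(1, terms):
        scaled = [[a[i][j] * dt / order for j in range(n)] for i in range(n)]
        term = mat_mul(term, scaled)  # now equals (a * dt)^order / order!
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def first_order(a, dt):
    """First-order approximation I + dt * a."""
    n = len(a)
    eye = identity(n)
    return [[eye[i][j] + dt * a[i][j] for j in range(n)] for i in range(n)]


def max_abs_error(a, dt):
    exact, approx = mat_exp(a, dt), first_order(a, dt)
    return max(abs(exact[i][j] - approx[i][j])
               for i in range(len(a)) for j in range(len(a)))


A = [[-0.5, 0.2], [0.1, -0.3]]  # hypothetical matrix
err_coarse = max_abs_error(A, 0.1)
err_fine = max_abs_error(A, 0.05)
ratio = err_coarse / err_fine
```

Halving $\Delta t$ reduces the error by roughly a factor of four, as expected for a truncation error of order $\Delta t^{2}$, so the first-order correction is reliable only for look-ahead times that are short relative to the time scales set by $\mathbf{A}$.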

Appendix C Expressing $\mathcal{F}(\mathbf{u})$ in terms of EFM families

Using (17) to substitute for $\mathbf{B}^{k}$ in the effective return-on-investment (16) means that $\mathcal{F}(\mathbf{u})$ takes the form

[TABLE]

The sum over $k$ in the first term can be written as

[TABLE]

where the second equality follows from the identity $\tilde{u}_{j}=u_{j}/U_{J}$ for $j\in F_{J}$ , which implies

[TABLE]

Using the composition property (24) of entropy $H(\mathbf{u})$ , one has

[TABLE]

which is (25) with

[TABLE]

the objective functional $\mathcal{F}$ restricted to the $J$ th family. Notice that the weighting (22) defines the effective return-on-investment for the $J$ th family derived directly from system (9) to be

[TABLE]

and therefore $\mathcal{F}_{J}(\mathbf{\tilde{u}}_{J})$ includes an entropic correction that is only zero when $\tilde{u}_{J}$ describes an FBA/Bang-Bang policy.
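The composition property of entropy used in this appendix can be verified numerically. The sketch below assumes the standard Shannon grouping identity $H(\mathbf{u})=H(\mathbf{U})+\sum_{J}U_{J}H(\mathbf{\tilde{u}}_{J})$ with $\tilde{u}_{j}=u_{j}/U_{J}$ for $j\in F_{J}$; the allocation and the grouping into families are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy with natural logarithm; zero entries contribute nothing."""
    return -sum(x * math.log(x) for x in p if x > 0)


def family_decomposition(u, families):
    """Return family totals U_J and within-family fractions u_j / U_J."""
    totals = [sum(u[j] for j in family) for family in families]
    within = [[u[j] / total for j in family]
              for family, total in zip(families, totals)]
    return totals, within


# Hypothetical allocation over five EFMs grouped into three families.
u = [0.10, 0.20, 0.30, 0.25, 0.15]
families = [[0, 1], [2], [3, 4]]

totals, within = family_decomposition(u, families)
lhs = entropy(u)
rhs = entropy(totals) + sum(t * entropy(w) for t, w in zip(totals, within))
```

The within-family entropy vanishes for the single-member family, which mirrors the statement that the entropic correction is zero only when $\mathbf{\tilde{u}}_{J}$ places all resource on one EFM.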

Bibliography

[1] Varma A, Palsson BØ (1994) Metabolic flux balancing: basic concepts, scientific and practical use. Nat. Biotechnol. 12: 994-998.
[2] Orth JD, Thiele I, Palsson BØ (2010) What is flux balance analysis? Nat. Biotechnol. 28: 245-248.
[3] Schuster S, Hilgetag C (1994) On elementary flux modes in biochemical reaction systems at steady state. J. Biol. Syst. 2: 165-182.
[4] Mahadevan R, Edwards JS, Doyle FJ 3rd (2002) Dynamic flux balance analysis of diauxic growth in Escherichia coli. Biophys. J. 83: 1331-1340.
[5] Provost A, Bastin G (2004) Dynamic metabolic modelling under the balanced growth condition. J. Process Control 14: 717-728.
[6] Provost A, Bastin G, Agathos SN, Schneider YJ (2006) Metabolic design of macroscopic bioreaction models: Application to Chinese hamster ovary cells. Bioprocess Biosyst. Eng. 29: 349-366.
[7] Young JD, Ramkrishna D (2007) On the matching and proportional laws of cybernetic models. Biotechnol. Prog. 23: 83-99.
[8] Young JD, Henne KL, Morgan JA, Konopka AE, Ramkrishna D (2008) Integrating cybernetic modeling with pathway analysis provides a dynamic, systems-level description of metabolic control. Biotechnol. Bioeng. 100: 542-559.
