Stochastic and Information-thermodynamic Structures of Population Dynamics in Fluctuating Environment

Tetsuya J. Kobayashi; Yuki Sughiyama

arXiv:1703.00125·q-bio.PE·July 12, 2017

TL;DR

This paper explores the fundamental stochastic and thermodynamic principles underlying how living systems adapt to fluctuating environments, revealing limits of fitness gain and conditions for optimal adaptation through derived fluctuation relations.

Contribution

It introduces causal fluctuation relations linking fitness and information, generalizes the evolutionary stable state concept, and clarifies the thermodynamic limits of adaptation in fluctuating environments.

Findings

01. Derived causal fluctuation relations of fitness and information.

02. Identified the limit of fitness gain and conditions for excess fitness.

03. Generalized the concept of evolutionary stable state for fluctuating environments.

Equations

$$e^{k(x,y)}=e^{k_{\max}(y)}\,\mathbb{T}_{K}(y|x),$$

$$\mathcal{N}_{t+1}^{\mathpzc{Y},\mathpzc{Z}}(x_{t+1})=e^{k(x_{t+1},y_{t+1})}\sum_{x_{t}\in\mathfrak{S}_{x}}\mathbb{T}_{F}(x_{t+1}|x_{t},z_{t+1})\,\mathcal{N}_{t}^{\mathpzc{Y},\mathpzc{Z}}(x_{t}).$$

$$\Psi^{s}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]:=\ln\frac{\sum_{x_{t}}\mathcal{N}_{t}^{\mathpzc{Y},\mathpzc{Z}}(x_{t})}{\sum_{x_{0}}\mathcal{N}_{0}^{\mathpzc{Y},\mathpzc{Z}}(x_{0})}=\ln\frac{\mathcal{N}_{t}^{\mathpzc{Y},\mathpzc{Z}}}{\mathcal{N}_{0}^{\mathpzc{Y},\mathpzc{Z}}}.$$

$$\left<\Psi^{s}_{t}\right>_{\mathbb{Q}}:=\left<\Psi^{s}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]\right>_{\mathbb{Q}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]}.$$

$$\mathbb{P}_{F}[\mathpzc{X}_{t}||\mathpzc{Z}_{t}]$$

$$K[\mathpzc{X}_{t},\mathpzc{Y}_{t}]:=\sum_{\tau=0}^{t-1}k(x_{\tau+1},y_{\tau+1}).$$

$$\mathcal{N}_{t}^{\mathpzc{Y},\mathpzc{Z}}[\mathpzc{X}_{t}]=e^{K[\mathpzc{X}_{t},\mathpzc{Y}_{t}]}\,\mathbb{P}_{F}[\mathpzc{X}_{t}||\mathpzc{Z}_{t}]\,\mathcal{N}_{0}^{\mathpzc{Y},\mathpzc{Z}}.$$

$$\Psi^{s}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]$$

$$\mathbb{P}_{B}^{s}[\mathpzc{X}_{t}|\mathpzc{Y}_{t},\mathpzc{Z}_{t}]$$

$$\mathbb{P}_{B}^{b}[\mathpzc{X}_{t}|\mathpzc{Y}_{t}]$$

$$K[\mathpzc{X}_{t},\mathpzc{Y}_{t}]=\Phi_{0}+\ln\frac{\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]}{\mathbb{Q}_{0}[\mathpzc{Y}_{t}]},$$

$$K[\mathpzc{X}_{t},\mathpzc{Y}_{t}]=K_{\max}[\mathpzc{Y}_{t}]+\ln\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}].$$

$$\Psi^{b}[\mathpzc{Y}_{t}]=\Phi_{0}-\ln\mathbb{Q}_{0}[\mathpzc{Y}_{t}]-\ln\frac{\mathbb{P}_{B}^{b}[\mathpzc{X}_{t}|\mathpzc{Y}_{t}]}{\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\,\mathbb{P}_{F}[\mathpzc{X}_{t}]}.$$

$$\Psi^{b}[\mathpzc{Y}_{t}]=\Psi_{0}[\mathpzc{Y}_{t}]-\ln\frac{\mathbb{P}_{B}^{b}[\mathpzc{X}_{t},\mathpzc{Y}_{t}]}{\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\,\mathbb{P}_{F}[\mathpzc{X}_{t}]},$$

$$\Psi_{0}[\mathpzc{Y}_{t}]:=\Phi_{0}+\ln\frac{\mathbb{Q}[\mathpzc{Y}_{t}]}{\mathbb{Q}_{0}[\mathpzc{Y}_{t}]}=K_{\max}[\mathpzc{Y}_{t}]+\ln\mathbb{Q}[\mathpzc{Y}_{t}].$$

$$\Psi^{s}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]=\Psi_{0}[\mathpzc{Y}_{t}]-\ln\frac{\mathbb{P}_{B}^{s}[\mathpzc{X}_{t},\mathpzc{Y}_{t},\mathpzc{Z}_{t}]}{\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\,\mathbb{P}_{F}[\mathpzc{X}_{t}||\mathpzc{Z}_{t}]\,\mathbb{Q}[\mathpzc{Z}_{t}|\mathpzc{Y}_{t}]}$$

$$\mathbb{P}_{B}^{b}[\mathpzc{X}_{t}|\mathpzc{Y}_{t}]$$

$$\mathbb{P}_{B}^{s}[\mathpzc{X}_{t}|\mathpzc{Y}_{t},\mathpzc{Z}_{t}]$$

$$\mathbb{P}_{K,F}[\mathpzc{Y}_{t}]$$

$$\mathbb{P}_{K,F}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]$$

$$e^{-(\Psi_{0}[\mathpzc{Y}_{t}]-\Psi^{b}[\mathpzc{Y}_{t}])}=\frac{\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\,\mathbb{P}_{F}[\mathpzc{X}_{t}]}{\mathbb{P}_{B}^{b}[\mathpzc{X}_{t},\mathpzc{Y}_{t}]}=\frac{\mathbb{P}_{K,F}[\mathpzc{Y}_{t}]}{\mathbb{Q}[\mathpzc{Y}_{t}]},$$

$$\left<e^{-(\Psi_{0}[\mathpzc{Y}_{t}]-\Psi^{b}[\mathpzc{Y}_{t}])}\right>_{\mathbb{Q}[\mathpzc{Y}_{t}]}=1.$$

$$\left<\Psi^{b}\right>_{\mathbb{Q}}=\left<\Psi_{0}\right>_{\mathbb{Q}}-\mathcal{D}_{loss}^{b},$$

$$\mathcal{D}_{loss}^{b}=\mathcal{D}\Big[\mathbb{Q}[\mathpzc{Y}_{t}]\Big|\Big|\mathbb{P}_{K,F}[\mathpzc{Y}_{t}]\Big]$$

$$\left<\Psi_{0}\right>_{\mathbb{Q}}\geq\max_{\{\mathbb{T}_{F},\mathbb{T}_{K}\}}\left<\Psi^{b}\right>_{\mathbb{Q}},$$

$$\left<\Psi_{0}\right>_{\mathbb{Q}}=\left<K_{\max}\right>_{\mathbb{Q}}-\mathcal{S}[\mathbb{Q}]=\Phi_{0}+\mathcal{D}\Big[\mathbb{Q}\Big|\Big|\mathbb{Q}_{0}\Big],$$

$$\Phi_{0}=\min_{\mathbb{Q}}\left<\Psi_{0}[\mathpzc{Y}_{t}]\right>_{\mathbb{Q}[\mathpzc{Y}_{t}]}\geq\min_{\mathbb{Q}}\max_{\{\mathbb{T}_{F},\mathbb{T}_{K}\}}\left<\Psi^{b}[\mathpzc{Y}_{t}]\right>_{\mathbb{Q}[\mathpzc{Y}_{t}]},$$

$$\min_{\mathbb{Q}}\left<\Psi_{0}[\mathpzc{Y}_{t}]\right>_{\mathbb{Q}[\mathpzc{Y}_{t}]}=\Phi_{0}+\min_{\mathbb{Q}}\mathcal{D}\Big[\mathbb{Q}\Big|\Big|\mathbb{Q}_{0}\Big]=\Phi_{0}.$$

$$\{\mathbb{T}_{F}^{\dagger},\mathbb{T}_{K}^{\dagger}\}$$

$$\left<\Psi_{0}\right>_{\mathbb{Q}^{\dagger}}=\left<\Psi^{b}[\mathpzc{Y}_{t}]\right>_{\mathbb{Q}^{\dagger}[\mathpzc{Y}_{t}]}\geq\max_{\{\mathbb{T}_{F}^{\prime},\mathbb{T}_{K}^{\prime}\}}\left<{\Psi^{b}}^{\prime}[\mathpzc{Y}_{t}]\right>_{\mathbb{Q}^{\dagger}[\mathpzc{Y}_{t}]},$$

Full text

Stochastic and Information-thermodynamic Structures of Population Dynamics in Fluctuating Environment

Tetsuya J. Kobayashi

http://research.crmind.net Institute of Industrial Science, the University of Tokyo.

PRESTO, JST.

Yuki Sughiyama

Institute of Industrial Science, the University of Tokyo.

Abstract

Adaptation in a fluctuating environment is a process of fueling environmental information to gain fitness. Living systems have gradually developed strategies for adaptation from random and passive diversification of the phenotype to more proactive decision making, in which environmental information is sensed and exploited more actively and effectively. Understanding the fundamental relation between fitness and information is therefore crucial to clarify the limits and universal properties of adaptation. In this work, we elucidate the underlying stochastic and information-thermodynamic structure in this process, by deriving causal fluctuation relations (FRs) of fitness and information. Combined with a duality between phenotypic and environmental dynamics, the FRs reveal the limit of fitness gain, the relation of time reversibility with the achievability of the limit, and the possibility and condition for gaining excess fitness due to environmental fluctuation. The loss of fitness due to causal constraints and the limited capacity of real organisms is shown to be the difference between time-forward and time-backward path probabilities of phenotypic and environmental dynamics. Furthermore, the FRs generalize the concept of evolutionary stable state (ESS) for fluctuating environment by giving the probability that the optimal strategy on average can be invaded by a suboptimal one owing to rare environmental fluctuation. These results clarify the information thermodynamic structures in adaptation and evolution.

Fluctuation theorem; Evolution; Decision making; Bet-hedging; Fitness; Variational structure

I Introduction

I.1 Adaptation in fluctuating environment

Adaptation is fundamental to all organisms for their survival and evolutionary success in a changing environment. In the course of evolution, living systems have gradually attained and developed more active and efficient strategies for adaptation, which generally accompany more effective use of environmental information. Understanding how the use of information is linked to the efficiency of adaptation is crucial to clarify the fundamental limits and universal properties of biological adaptations Chevin, Lande, and Mace (2010); Frank (2011a).

The most primitive strategy for adaptation is to randomly generate genetic and phenotypic heterogeneity in a population Slatkin (1974); Philippi and Seger (1989); Balaban et al. (2004); Wakamoto et al. (2013). Provided that sufficiently large heterogeneity is constantly generated in the population, a fraction of organisms can, by chance, have types adaptive to the upcoming environmental state and thereby prevent extinction of the population, at the cost of others with non-adaptive types Cohen (1966, 1971). Such a strategy is known as bet-hedging or phenotypic diversification and works even if the organisms are completely blind to the environment, without any a priori knowledge of its dynamics. Bet-hedging is a passive and a posteriori adaptation in the sense that the adaptation is achieved extrinsically by and after the impact of environmental selection Kobayashi and Sughiyama (2015). The evolutionary advantage of the bet-hedging strategy is demonstrated by the persistence of bacteria, pathogens, and cancer cells under antibiotic or anticancer drug treatments Balaban et al. (2004); Dhar and McKinney (2007); Beaumont et al. (2009); Meacham and Morrison (2013); Wakamoto et al. (2013). The gain of fitness by bet-hedging can be optimized if the population evolves to generate an appropriate pattern of heterogeneity by learning the environmental statistics Xue and Leibler (2016). Nevertheless, the gain of fitness by bet-hedging is fundamentally limited because of the passive and a posteriori nature of the strategy, in which the individual organisms have no access to the microscopic information of which environmental states will actually be realized.

With access to such information, the loss can be reduced further by decision making: directly sensing the current environmental state, predicting the upcoming state, and switching into the phenotypic state that is adaptive to that state Perkins and Swain (2009); Ben-Jacob and Schultz (2010); Kobayashi and Kamimura (2012); Brennan, Cheong, and Levchenko (2012). The strategy of adaptation via sensing is active and a priori in the sense that adaptation is intrinsically achieved by the predictive actions of the organisms Kobayashi and Sughiyama (2015). In biologically relevant situations, both passive and active aspects of adaptation are intermingled because perfect sensing and prediction of the environment are impossible with the limited capacities of biological systems.

I.2 Notions of information and analogy with physics in biological adaptation

At an analogical level, the problem of the fundamental law and the limits of adaptation and evolution shares several aspects with physics, especially with thermodynamics, which has driven long-lasting attempts to establish the thermodynamics of biological adaptation and evolution Lotka (1922a, b); Iwasa (1988); de Vladar and Barton (2011); Frank (2012a); Qian (2014); Mustonen and Lässig (2010). Among other areas, the fundamental limit of fitness in a changing environment and the value of environmental information have been a major focus in evolutionary biology Levins (1965, 1968); Cohen (1966); Chevin, Lande, and Mace (2010); Frank (2011a); Rivoire (2015). Haccou and Iwasa may have been the first to link, albeit implicitly, environmental information with the gain of fitness in a stochastic environment Haccou and Iwasa (1995). Bergstrom and Lachmann pursued the fitness value of information by directly incorporating mutual information Ber (2004); Donaldson-Matasci, Bergstrom, and Lachmann (2010); Cover and Thomas (2012). Others also pointed out some quantitative relations between fitness and information measures such as relative entropy and Jeffreys’ divergence Kussell and Leibler (2005); Frank (2012b); Pugatch, Barkai, and Tlusty (2013). More recently, Rivoire and Leibler conducted a comprehensive analysis by employing an analogy between bet-hedging of organisms and horse race gambling Rivoire and Leibler (2011), the link of which to information theory was revealed in the seminal work by Kelly in 1956 Kelly (1956). However, all previous works either imposed certain restrictions on their models to derive the information-theoretic measures of fitness value Kussell and Leibler (2005); Rivoire and Leibler (2011, 2014) or had to introduce phenomenological measures for the value of information to accommodate more general situations Rivoire and Leibler (2011, 2014), because they lacked an appropriate method to handle the mixture of the passive and active aspects in adaptation.

We recently resolved this problem Kobayashi and Sughiyama (2015) by combining a path integral formulation of population dynamics Leibler and Kussell (2010); Bianconi and Rahmede (2012); Wakamoto, Grosberg, and Kussell (2012); Sughiyama et al. (2015), a retrospective characterization of the selected population Baake and Georgii (2006); Wakamoto, Grosberg, and Kussell (2012); Lambert and Kussell (2015), and a variational structure in population dynamics Demetrius and Gundlach (2014); Sughiyama et al. (2015). The results we obtained generalized the limits of fitness gain by sensing and revealed that the gain satisfies fluctuation relations (FRs) that fundamentally constrain not only its average but also its fluctuation. These relations, alongside a previous work in the line of neutral theory Mustonen and Lässig (2010), imply that fitness in a fluctuating environment shares, at least mathematically, similar structures to those of stochastic and information thermodynamics Seifert (2012); Sagawa (2012). In our FRs, the fluctuation of fitness of a given population is evaluated by the difference from the fitness that achieves the maximum average fitness over all possible phenotypic histories of organisms. Conceptually, this means that we postulate a Darwinian demon, an imaginary organism that can exhibit any type of behavior, unconstrained by either biological capacity or the causality of dynamics. The FRs characterize the loss of fitness of a realistic organism relative to such an idealized organism. Thus, understanding the properties of the Darwinian demon and the deviation of a realistic organism from it is central to a deeper understanding of the behavior of populations in a changing environment. However, the implicit definition of the demon as the maximizer of the average fitness hampers the explicit characterization of the demon and obscures the formal link to stochastic thermodynamics, in which a variational characterization is not common Seifert (2012); Sagawa (2012). More practically, without an explicit characterization, we are unable to simulate possible behaviors of the demon even numerically.

I.3 Outline of main results

In this paper, we resolve these problems by deriving FRs of fitness without using the variational approach. To this end, we first formulate and generalize the problem of adaptation in a changing environment so that individual organisms can change not only their strategy of switching phenotypic states but also the strategy of allocating metabolic resources to each phenotypic state (Sec. II). Combined with the path integral formulation of population dynamics, this generalization enables us to obtain a decomposition of fitness with a combination of time-forward (chronological) and time-backward (retrospective) path probabilities (Sec. III). The decomposition naturally spells out an explicit representation of the upper bound of the average fitness, which was implicitly defined in our previous work Kobayashi and Sughiyama (2015).

For the bet-hedging problem without sensing of the environment (Sec. IV), the decomposition directly leads to FRs of the fitness loss, which have a form very similar to that of the FRs of entropy production in stochastic thermodynamics Seifert (2012). After numerically verifying the derived FRs (Sec. IV.2), we investigate the biological meanings and achievability of the FRs (Sec. V). The average FR is related to the evolutionary stable state (ESS), under which the strategy with maximal average fitness cannot be invaded by any other strategy. The detailed and integral FRs generalize the ESS by giving the probability that a suboptimal strategy outperforms the optimal one within a finite time interval owing to rare fluctuation of the environment King and Masel (2007). By using a dualistic relation between phenotypic and environmental dynamics, the detailed FR is shown to be represented as the ratio of the path probability of the actual environment and that of the conjugate environment under which a given strategy of the organisms becomes optimal. The duality also clarifies that the average loss of fitness is directly related to the imperfectness of the adaptive behavior of the organisms, originating both from physical constraints and from the suboptimality of the behaviors.

The introduction of a sensing signal extends the FRs to accommodate the mutual information between the environment and the signal as the gain of fitness by sensing, in the same manner as mutual information bounds the negative gain of entropy production in information thermodynamics Sagawa (2012) (Sec. VI). Although the extended FRs cover very general situations, the FRs are not tight and therefore the mutual information overestimates the fitness gained by sensing. By explicitly assuming a causal relation between the environment and the signal, the FRs are further modified to involve the directed information as a tighter bound of fitness gain (Sec. VII). This modification clarifies how the loss of fitness from the upper bound is related to the causality, the inaccessibility of perfect information about the environment, and the imperfect implementation of information processing. Finally, three quantities are introduced to account for the fitness loss due to inappropriate sensing and the imperfectness of metabolic allocation and phenotypic switching strategies in general situations (Sec. VII.3). The summary and future directions are described in the Discussion (Sec. VIII).

II Modeling adaptation of population in changing environment

Let $x_{t}\in\mathfrak{S}_{x}$, $y_{t}\in\mathfrak{S}_{y}$, and $z_{t}\in\mathfrak{S}_{z}$ be the phenotype of a living organism, the state of the environment, and the state of the sensing signal at time $t$, respectively (Fig. 1 (A)). For simplicity, possible phenotypic, environmental, and signal states are assumed to be discrete as in references Donaldson-Matasci, Bergstrom, and Lachmann (2010); Rivoire and Leibler (2011); Pugatch, Barkai, and Tlusty (2013); Kobayashi and Sughiyama (2015). The paths (histories) of the states up to time $t$ are defined as $\mathpzc{X}_{t}:=\{x_{\tau}|\tau\in[0,t]\}\in\mathfrak{S}_{\mathpzc{X}}:=\mathfrak{S}_{x}^{\times(t+1)}$, $\mathpzc{Y}_{t}:=\{y_{\tau}|\tau\in[0,t]\}\in\mathfrak{S}_{\mathpzc{Y}}:=\mathfrak{S}_{y}^{\times(t+1)}$, and $\mathpzc{Z}_{t}:=\{z_{\tau}|\tau\in[0,t]\}\in\mathfrak{S}_{\mathpzc{Z}}:=\mathfrak{S}_{z}^{\times(t+1)}$, respectively. Time is also treated as discrete in this work.

II.1 Modeling phenotype switching

The phenotype of an organism, in general, switches stochastically over time, depending on its past phenotypic state and the sensed signal (Fig. 1 (A) and (B)). The switching dynamics is modeled, for example, by a Markov transition probability $\mathbb{T}_{F}(x_{t+1}|x_{t},z_{t+1})$, which satisfies $\sum_{x_{t+1}}\mathbb{T}_{F}(x_{t+1}|x_{t},z_{t+1})=1$ for all $x_{t}$ and $z_{t+1}$. Although we mainly focus on Markov switching, our result can be extended to causal switching $\mathbb{T}_{F}(x_{t+1}|\mathpzc{X}_{t},\mathpzc{Z}_{t+1})$, in which the next phenotypic state $x_{t+1}$ depends on both the past phenotypic history $\mathpzc{X}_{t}$ and the signal history $\mathpzc{Z}_{t+1}$.
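As a minimal sketch of the Markov switching described above, the following Python fragment stores a transition table, checks the normalization $\sum_{x_{t+1}}\mathbb{T}_{F}(x_{t+1}|x_{t},z_{t+1})=1$, and samples one phenotypic history for a given signal history. The two-state labels, the numerical entries of `T_F`, and the helper names are assumptions chosen for illustration; they are not taken from the paper.

```python
import random

S_x = (0, 1)  # phenotypic states (hypothetical labels)
S_z = (0, 1)  # signal states (hypothetical labels)

# T_F[z_next][x_prev][x_next]: probability of switching to x_next given the
# previous phenotype x_prev and the newly sensed signal z_next.
# The numbers are illustrative assumptions.
T_F = {
    0: {0: [0.9, 0.1], 1: [0.7, 0.3]},
    1: {0: [0.3, 0.7], 1: [0.1, 0.9]},
}

def is_normalized(table, tol=1e-12):
    """Check that sum over x_next of T_F(x_next | x_prev, z_next) is 1."""
    return all(abs(sum(table[z][x]) - 1.0) < tol for z in S_z for x in S_x)

def sample_phenotype_path(x0, signals, rng):
    """Sample x_0, ..., x_t given z_1, ..., z_t (one Markov step per signal)."""
    path = [x0]
    for z in signals:
        weights = T_F[z][path[-1]]
        path.append(rng.choices(S_x, weights=weights)[0])
    return path

rng = random.Random(0)  # fixed seed so that the sketch is reproducible
path = sample_phenotype_path(0, [0, 0, 1, 1, 0], rng)
```

The history has one more entry than the signal sequence because $x_{0}$ is given and each signal $z_{\tau+1}$ drives one transition.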

II.2 Modeling metabolic resource allocation

Next, we model the strategy of metabolic resource allocation of the organisms. Each organism is assumed to duplicate asexually to produce $e^{k}$ daughter organisms on average. For each state of the environment $y$, the organisms have a maximum replication rate, $e^{k_{\max}(y)}$, that can be achieved only when the organism allocates all its metabolic resources to adapting to that environmental state alone. Because the environment changes, however, the organism usually distributes its resources over different environmental conditions Giordano et al. (2016).

To represent this situation, we introduce a conditional probability $\mathbb{T}_{K}(y|x)$ that quantitatively represents the fraction of resources allocated to the environmental state $y$ in a phenotype $x$ . An instantaneous replication rate $k(x,y)$ of the phenotype $x$ under the environmental state $y$ is then assumed to be represented as

$$e^{k(x,y)}=e^{k_{\max}(y)}\,\mathbb{T}_{K}(y|x)\tag{1}$$

(Fig. 1 (C)). Note that such a decomposition of $e^{k(x,y)}$ dates back at least to Haccou and Iwasa Haccou and Iwasa (1995). This relation between the resource allocation strategy and the replication rate may appear to be restrictive because of the linear relation between the allocation strategy $\mathbb{T}_{K}(y|x)$ and the actual replication rate $e^{k(x,y)}$. Nevertheless, this decomposition of $k(x,y)$ is general enough because, for a given $k(x,y)$, we can find a pair of $e^{k_{\max}(y)}$ and $\mathbb{T}_{K}(y|x)$ as long as the possible phenotypic states are fewer than those of the environment (refer to Appendix A). Such a situation is biologically plausible because the environment is usually more complex than the phenotype of an organism. Moreover, our setting includes the special situation that has been intensively investigated in previous works Kussell and Leibler (2005); Rivoire and Leibler (2011). For example, when the numbers of phenotypic and environmental states are equal as $\#\mathfrak{S}_{x}=\#\mathfrak{S}_{y}$, $\mathbb{T}_{K}(y|x)=\delta_{x,y}$ corresponds to Kelly’s horse race gambling Kelly (1956); Rivoire and Leibler (2011), in which each phenotypic state allocates all metabolic resources to a certain environmental state and, as a result, can survive and grow only when the phenotypic state matches the realized environmental state.
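The relation $e^{k(x,y)}=e^{k_{\max}(y)}\mathbb{T}_{K}(y|x)$ can be illustrated with a short Python sketch that builds replication factors from an allocation table and contrasts a mixed allocation with the Kelly limit $\mathbb{T}_{K}(y|x)=\delta_{x,y}$. The values of `k_max` and both allocation tables are hypothetical, as are the function names.

```python
import math

# Hypothetical values for illustration only.
k_max = [math.log(3.0), math.log(2.0)]  # k_max(y) for y = 0, 1

# T_K[x][y]: fraction of resources that phenotype x allocates to environment y.
T_K_mixed = [[0.9, 0.1],
             [0.2, 0.8]]
T_K_kelly = [[1.0, 0.0],   # Kelly limit: T_K(y|x) = delta_{x,y}
             [0.0, 1.0]]

def replication_factor(x, y, T_K):
    """Return e^{k(x,y)} = e^{k_max(y)} * T_K(y|x)."""
    return math.exp(k_max[y]) * T_K[x][y]

def log_replication_rate(x, y, T_K):
    """Return k(x,y); minus infinity when no resource is allocated to y."""
    factor = replication_factor(x, y, T_K)
    return math.log(factor) if factor > 0.0 else float("-inf")
```

In the Kelly limit a mismatched phenotype has replication factor zero, so `log_replication_rate(0, 1, T_K_kelly)` is minus infinity, which corresponds to survival only when phenotype and environment match.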

II.3 Modeling sensing processes

We finally model the sensing process. We consider the case that organisms can obtain a sensing signal $z_{t}$, which correlates with the environment $y_{t}$. The sensed signal $z_{t}$ is assumed to be common to all the organisms in the population (Fig. 1 (A)). A biologically relevant situation is that $z_{t}$ is a vector of concentrations of extracellular signaling molecules that cannot be consumed as metabolites but correlate with the available metabolites. Another situation is that $z_{t}$ is a subset of $y_{t}$ for which the organisms have sensors. In either situation, the sensing noise should be negligibly small because all the organisms receive the same sensing signal $z_{t}$. Even though sensing of a common signal cannot cover all biologically realistic situations, such as individual sensing with noisy receptors Perkins and Swain (2009); Ben-Jacob and Schultz (2010); Kobayashi (2010); Kobayashi and Kamimura (2012); Brennan, Cheong, and Levchenko (2012); Barato, Hartich, and Seifert (2014), the common signal has been investigated in various works Haccou and Iwasa (1995); Rivoire and Leibler (2011); Kobayashi and Sughiyama (2015). In this work, we mainly focus on the common sensing problem and touch on individual sensing in the Discussion.

II.4 Population dynamics of organisms

By combining the phenotype switching, the metabolic allocation, and the sensing strategies, we can explicitly derive the dynamics of the population of the organisms. Because both environmental and sensing histories are external factors of the organisms, the population dynamics of the organisms is described for a given pair of environmental and signaling histories, $\mathpzc{Y}_{t}$ and $\mathpzc{Z}_{t}$. Let $\mathcal{N}_{t}^{\mathpzc{Y},\mathpzc{Z}}(x_{t})\in\mathbb{R}_{\geq 0}$ be the number of organisms whose phenotypic state is $x_{t}$ at time $t$ under the realization of environmental and signal histories $\mathpzc{Y}_{t}$ and $\mathpzc{Z}_{t}$. The population size of the organisms is assumed to be sufficiently large so that $\mathcal{N}_{t}^{\mathpzc{Y},\mathpzc{Z}}(x_{t})$ can be well approximated as a continuous variable. Because the large population size enables us to effectively ignore the demographic fluctuation due to the finite number of organisms, we can obtain the update rule Kobayashi and Sughiyama (2015) of $\mathcal{N}_{t}^{\mathpzc{Y},\mathpzc{Z}}(x_{t})$ as

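$$\mathcal{N}_{t+1}^{\mathpzc{Y},\mathpzc{Z}}(x_{t+1})=e^{k(x_{t+1},y_{t+1})}\sum_{x_{t}\in\mathfrak{S}_{x}}\mathbb{T}_{F}(x_{t+1}|x_{t},z_{t+1})\,\mathcal{N}_{t}^{\mathpzc{Y},\mathpzc{Z}}(x_{t}).$$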

If the population is not sufficiently large, it must instead be modeled by, e.g., a branching process. The statistical properties of the environmental and signal histories are generally characterized by a joint path probability $\mathbb{Q}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]$.
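The large-population update rule, in which organisms first switch phenotype according to $\mathbb{T}_{F}(x_{t+1}|x_{t},z_{t+1})$ and then replicate by the factor $e^{k(x_{t+1},y_{t+1})}$, can be sketched as a single function. The replication rates, the switching rule (favoring the phenotype that matches the signal), and the initial population below are illustrative assumptions.

```python
import math

S_x = (0, 1)  # phenotypic states (hypothetical labels)

# k[x][y]: instantaneous replication rate of phenotype x in environment y
# (illustrative values).
k = [[math.log(2.7), math.log(0.2)],
     [math.log(0.6), math.log(1.6)]]

def T_F(x_next, x_prev, z_next):
    """Assumed switching rule: move to the phenotype matching the signal
    with probability 0.8, regardless of the previous phenotype."""
    return 0.8 if x_next == z_next else 0.2

def step(N, y_next, z_next):
    """One update: N_{t+1}(x') = e^{k(x', y_{t+1})} * sum_x T_F(x'|x, z_{t+1}) N_t(x)."""
    return [
        math.exp(k[x1][y_next]) * sum(T_F(x1, x0, z_next) * N[x0] for x0 in S_x)
        for x1 in S_x
    ]

# Usage: iterate over y_1..y_t and z_1..z_t from an initial population N_0.
N = [50.0, 50.0]
for y, z in zip([0, 1, 1], [0, 1, 0]):
    N = step(N, y, z)
```

Because the rule is linear in the population vector, rescaling the initial population rescales every later population by the same factor.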

III Path-wise formulation and fitness decomposition

By using $\mathcal{N}_{t}^{\mathpzc{Y},\mathpzc{Z}}(x_{t})$ in the previous section, we define the fitness of the population and derive its path integral formulation. The cumulative fitness $\Psi^{s}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]$ of the population at $t$ under environmental and signal histories $\mathpzc{Y}_{t}$ and $\mathpzc{Z}_{t}$ is defined by the exponential expansion of the total population size as follows:

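$$\Psi^{s}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]:=\ln\frac{\sum_{x_{t}}\mathcal{N}_{t}^{\mathpzc{Y},\mathpzc{Z}}(x_{t})}{\sum_{x_{0}}\mathcal{N}_{0}^{\mathpzc{Y},\mathpzc{Z}}(x_{0})}=\ln\frac{\mathcal{N}_{t}^{\mathpzc{Y},\mathpzc{Z}}}{\mathcal{N}_{0}^{\mathpzc{Y},\mathpzc{Z}}}.$$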

We here define $\mathcal{N}^{\mathpzc{Y},\mathpzc{Z}}_{t}:=\sum_{x_{t}}\mathcal{N}^{\mathpzc{Y},\mathpzc{Z}}_{t}(x_{t})$ . The ensemble average of the cumulative fitness for different realizations of the environmental and signal histories is represented as

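$$\left<\Psi^{s}_{t}\right>_{\mathbb{Q}}:=\left<\Psi^{s}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]\right>_{\mathbb{Q}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]}.$$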
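A minimal Python sketch of these two definitions follows: the cumulative fitness is the log ratio of final to initial total population for one pair of histories, and its ensemble average is taken by exact enumeration over short histories. The i.i.d. environment `q`, the signal model `r`, the growth factors, and the switching rule are assumptions made for illustration, and the histories are indexed from $\tau=1$ because $y_{0}$ and $z_{0}$ do not enter the update rule.

```python
import itertools
import math

S_x, S_y, S_z = (0, 1), (0, 1), (0, 1)
growth = [[2.7, 0.2], [0.6, 1.6]]  # e^{k(x,y)}, illustrative values
q = [0.6, 0.4]                     # assumed i.i.d. environment: Q(y)
r = [[0.9, 0.1], [0.1, 0.9]]       # assumed signal model: r[y][z] = Q(z|y)

def T_F(x_next, x_prev, z_next):
    """Assumed switching rule favoring the phenotype that matches the signal."""
    return 0.8 if x_next == z_next else 0.2

def cumulative_fitness(N0, ys, zs):
    """Psi^s = ln(N_t / N_0) for histories y_1..y_t and z_1..z_t."""
    N = list(N0)
    for y, z in zip(ys, zs):
        N = [growth[x1][y] * sum(T_F(x1, x0, z) * N[x0] for x0 in S_x)
             for x1 in S_x]
    return math.log(sum(N) / sum(N0))

def average_fitness(N0, t):
    """<Psi^s_t>_Q by enumerating all environmental and signal histories."""
    total = 0.0
    for ys in itertools.product(S_y, repeat=t):
        for zs in itertools.product(S_z, repeat=t):
            prob = math.prod(q[y] * r[y][z] for y, z in zip(ys, zs))
            total += prob * cumulative_fitness(N0, ys, zs)
    return total
```

Exact enumeration grows exponentially in $t$, so for longer histories one would presumably replace it by Monte Carlo sampling of the histories.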

III.1 Path-wise and retrospective formulation

As derived in Leibler and Kussell (2010); Sughiyama et al. (2015); Kobayashi and Sughiyama (2015), the cumulative fitness at time $t$ can be represented with a path-wise (path integral) formulation. Let us first define the time-forward path probability of the phenotype switching as

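$$\mathbb{P}_{F}[\mathpzc{X}_{t}||\mathpzc{Z}_{t}]:=\prod_{\tau=0}^{t-1}\mathbb{T}_{F}(x_{\tau+1}|x_{\tau},z_{\tau+1})\,p(x_{0}),$$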

where $p(x_{0}):=\mathcal{N}_{0}^{\mathpzc{Y},\mathpzc{Z}}(x_{0})/\sum_{x_{0}}\mathcal{N}_{0}^{\mathpzc{Y},\mathpzc{Z}}(x_{0})$ . We here use Kramer’s causal conditioning $||$ rather than the usual conditioning $|$ in order to indicate that the path probability $\mathbb{P}_{F}[\mathpzc{X}_{t}||\mathpzc{Z}_{t}]$ is causally generated by the Markov transition matrix $\mathbb{T}_{F}(x_{t+1}|x_{t},z_{t+1})$ , which depends only on the past phenotypic state $x_{t}$ and the signal $z_{t+1}$ Kramer (1998); Permuter, Kim, and Weissman (2011). The single bar $|$ is also used for the normal conditioning of a path probability that does not necessarily satisfy the causal relation between conditioning and conditioned histories. We also define the path-wise (historical) fitness of a phenotypic history $\mathpzc{X}_{t}$ under an environmental history $\mathpzc{Y}_{t}$ as

$$K[\mathpzc{X}_{t},\mathpzc{Y}_{t}]:=\sum_{\tau=0}^{t-1}k(x_{\tau+1},y_{\tau+1}),$$

where $K[\mathpzc{X}_{t},\mathpzc{Y}_{t}]$ is defined over all $\{\mathpzc{X}_{t},\mathpzc{Y}_{t}\}\in\mathfrak{S}_{\mathpzc{X}}\times\mathfrak{S}_{\mathpzc{Y}}$. With these path-wise quantities Kobayashi and Sughiyama (2015), we obtain the population size of the organisms at time $t$, the past phenotypic history of which is $\mathpzc{X}_{t}$, as

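$$\mathcal{N}_{t}^{\mathpzc{Y},\mathpzc{Z}}[\mathpzc{X}_{t}]=e^{K[\mathpzc{X}_{t},\mathpzc{Y}_{t}]}\,\mathbb{P}_{F}[\mathpzc{X}_{t}||\mathpzc{Z}_{t}]\,\mathcal{N}_{0}^{\mathpzc{Y},\mathpzc{Z}}.$$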

Because $\mathcal{N}^{\mathpzc{Y},\mathpzc{Z}}_{t}=\sum_{\mathpzc{X}_{t}}\mathcal{N}_{t}^{\mathpzc{Y}_{t},\mathpzc{Z}_{t}}[\mathpzc{X}_{t}]$ , the cumulative fitness with sensing $\Psi^{s}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]$ is explicitly described as

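$$\Psi^{s}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]=\ln\left<e^{K[\mathpzc{X}_{t},\mathpzc{Y}_{t}]}\right>_{\mathbb{P}_{F}[\mathpzc{X}_{t}||\mathpzc{Z}_{t}]}.$$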
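The path-wise representation states that the cumulative fitness is the logarithm of $e^{K[\mathpzc{X}_{t},\mathpzc{Y}_{t}]}$ averaged over phenotypic histories drawn from $\mathbb{P}_{F}[\mathpzc{X}_{t}||\mathpzc{Z}_{t}]$. The following sketch checks this numerically on a small example by comparing a brute-force sum over all phenotypic histories with the step-by-step recursion. All numerical values and function names are illustrative assumptions.

```python
import itertools
import math

S_x = (0, 1)
k = [[math.log(2.7), math.log(0.2)],   # k[x][y], illustrative values
     [math.log(0.6), math.log(1.6)]]
p0 = [0.5, 0.5]                        # initial phenotype distribution p(x_0)

def T_F(x_next, x_prev, z_next):
    """Assumed switching rule favoring the phenotype that matches the signal."""
    return 0.8 if x_next == z_next else 0.2

def path_probability(xs, zs):
    """P_F[X_t || Z_t] = p(x_0) * prod_tau T_F(x_{tau+1} | x_tau, z_{tau+1}).
    xs holds x_0..x_t and zs holds z_1..z_t."""
    prob = p0[xs[0]]
    for tau, z in enumerate(zs):
        prob *= T_F(xs[tau + 1], xs[tau], z)
    return prob

def historical_fitness(xs, ys):
    """K[X_t, Y_t] = sum_tau k(x_{tau+1}, y_{tau+1}); ys holds y_1..y_t."""
    return sum(k[xs[tau + 1]][y] for tau, y in enumerate(ys))

def fitness_by_paths(ys, zs):
    """ln of the average of e^K over all phenotypic histories."""
    t = len(ys)
    return math.log(sum(
        math.exp(historical_fitness(xs, ys)) * path_probability(xs, zs)
        for xs in itertools.product(S_x, repeat=t + 1)))

def fitness_by_recursion(ys, zs):
    """ln(N_t / N_0) from the population update rule, with N_0 normalized to 1."""
    N = list(p0)
    for y, z in zip(ys, zs):
        N = [math.exp(k[x1][y]) * sum(T_F(x1, x0, z) * N[x0] for x0 in S_x)
             for x1 in S_x]
    return math.log(sum(N))
```

For any pair of histories of equal length the two functions should agree up to floating-point error, for instance with `ys = (0, 1, 1)` and `zs = (0, 1, 0)`.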

From this representation, the fitness can be described variationally Sughiyama et al. (2015); Kobayashi and Sughiyama (2015) as

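$$\Psi^{s}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]=\max_{\mathbb{P}[\mathpzc{X}_{t}]}\left[\left<K[\mathpzc{X}_{t},\mathpzc{Y}_{t}]\right>_{\mathbb{P}[\mathpzc{X}_{t}]}-\mathcal{D}\Big[\mathbb{P}[\mathpzc{X}_{t}]\Big|\Big|\mathbb{P}_{F}[\mathpzc{X}_{t}||\mathpzc{Z}_{t}]\Big]\right],$$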

where $\mathcal{D}[\mathbb{P}||\mathbb{P}^{\prime}]:=\sum_{\mathpzc{X}_{t}}\mathbb{P}[\mathpzc{X}_{t}]\ln\frac{\mathbb{P}[\mathpzc{X}_{t}]}{\mathbb{P}^{\prime}[\mathpzc{X}_{t}]}$ is the Kullback-Leibler divergence (relative entropy) Kullback and Leibler (1951); Cover and Thomas (2012) between two path measures $\mathbb{P}$ and $\mathbb{P}^{\prime}$. It should be noted that both the KL divergence and Kramer’s causal conditioning use the double bar $||$ but their meanings are different. The maximization is achieved with the time-backward retrospective path probability defined by

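$$\mathbb{P}_{B}^{s}[\mathpzc{X}_{t}|\mathpzc{Y}_{t},\mathpzc{Z}_{t}]:=e^{K[\mathpzc{X}_{t},\mathpzc{Y}_{t}]-\Psi^{s}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]}\,\mathbb{P}_{F}[\mathpzc{X}_{t}||\mathpzc{Z}_{t}]=\frac{\mathcal{N}_{t}^{\mathpzc{Y},\mathpzc{Z}}[\mathpzc{X}_{t}]}{\mathcal{N}_{t}^{\mathpzc{Y},\mathpzc{Z}}}.\tag{9}$$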

If the phenotypic switching does not depend on the sensing signal as $\mathbb{P}_{F}[\mathpzc{X}_{t}||\mathpzc{Z}_{t}]=\mathbb{P}_{F}[\mathpzc{X}_{t}]$ , which corresponds to the bet-hedging by random phenotypic switching, $\mathbb{P}_{B}^{s}$ is reduced to

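$$\mathbb{P}_{B}^{b}[\mathpzc{X}_{t}|\mathpzc{Y}_{t}]:=e^{K[\mathpzc{X}_{t},\mathpzc{Y}_{t}]-\Psi^{b}[\mathpzc{Y}_{t}]}\,\mathbb{P}_{F}[\mathpzc{X}_{t}],\tag{10}$$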

where the superscript $b$ denotes bet-hedging and $\Psi^{b}[\mathpzc{Y}_{t}]:=\ln\left<e^{K[\mathpzc{X}_{t},\mathpzc{Y}_{t}]}\right>_{\mathbb{P}_{F}[\mathpzc{X}_{t}]}$. Because $\mathcal{N}_{t}^{\mathpzc{Y},\mathpzc{Z}}[\mathpzc{X}_{t}]$ is the number of organisms with phenotypic history $\mathpzc{X}_{t}$ in the population at time $t$, the second equality in eq. (9) indicates that $\mathbb{P}^{s}_{B}[\mathpzc{X}_{t}|\mathpzc{Y}_{t},\mathpzc{Z}_{t}]$ is the fraction of the organisms that have phenotypic history $\mathpzc{X}_{t}$. This property leads to an interpretation of $\mathbb{P}^{s}_{B}[\mathpzc{X}_{t}|\mathpzc{Y}_{t},\mathpzc{Z}_{t}]$ as the probability of observing a certain phenotypic history $\mathpzc{X}_{t}$ under a realization of the environmental and signal histories $\mathpzc{Y}_{t}$ and $\mathpzc{Z}_{t}$ when we sample an organism in the population at time $t$ and track its phenotypic history retrospectively, in a time-backward manner Baake and Georgii (2006); Wakamoto, Grosberg, and Kussell (2012); Sughiyama et al. (2015); Lambert and Kussell (2015); Kobayashi and Sughiyama (2015). Because the organisms grow more if their phenotypic histories are more adaptive than others for the given environmental realization $\mathpzc{Y}_{t}$, the chance to observe a certain phenotypic history $\mathpzc{X}_{t}$ after selection under $\mathpzc{Y}_{t}$ is biased to $\mathbb{P}^{s}_{B}[\mathpzc{X}_{t}|\mathpzc{Y}_{t},\mathpzc{Z}_{t}]$ from the probability $\mathbb{P}_{F}[\mathpzc{X}_{t}||\mathpzc{Z}_{t}]$ with which the same phenotypic history is intrinsically generated. Because the selected phenotypic histories strongly depend on the actual realization of the environmental history, $\mathbb{P}^{s}_{B}[\mathpzc{X}_{t}|\mathpzc{Y}_{t},\mathpzc{Z}_{t}]$ is conditional on $\mathpzc{Y}_{t}$. It should be noted that $\mathbb{P}_{B}^{s}[\mathpzc{X}_{t}|\mathpzc{Y}_{t},\mathpzc{Z}_{t}]$ is not necessarily causal, which is why we use the normal conditioning $|$ Sughiyama et al. (2015); Kobayashi and Sughiyama (2015).

III.2 Decomposition of fitness

In order to understand the relation between fitness and information obtained by sensing, we decompose the cumulative fitnesses into biologically relevant components. To obtain the decompositions, we first define a constant $\phi_{0}$ and a probability distribution $q_{0}(y)$ by using $k_{\max}(y)$ as $\phi_{0}:=-\ln\sum_{y}e^{-k_{\max}(y)}$ and $q_{0}(y):=e^{\phi_{0}}e^{-k_{\max}(y)}$ . From these definitions, $k_{\max}(y)$ can be described as $e^{k_{max}(y)}=e^{\phi_{0}}/q_{0}(y)$ . By defining $\mathbb{Q}_{0}[\mathpzc{Y}_{t}]:=\prod_{\tau=0}^{t-1}q_{0}(y_{\tau+1})$ , $\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]:=\prod_{\tau=0}^{t-1}\mathbb{T}_{K}(y_{\tau+1}|x_{\tau+1})$ , and $\Phi_{0}:=t\phi_{0}$ , we obtain the following decomposition of $K$ :

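$$K[\mathpzc{X}_{t},\mathpzc{Y}_{t}]=\Phi_{0}+\ln\frac{\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]}{\mathbb{Q}_{0}[\mathpzc{Y}_{t}]},\tag{11}$$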

where we use eq. (1). By defining $K_{\max}[\mathpzc{Y}_{t}]:=\sum_{\tau=0}^{t-1}k_{max}(y_{\tau+1})=\Phi_{0}-\ln\mathbb{Q}_{0}[\mathpzc{Y}_{t}]$ , $K[\mathpzc{X}_{t},\mathpzc{Y}_{t}]$ can also be described as

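$$K[\mathpzc{X}_{t},\mathpzc{Y}_{t}]=K_{\max}[\mathpzc{Y}_{t}]+\ln\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}].\tag{12}$$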
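The definitions of $\phi_{0}$ and $q_{0}(y)$ and the resulting decomposition of $K$ can be checked numerically. The following is a minimal sketch, not part of the original analysis: the values of `k_max` and `T_K`, the example histories `xs` and `ys`, and the helper names are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical maximum log replication rates k_max(y) for three environmental states.
k_max = [1.2, 0.8, 0.3]

# phi_0 := -ln sum_y exp(-k_max(y)) and q_0(y) := exp(phi_0) exp(-k_max(y)).
phi0 = -math.log(sum(math.exp(-k) for k in k_max))
q0 = [math.exp(phi0 - k) for k in k_max]

# Hypothetical metabolic allocation T_K(y|x): rows are phenotypes x, columns are environments y.
T_K = [[0.7, 0.2, 0.1],
       [0.2, 0.7, 0.1]]

def k(x, y):
    """Log replication rate from eq. (1): e^{k(x,y)} = e^{k_max(y)} T_K(y|x)."""
    return k_max[y] + math.log(T_K[x][y])

def K_direct(xs, ys):
    """Cumulative log replication along paired histories x_1..x_t and y_1..y_t."""
    return sum(k(x, y) for x, y in zip(xs, ys))

def K_decomposed(xs, ys):
    """The same quantity written as Phi_0 + ln(P_K[Y||X] / Q_0[Y])."""
    log_PK = sum(math.log(T_K[x][y]) for x, y in zip(xs, ys))
    log_Q0 = sum(math.log(q0[y]) for y in ys)
    return len(ys) * phi0 + log_PK - log_Q0

xs, ys = [0, 1, 1, 0], [0, 1, 2, 1]
print(K_direct(xs, ys), K_decomposed(xs, ys))
```

Both printed values agree up to rounding, because $k_{\max}(y)=\phi_{0}-\ln q_{0}(y)$ holds for every $y$.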

With eqs. (10) and (11), for the bet-hedging problem, we obtain a decomposition of the fitness

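$$\Psi^{b}[\mathpzc{Y}_{t}]=\Phi_{0}+\ln\frac{\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\mathbb{P}_{F}[\mathpzc{X}_{t}]}{\mathbb{Q}_{0}[\mathpzc{Y}_{t}]\mathbb{P}_{B}^{b}[\mathpzc{X}_{t}|\mathpzc{Y}_{t}]}=K_{\max}[\mathpzc{Y}_{t}]+\ln\frac{\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\mathbb{P}_{F}[\mathpzc{X}_{t}]}{\mathbb{P}_{B}^{b}[\mathpzc{X}_{t}|\mathpzc{Y}_{t}]}.\tag{13}$$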

For a given environmental statistics $\mathbb{Q}[\mathpzc{Y}_{t}]$ , the fitness is represented by the ratio of the time-forward and time-backward path probabilities as

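$$\Psi^{b}[\mathpzc{Y}_{t}]=\Psi_{0}[\mathpzc{Y}_{t}]+\ln\frac{\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\mathbb{P}_{F}[\mathpzc{X}_{t}]}{\mathbb{P}^{b}_{B}[\mathpzc{X}_{t},\mathpzc{Y}_{t}]},\tag{14}$$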

where $\mathbb{P}^{b}_{B}[\mathpzc{X}_{t},\mathpzc{Y}_{t}]:=\mathbb{P}^{b}_{B}[\mathpzc{X}_{t}|\mathpzc{Y}_{t}]\mathbb{Q}[\mathpzc{Y}_{t}]$ is the time-backward joint probability of the phenotypic and environmental histories $\mathpzc{X}_{t}$ and $\mathpzc{Y}_{t}$ . Here we also define

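$$\Psi_{0}[\mathpzc{Y}_{t}]:=K_{\max}[\mathpzc{Y}_{t}]+\ln\mathbb{Q}[\mathpzc{Y}_{t}]=\Phi_{0}+\ln\frac{\mathbb{Q}[\mathpzc{Y}_{t}]}{\mathbb{Q}_{0}[\mathpzc{Y}_{t}]}.\tag{15}$$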

If the organisms can perfectly foresee that the environmental state at time $\tau$ becomes $y_{\tau}$ and if they can choose the phenotype that allocates all metabolic resources to the environmental state $y_{\tau}$, the maximum replication rate at time $\tau$ reaches $e^{k_{\max}(y_{\tau})}$. Therefore, $K_{\max}[\mathpzc{Y}_{t}]$ is interpreted as the maximum replication over an environmental path $\mathpzc{Y}_{t}$ that can be achieved only when the organisms perfectly foresee in advance which environmental history will be realized. In contrast, $\ln\mathbb{Q}[\mathpzc{Y}_{t}]$ is the entropic loss of fitness due to the lack of knowledge of which environmental history will be realized Kussell and Leibler (2005); Rivoire and Leibler (2011). Therefore, $\Psi_{0}[\mathpzc{Y}_{t}]$ is the maximum replication when the organisms cannot know which environmental state will be realized but know the statistics of the future environmental state. The relevance of this interpretation and the biological meaning of some quantities such as $\Phi_{0}$ and $\mathbb{Q}_{0}[\mathpzc{Y}_{t}]$ are explicitly shown by using the FRs derived in the following sections.

For the case with the sensing signal, we can similarly obtain a decomposition of $\Psi^{s}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]$ as

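$$\Psi^{s}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]=\Psi_{0}[\mathpzc{Y}_{t}]+\ln\frac{\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\mathbb{P}_{F}[\mathpzc{X}_{t}||\mathpzc{Z}_{t}]\mathbb{Q}[\mathpzc{Z}_{t}|\mathpzc{Y}_{t}]}{\mathbb{P}_{B}^{s}[\mathpzc{X}_{t},\mathpzc{Y}_{t},\mathpzc{Z}_{t}]},\tag{16}$$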

where $\mathbb{P}_{B}^{s}[\mathpzc{X}_{t},\mathpzc{Y}_{t},\mathpzc{Z}_{t}]:=\mathbb{P}_{B}^{s}[\mathpzc{X}_{t}|\mathpzc{Y}_{t},\mathpzc{Z}_{t}]\mathbb{Q}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]$ is the time-backward joint path probability among $\mathpzc{X}_{t}$ , $\mathpzc{Y}_{t}$ , and $\mathpzc{Z}_{t}$ . It should be noted that $\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\mathbb{P}_{F}[\mathpzc{X}_{t}||\mathpzc{Z}_{t}]\mathbb{Q}[\mathpzc{Z}_{t}|\mathpzc{Y}_{t}]$ is not a joint path probability because of the circular noncausal dependency among $\mathpzc{X}_{t}$ , $\mathpzc{Y}_{t}$ , and $\mathpzc{Z}_{t}$ . Finally, by using the decomposition in eq. (11), the time-backward conditional path probabilities, eq. (10) and eq. (9), are reduced to

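$$\mathbb{P}_{B}^{b}[\mathpzc{X}_{t}|\mathpzc{Y}_{t}]=\frac{\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\mathbb{P}_{F}[\mathpzc{X}_{t}]}{\mathbb{P}_{K,F}[\mathpzc{Y}_{t}]},\tag{17}$$
$$\mathbb{P}_{B}^{s}[\mathpzc{X}_{t}|\mathpzc{Y}_{t},\mathpzc{Z}_{t}]=\frac{\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\mathbb{P}_{F}[\mathpzc{X}_{t}||\mathpzc{Z}_{t}]}{\mathbb{P}_{K,F}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]},\tag{18}$$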

where the normalization factors are

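$$\mathbb{P}_{K,F}[\mathpzc{Y}_{t}]:=\sum_{\mathpzc{X}_{t}}\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\mathbb{P}_{F}[\mathpzc{X}_{t}],\qquad\mathbb{P}_{K,F}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]:=\sum_{\mathpzc{X}_{t}}\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\mathbb{P}_{F}[\mathpzc{X}_{t}||\mathpzc{Z}_{t}].$$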

Because $\mathbb{P}_{F}[\mathpzc{X}_{t}]$ is the probability to intrinsically generate the repertoire of phenotypic histories in the population and $\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]$ is the probability that an organism with a phenotypic history $\mathpzc{X}_{t}$ allocates resources to each history of environment $\mathpzc{Y}_{t}$, the normalization factor $\mathbb{P}_{K,F}[\mathpzc{Y}_{t}]$ can be interpreted as the marginal resource allocation to the environmental history $\mathpzc{Y}_{t}$ at the population level. $\mathbb{P}_{K,F}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]$ can similarly be interpreted as the population-level resource allocation to $\mathpzc{Y}_{t}$ when signal history $\mathpzc{Z}_{t}$ is received. By using FRs in the next section, we clarify that $\mathbb{P}_{K,F}[\mathpzc{Y}_{t}]$ and $\mathbb{P}_{K,F}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]$ also have meaning as the conjugate environment under which the given strategy $\{\mathbb{T}_{F},\mathbb{T}_{K}\}$ becomes optimal.

IV Causal FRs for bet-hedging strategy

By rearranging the decomposition of $\Psi^{b}[\mathpzc{Y}_{t}]$ in eq. (14), we can immediately obtain a detailed causal FR for fitness difference $\Psi_{0}[\mathpzc{Y}_{t}]-\Psi^{b}[\mathpzc{Y}_{t}]$ as

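$$e^{-(\Psi_{0}[\mathpzc{Y}_{t}]-\Psi^{b}[\mathpzc{Y}_{t}])}=\frac{\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\mathbb{P}_{F}[\mathpzc{X}_{t}]}{\mathbb{P}_{B}^{b}[\mathpzc{X}_{t},\mathpzc{Y}_{t}]}=\frac{\mathbb{P}_{K,F}[\mathpzc{Y}_{t}]}{\mathbb{Q}[\mathpzc{Y}_{t}]},\tag{19}$$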

where we use eq. (17) to obtain the last equality. The first equality means that the fitness difference is the log ratio of time-forward and time-backward path probabilities $\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\mathbb{P}_{F}[\mathpzc{X}_{t}]$ and $\mathbb{P}_{B}^{b}[\mathpzc{X}_{t},\mathpzc{Y}_{t}]$ . $\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\mathbb{P}_{F}[\mathpzc{X}_{t}]$ is the time-forward probability of observing an organism that takes phenotypic history $\mathpzc{X}_{t}$ and then allocates $\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]$ of metabolic resources to $\mathpzc{Y}_{t}$ a priori to selection by conducting time-forward tracking of the histories. $\mathbb{P}_{B}^{b}[\mathpzc{X}_{t},\mathpzc{Y}_{t}]$ is the time-backward probability of observing the realization of environmental history $\mathpzc{Y}_{t}$ and the time-backward phenotypic history $\mathpzc{X}_{t}$ a posteriori to selection by conducting time-backward tracking of the histories. The second equality also indicates that the fitness difference is the log ratio of the percentage of resource allocated to environmental history $\mathpzc{Y}_{t}$ at the population level and the probability of observing environmental history $\mathpzc{Y}_{t}$ .

By averaging eq. (19) with respect to $\mathbb{P}_{B}^{b}[\mathpzc{X}_{t},\mathpzc{Y}_{t}]$ or $\mathbb{Q}[\mathpzc{Y}_{t}]$ , we can derive an integral FR as

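$$\left<e^{-(\Psi_{0}[\mathpzc{Y}_{t}]-\Psi^{b}[\mathpzc{Y}_{t}])}\right>_{\mathbb{Q}[\mathpzc{Y}_{t}]}=1.\tag{20}$$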

If we average eq. (19) after taking the logarithm of both sides, we obtain an average FR as

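$$\left<\Psi_{0}\right>_{\mathbb{Q}}-\left<\Psi^{b}\right>_{\mathbb{Q}}=\mathcal{D}^{b}_{\mathrm{loss}},\tag{21}$$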

where $\left<\Psi^{b}\right>_{\mathbb{Q}}:=\left<\Psi^{b}[\mathpzc{Y}_{t}]\right>_{\mathbb{Q}[\mathpzc{Y}_{t}]}$ , $\left<\Psi_{0}\right>_{\mathbb{Q}}:=\left<\Psi_{0}[\mathpzc{Y}_{t}]\right>_{\mathbb{Q}[\mathpzc{Y}_{t}]}$ , and

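$$\mathcal{D}^{b}_{\mathrm{loss}}:=\mathcal{D}[\mathbb{P}_{B}^{b}[\mathpzc{X}_{t},\mathpzc{Y}_{t}]||\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\mathbb{P}_{F}[\mathpzc{X}_{t}]]=\mathcal{D}[\mathbb{Q}[\mathpzc{Y}_{t}]||\mathbb{P}_{K,F}[\mathpzc{Y}_{t}]].$$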

Because of the non-negativity of the relative entropy $\mathcal{D}^{b}_{\mathrm{loss}}$ , we can easily see that $\left<\Psi_{0}\right>_{\mathbb{Q}}$ is an upper bound of the average fitness $\left<\Psi^{b}\right>_{\mathbb{Q}}$ of a bet-hedging strategy:

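$$\left<\Psi^{b}\right>_{\mathbb{Q}}\leq\max_{\mathbb{T}_{F},\mathbb{T}_{K}}\left<\Psi^{b}\right>_{\mathbb{Q}}\leq\left<\Psi_{0}\right>_{\mathbb{Q}},\tag{24}$$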

where we use the fact that $\Psi_{0}[\mathpzc{Y}_{t}]$ is dependent neither on the phenotype switching strategy $\mathbb{T}_{F}$ nor on the metabolic allocation strategy $\mathbb{T}_{K}$. These FRs are basically the same as those we derived in our previous work by using a variational approach. It should also be noted that the FRs derived by Mustonen and Lässig (2010) are different from ours because their relations are based on a model describing the dynamics of an ensemble of populations, whereas ours describes the dynamics of a single population.

IV.1 Biological meaning of $\Psi_{0}$ and $\Phi_{0}$

From eq. (15), the upper bound of the average fitness, $\left<\Psi_{0}\right>_{\mathbb{Q}}$ admits two different representations:

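$$\left<\Psi_{0}\right>_{\mathbb{Q}}=\left<K_{\max}\right>_{\mathbb{Q}}-\mathcal{S}[\mathbb{Q}]=\Phi_{0}+\mathcal{D}[\mathbb{Q}[\mathpzc{Y}_{t}]||\mathbb{Q}_{0}[\mathpzc{Y}_{t}]],$$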

where $\mathcal{S}[\mathbb{Q}]:=-\left<\ln\mathbb{Q}[\mathpzc{Y}_{t}]\right>_{\mathbb{Q}[\mathpzc{Y}_{t}]}$ is the entropy of $\mathbb{Q}[\mathpzc{Y}_{t}]$ and $\left<K_{\max}\right>_{\mathbb{Q}}$ is the average fitness under environmental statistics $\mathbb{Q}[\mathpzc{Y}_{t}]$ that is attained only when organisms have perfect knowledge of the future environment. Therefore, the first equality means that the randomness of the environment quantified by the entropy $\mathcal{S}[\mathbb{Q}]$ works as the inevitable loss of fitness due to the lack of knowledge of which environmental history will be realized in the future. If the environment fluctuates more unpredictably, we have a higher $\mathcal{S}[\mathbb{Q}]$ and a lower upper bound of the average fitness. Note that these properties of $\left<\Psi_{0}\right>_{\mathbb{Q}}$ have been pointed out previously and repeatedly Haccou and Iwasa (1995); Kussell and Leibler (2005); Rivoire and Leibler (2011).

The meaning of $\Phi_{0}$ and $\mathbb{Q}_{0}[\mathpzc{Y}_{t}]$ in the second equality becomes explicit by considering the minimization of $\left<\Psi_{0}\right>_{\mathbb{Q}}$ with respect to $\mathbb{Q}[\mathpzc{Y}_{t}]$ as follows:

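$$\min_{\mathbb{Q}}\left<\Psi_{0}\right>_{\mathbb{Q}}=\Phi_{0}+\min_{\mathbb{Q}}\mathcal{D}[\mathbb{Q}[\mathpzc{Y}_{t}]||\mathbb{Q}_{0}[\mathpzc{Y}_{t}]]=\Phi_{0},$$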

where we use

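$$\mathbb{Q}_{0}[\mathpzc{Y}_{t}]=\arg\min_{\mathbb{Q}}\mathcal{D}[\mathbb{Q}[\mathpzc{Y}_{t}]||\mathbb{Q}_{0}[\mathpzc{Y}_{t}]].$$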

This relation indicates that $\Phi_{0}$ is the minimum of the maximum average fitness and that $\mathbb{Q}_{0}[\mathpzc{Y}_{t}]$ is the worst environment for the organisms, under which the maximum average fitness is minimized. This min-max characterization of $\Phi_{0}$ and $\mathbb{Q}_{0}[\mathpzc{Y}_{t}]$ has been clarified in the context of game theory with a matrix formulation Pugatch, Barkai, and Tlusty (2013).
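The min-max characterization can be illustrated for a single time step, for which $\left<\Psi_{0}\right>_{\mathbb{Q}}$ reduces to $\sum_{y}q(y)[k_{\max}(y)+\ln q(y)]$. The sketch below is an illustration under stated assumptions only: `k_max` takes hypothetical values, and a coarse grid over the probability simplex stands in for the minimization over environments.

```python
import math

# Hypothetical maximum log replication rates k_max(y).
k_max = [1.2, 0.8, 0.3]
phi0 = -math.log(sum(math.exp(-k) for k in k_max))
q0 = [math.exp(phi0 - k) for k in k_max]

def avg_psi0(q):
    """First representation for one step: <k_max>_q - S[q]."""
    return sum(p * (k + math.log(p)) for p, k in zip(q, k_max) if p > 0)

def kl(q, r):
    """Relative entropy D[q||r] for distributions on the same finite set."""
    return sum(p * math.log(p / s) for p, s in zip(q, r) if p > 0)

# Coarse grid over the simplex of environmental statistics q(y).
n = 50
grid = [(i / n, j / n, (n - i - j) / n)
        for i in range(n + 1) for j in range(n + 1 - i)]
best = min(grid, key=avg_psi0)

print("phi_0           :", phi0)
print("grid minimum    :", avg_psi0(best), "at", best)
print("worst env. q_0  :", q0)
```

Under these assumptions the grid minimum lies next to $q_{0}$ and never falls below $\phi_{0}$, in line with the second representation $\phi_{0}+\mathcal{D}[q||q_{0}]$.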

IV.2 Verification of FRs for fitness

Equations (19)-(21) indicate that, under quite general situations, the fitness difference $\Psi_{0}[\mathpzc{Y}_{t}]-\Psi^{b}[\mathpzc{Y}_{t}]$ satisfies the FRs Kobayashi and Sughiyama (2015), as the entropy production does in stochastic thermodynamics Seifert (2012); Sagawa (2012). To demonstrate the relations, we consider an organism with two phenotypic states growing in a Markovian environment with three states, as depicted in Fig. 2 (A) and (B). The three environmental states, $\mathfrak{s}^{y}_{1}$, $\mathfrak{s}^{y}_{2}$, and $\mathfrak{s}^{y}_{3}$, describe nutrient-A-rich, nutrient-B-rich, and nutrient-poor conditions, respectively (Fig. 2 (A)). The two phenotypic states, $\mathfrak{s}^{x}_{1}$ and $\mathfrak{s}^{x}_{2}$, employ strategies that allocate $70\%$ of the metabolic resources to $\mathfrak{s}^{y}_{1}$ and $\mathfrak{s}^{y}_{2}$, respectively. Both states allocate $10\%$ of the resources to the rest of two states (see Appendix B for more details). This setting abstractly and simply represents the fact that organisms generally have far fewer phenotypic and sensing states than possible environmental states because of the limited physical complexity of the organisms.

Figures 3 (A) and (B) show the population dynamics of the organisms with the phenotypic states $\mathfrak{s}^{x}_{1}$ and $\mathfrak{s}^{x}_{2}$ under two different realizations of the environmental history, alongside $\Psi^{b}[\mathpzc{Y}_{t}]$ and $\Psi_{0}[\mathpzc{Y}_{t}]$. Depending on the actual realization of the environment, the relative relations among $\mathcal{N}_{t}^{\mathpzc{Y}}(\mathfrak{s}^{x}_{1})$, $\mathcal{N}_{t}^{\mathpzc{Y}}(\mathfrak{s}^{x}_{2})$, $e^{\Psi^{b}[\mathpzc{Y}_{t}]}=\mathcal{N}_{t}^{\mathpzc{Y}}$, and $e^{\Psi_{0}[\mathpzc{Y}_{t}]}$ change over time stochastically. In Fig. 3 (A), $e^{\Psi_{0}[\mathpzc{Y}_{t}]}$ is mostly greater than $\mathcal{N}_{t}^{\mathpzc{Y}}=e^{\Psi^{b}[\mathpzc{Y}_{t}]}$, which reflects the average relation $\left<\Psi_{0}\right>_{\mathbb{Q}}\geq\left<\Psi^{b}\right>_{\mathbb{Q}}$. In contrast, $\mathcal{N}_{t}^{\mathpzc{Y}}$ frequently becomes greater than $e^{\Psi_{0}[\mathpzc{Y}_{t}]}$ in Fig. 3 (B). Figures 3 (C) and (D) show that the environmental fluctuation induces a large fluctuation in both $\Psi^{b}[\mathpzc{Y}_{t}]$ and $\Psi_{0}[\mathpzc{Y}_{t}]$. As shown in Fig. 3 (E), although most environmental fluctuations result in positive fitness differences, that is, $\Psi_{0}[\mathpzc{Y}_{t}]-\Psi^{b}[\mathpzc{Y}_{t}]>0$, rare environmental fluctuations lead to a negative fitness difference in a finite time interval, meaning that the fitness of the suboptimal strategy $\Psi^{b}[\mathpzc{Y}_{t}]$ outperforms the average upper bound $\Psi_{0}[\mathpzc{Y}_{t}]$. This is analogous to the reversed heat flow in a small thermal system Jarzynski and Wójcik (2004). Such rare events are balanced to satisfy the integral FR in eq. (20), as verified numerically in Fig. 3 (F).
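For short histories, the integral FR can also be checked by exhaustive enumeration instead of sampling. The sketch below is not the simulation behind Fig. 3: all parameter values (`k_max`, `T_K`, `T_F`, `p0`, `T_E`, `q1`) are hypothetical and do not reproduce Appendix B, and the environmental statistics are assumed to be a Markov chain over $y_{1},\dots,y_{t}$ with initial distribution `q1`.

```python
import itertools
import math

# All numbers below are hypothetical illustration values.
k_max = [1.0, 1.0, 0.2]                     # k_max(y)
T_K = [[0.7, 0.2, 0.1], [0.2, 0.7, 0.1]]    # T_K[x][y], each row sums to 1
T_F = [[0.9, 0.1], [0.1, 0.9]]              # T_F[x_prev][x_next]
p0 = [0.5, 0.5]                             # initial phenotype distribution
T_E = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]]  # T_E[y_prev][y_next]
q1 = [0.4, 0.4, 0.2]                        # distribution of y_1

def log_fitness(ys):
    """Psi^b[Y]: log population size after growth, starting from unit size split by p0."""
    n = p0[:]
    for y in ys:
        n = [math.exp(k_max[y]) * T_K[x2][y]
             * sum(T_F[x1][x2] * n[x1] for x1 in range(2))
             for x2 in range(2)]
    return math.log(sum(n))

def env_prob(ys):
    """Q[Y] for the assumed Markov environment."""
    p = q1[ys[0]]
    for prev, cur in zip(ys, ys[1:]):
        p *= T_E[prev][cur]
    return p

def psi0(ys):
    """Psi_0[Y] = K_max[Y] + ln Q[Y]."""
    return sum(k_max[y] for y in ys) + math.log(env_prob(ys))

t = 6
paths = list(itertools.product(range(3), repeat=t))
integral = sum(env_prob(ys) * math.exp(-(psi0(ys) - log_fitness(ys))) for ys in paths)
avg_gap = sum(env_prob(ys) * (psi0(ys) - log_fitness(ys)) for ys in paths)
n_violations = sum(1 for ys in paths if log_fitness(ys) > psi0(ys))

print("integral FR average:", integral)
print("average fitness gap:", avg_gap)
print("paths with Psi^b > Psi_0:", n_violations, "of", len(paths))
```

With these assumptions the exponential average equals one up to rounding, the average gap is positive, and a nonzero number of histories has $\Psi^{b}[\mathpzc{Y}_{t}]>\Psi_{0}[\mathpzc{Y}_{t}]$, which is the kind of rare violation discussed above.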

V Biological meaning of the Fitness FRs

In the original detailed FR over paths Sagawa (2012), the entropy production is the log ratio of the path probability of a system’s trajectory and that of its time reversal. The average entropy production attains its minimum $0$ only when the time reversibility of the system holds, in the sense that the probabilities of observing the time-forward and the time-reversed trajectories are equal. Thus, the FRs are related to the extent of the time reversibility of the system. In contrast, the detailed FR for the fitness difference (eq. (19)) is the log ratio between the probability $\mathbb{Q}[\mathpzc{Y}_{t}]$ of observing the environmental history and the percentage $\mathbb{P}_{K,F}[\mathpzc{Y}_{t}]$ of the marginal resource allocation to $\mathpzc{Y}_{t}$, or that between the time-backward path probability $\mathbb{P}_{B}^{b}[\mathpzc{X}_{t},\mathpzc{Y}_{t}]$ and the time-forward path probability $\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\mathbb{P}_{F}[\mathpzc{X}_{t}]$. By investigating the FRs, we clarify a dualistic structure, a conjugacy of these quantities, and the time-reversal condition for the equality attained in eq. (24).

V.1 Dualistic relation between strategy and environment

The average FR in eq. (21) implies that the maximization of $\left<\Psi^{b}\right>_{\mathbb{Q}}$ with respect to the strategies is dual to the minimization of the relative entropy $\mathcal{D}_{\mathrm{loss}}^{b}$ as follows:

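$$\max_{\mathbb{T}_{F},\mathbb{T}_{K}}\left<\Psi^{b}\right>_{\mathbb{Q}}=\left<\Psi_{0}\right>_{\mathbb{Q}}-\min_{\mathbb{T}_{F},\mathbb{T}_{K}}\mathcal{D}_{\mathrm{loss}}^{b},\tag{28}$$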

because $\Psi_{0}[\mathpzc{Y}_{t}]$ is independent of $\mathbb{P}_{F}$ and $\mathbb{P}_{K}$. This duality indicates that maximizing the average fitness by choosing the best strategy is equivalent to the organisms implicitly learning and preparing for the environmental statistics $\mathbb{Q}[\mathpzc{Y}_{t}]$ so that the marginal resource allocation $\mathbb{P}_{K,F}[\mathpzc{Y}_{t}]$ to $\mathpzc{Y}_{t}$ becomes close to $\mathbb{Q}[\mathpzc{Y}_{t}]$, because $\mathcal{D}_{\mathrm{loss}}^{b}=\mathcal{D}[\mathbb{Q}[\mathpzc{Y}_{t}]||\mathbb{P}_{K,F}[\mathpzc{Y}_{t}]]$. The upper bound of the average fitness in eq. (24) is achieved, with $\Psi^{b}[\mathpzc{Y}_{t}]=\Psi_{0}[\mathpzc{Y}_{t}]$, if and only if $\{\mathbb{T}_{F}^{\dagger},\mathbb{T}_{K}^{\dagger}\}$ satisfies $\mathbb{Q}[\mathpzc{Y}_{t}]=\mathbb{P}_{K,F}^{\dagger}[\mathpzc{Y}_{t}]$, where $\mathbb{P}_{K,F}^{\dagger}[\mathpzc{Y}_{t}]:=\sum_{\mathpzc{X}_{t}}\mathbb{P}_{K}^{\dagger}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\mathbb{P}_{F}^{\dagger}[\mathpzc{X}_{t}]$, meaning that the environmental statistics and the marginal resource allocation match perfectly.

V.2 Meaning of $\mathbb{P}_{K,F}[\mathpzc{Y}_{t}]$ as conjugate environment

For a given environment, the strategy that achieves the bound may not always exist. In contrast, for a given pair of strategies $\{\mathbb{T}_{F},\mathbb{T}_{K}\}$ , there always exists the environment $\mathbb{Q}^{\dagger}[\mathpzc{Y}_{t}]$ under which the pair achieves the bound

[TABLE]

and

[TABLE]

is satisfied. Because of the duality shown in eq. (28), $\mathbb{Q}^{\dagger}[\mathpzc{Y}_{t}]$ is explicitly obtained as $\mathbb{Q}^{\dagger}[\mathpzc{Y}_{t}]=\mathbb{P}_{K,F}[\mathpzc{Y}_{t}]$. Therefore, $\mathbb{P}_{K,F}[\mathpzc{Y}_{t}]$ can also be regarded as the conjugate environment to the strategy $\{\mathbb{T}_{F},\mathbb{T}_{K}\}$ or $\{\mathbb{P}_{F},\mathbb{P}_{K}\}$ under which they are optimal. Consequently, the fitness difference in eq. (19) is the log ratio of the actual environmental statistics $\mathbb{Q}[\mathpzc{Y}_{t}]$ and the conjugate environment $\mathbb{Q}^{\dagger}[\mathpzc{Y}_{t}]=\mathbb{P}_{K,F}[\mathpzc{Y}_{t}]$ of the given strategy.

$\Psi^{b}[\mathpzc{Y}_{t}]$ is bounded by $\Psi_{0}[\mathpzc{Y}_{t}]$ on average, and therefore, the optimal strategy that attains $\Psi_{0}[\mathpzc{Y}_{t}]$ cannot be invaded by any other strategy if we consider an infinitely large population and the asymptotic dynamics. Thus, the optimal strategy is a version of ESS in a fluctuating environment. Within a finite time interval, however, $\Psi^{b}[\mathpzc{Y}_{t}]$ becomes greater than $\Psi_{0}[\mathpzc{Y}_{t}]$ under certain realizations of environment $\mathpzc{Y}_{t}$ that satisfy $\mathbb{Q}[\mathpzc{Y}_{t}]<\mathbb{Q}^{\dagger}[\mathpzc{Y}_{t}]$. In stochastic thermodynamics, such realizations correspond to the temporarily reversed heat flow in a small thermal system Jarzynski and Wójcik (2004). In biology, they temporarily violate ESS because a suboptimal strategy $\{\mathbb{T}_{F},\mathbb{T}_{K}\}$ under $\mathbb{Q}$ can outperform the optimal strategy or near-optimal strategies with the aid of the environmental fluctuation within a finite time interval. The integral FR in eq. (20) tells us that the violation of ESS can always occur with a small but finite probability in a finite time interval. If the population size of the optimal strategy is finite, such a violation can lead to extinction of the optimal population with a finite probability King and Masel (2007). Moreover, the detailed FR in eq. (19) implies that a greater violation can occur under a realization of environmental history $\mathpzc{Y}_{t}$ if $\mathbb{Q}[\mathpzc{Y}_{t}]$ is small but $\mathbb{Q}^{\dagger}[\mathpzc{Y}_{t}]$ is large. This fact can be intuitively understood as follows: the greater violation is induced by an environmental history $\mathpzc{Y}_{t}$ that rarely occurs in the actual environmental statistics $\mathbb{Q}[\mathpzc{Y}_{t}]$ but is adaptive and advantageous for the given strategy. A crucial fact is that this intuitive understanding is supported by a quantitative relation, eq. (19).
Furthermore, the detailed FR suggests that greater violation can occur for specific suboptimal strategies than for others if $\mathbb{Q}[\mathpzc{Y}_{t}]$ contains many rare environmental histories. In contrast, if $\mathbb{Q}[\mathpzc{Y}_{t}]$ is perfectly random as $\mathbb{Q}[\mathpzc{Y}_{t}]=\mathrm{const.}$ , the extent of the violation is limited and all the suboptimal strategies have only an even chance of violation. This implies that structured environmental fluctuation promotes impactful violation by some specific suboptimal strategies.

V.3 Time reversibility and optimality

If $\Psi^{b}[\mathpzc{Y}_{t}]=\Psi_{0}[\mathpzc{Y}_{t}]$ holds, moreover, the time-backward probability $\mathbb{P}_{B}^{b,\dagger}[\mathpzc{X}_{t},\mathpzc{Y}_{t}]$ and the time-forward path probability $\mathbb{P}_{K}^{\dagger}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\mathbb{P}_{F}^{\dagger}[\mathpzc{X}_{t}]$ become equal:

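$$\mathbb{P}_{B}^{b,\dagger}[\mathpzc{X}_{t},\mathpzc{Y}_{t}]=\mathbb{P}_{K}^{\dagger}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\mathbb{P}_{F}^{\dagger}[\mathpzc{X}_{t}],$$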

where we use the first equality in eq. (23). Marginalization of this equality with respect to $\mathpzc{Y}_{t}$ indicates that the marginalized time-backward path probability $\mathbb{P}_{B}^{b}[\mathpzc{X}_{t}]$ satisfies the consistency condition $\mathbb{P}_{B}^{b,\dagger}[\mathpzc{X}_{t}]=\mathbb{P}_{F}^{\dagger}[\mathpzc{X}_{t}]$, as shown previously Kobayashi and Sughiyama (2015). Thus, the optimal strategies that achieve the bound have time reversibility in the sense that the ensembles of the time-forward and time-backward phenotypic histories are indistinguishable without knowing the actual environmental history that the population experienced. While the definition of time reversibility is different, this result is closely related to the fact that the average entropy production attains $0$ when the time reversibility is satisfied. Biologically, this result is quite important because we can evaluate the optimality of an organism in a changing environment by just observing its phenotypic dynamics, without directly measuring the environment that the organisms experience. It should be noted, however, that $\mathbb{P}_{B}^{b,\dagger}[\mathpzc{X}_{t}]=\mathbb{P}_{F}^{\dagger}[\mathpzc{X}_{t}]$ is a necessary but not sufficient condition for $\Psi^{b}[\mathpzc{Y}_{t}]=\Psi_{0}[\mathpzc{Y}_{t}]$.

V.4 Achievability of the fitness upper bound

The duality eq. (28) indicates that the achievability of the bound in eq. (24) depends on the actual property of the environmental statistics and biological constraints on the selectable strategies. If the environment is a time-homogeneous Markov process as $\mathbb{Q}[\mathpzc{Y}_{t}]=\prod_{\tau=0}^{t-1}\mathbb{T}_{E}(y_{\tau+1}|y_{\tau})q(y_{0})$ and if the number of possible phenotypic states is the same as that of the environmental states, that is, $\#\mathfrak{S}_{\mathpzc{X}}=\#\mathfrak{S}_{\mathpzc{Y}}$ , the bound can be achieved by a pair of strategies:

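$$\mathbb{T}_{F}^{\dagger}(x^{\prime}|x)=\mathbb{T}_{E}(x^{\prime}|x),\qquad\mathbb{T}_{K}^{\dagger}(y|x)=\delta_{x,y},$$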

where $\mathbb{T}_{E}(x^{\prime}|x):=\left.\mathbb{T}_{E}(y_{\tau+1}|y_{\tau})\right|_{y_{\tau+1}=x^{\prime},y_{\tau}=x}$ and $\delta_{x,y}$ is the Kronecker delta. This pair is equivalent to the optimal bet-hedging strategy in Kelly’s horse race gambling because $\mathbb{T}_{K}^{\dagger}(y|x)=\delta_{x,y}$ means that organisms can survive only when their phenotypic state matches the current environmental state; they die out otherwise. To be specific, we call $\mathbb{T}_{K}(y|x)=\delta_{x,y}$ Kelly’s strategy of metabolic allocation. Under biologically realistic constraints, however, Kelly’s strategy cannot be the optimal strategy because the possible phenotypic states are usually much fewer than those of the environment, as in Figs. 2 and 3. Allocating all resources to a specific environment easily leads to extinction if the phenotypic states cannot cover all the possible environmental states. With a limited capacity in possible phenotypic states, $\mathcal{D}_{\mathrm{loss}}^{b}=0$ can be attained only if the environmental fluctuation has a hidden structure whose dimensionality is sufficiently low. This is manifested by the fact that the conjugate environment $\mathbb{Q}^{\dagger}$ is the environment with a hidden dynamics $\mathbb{P}_{F}[\mathpzc{X}^{\prime}_{t}]$ that generates the actual environmental history as $\mathbb{Q}^{\dagger}[\mathpzc{Y}_{t}]=\sum_{\mathpzc{X}^{\prime}_{t}}\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}^{\prime}_{t}]\mathbb{P}_{F}[\mathpzc{X}^{\prime}_{t}]$. Because such a low-dimensional structure may not always exist, however,

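$$\mathcal{D}_{\mathrm{loss}}^{b^{\dagger}}:=\min_{\mathbb{T}_{F},\mathbb{T}_{K}}\mathcal{D}[\mathbb{Q}[\mathpzc{Y}_{t}]||\mathbb{P}_{K,F}[\mathpzc{Y}_{t}]]$$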

is generally not zero but finite and positive under a biological constraint that the possible phenotypic states are fewer than environmental ones. Therefore, $\left<\Psi_{0}\right>_{\mathbb{Q}}$ is generally attained only by a Darwinian demon that cannot perfectly foresee the future environment but has sufficient capacity in its phenotypic properties to perfectly learn and prepare for the environmental fluctuation $\mathbb{Q}[\mathpzc{Y}_{t}]$ .

Even when the bound is not achieved, $\mathcal{D}_{\mathrm{loss}}^{b^{\dagger}}$ has explicit meaning as the component in the environmental fluctuation that cannot be learned or represented by the dynamics of the cell’s strategy. For example, when $\mathbb{T}_{F}$ is memoryless and $\mathbb{Q}[\mathpzc{Y}_{t}]$ is a stationary Markov process with the stationary probability $q(y)$ , that is, $\mathbb{Q}[\mathpzc{Y}_{t}]=\prod_{\tau=0}^{t-1}\mathbb{T}_{E}^{F}(y_{\tau+1}|y_{\tau})q(y_{0})$ , then

[TABLE]

where $\mathcal{I}^{y_{t};y_{t+1}}:=\mathcal{D}\Big[\mathbb{T}_{E}^{F}(y^{\prime}|y)q(y)\Big|\Big|q(y)q(y^{\prime})\Big]$ is the mutual information that measures the correlation of environmental states between consecutive time points, which cannot be learned or mimicked by the memoryless phenotypic switching.
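The mutual information between consecutive environmental states is straightforward to evaluate for a finite-state chain. The following is a minimal sketch, assuming a hypothetical three-state transition matrix `T_E` and obtaining the stationary distribution by power iteration; neither is taken from the original appendix.

```python
import math

# Hypothetical environmental transition matrix T_E[y][y'] (rows sum to 1).
T_E = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]]

def stationary(T, iters=1000):
    """Stationary distribution q(y) by repeated application of T with renormalization."""
    size = len(T)
    q = [1.0 / size] * size
    for _ in range(iters):
        q = [sum(q[a] * T[a][b] for a in range(size)) for b in range(size)]
        total = sum(q)
        q = [v / total for v in q]
    return q

def consecutive_mutual_information(T):
    """D[T(y'|y) q(y) || q(y) q(y')]: information shared by consecutive states."""
    q = stationary(T)
    size = len(T)
    return sum(q[a] * T[a][b] * math.log(T[a][b] / q[b])
               for a in range(size) for b in range(size) if T[a][b] > 0)

I = consecutive_mutual_information(T_E)
print("mutual information between y_t and y_{t+1}:", I)
```

For an environment whose rows are all identical, i.e. one without temporal correlation, the same function returns zero up to rounding, so a memoryless switching strategy loses nothing in that case.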

VI FRs with signal sensing

Next, we consider the case in which the organisms can exploit the information obtained from the sensing signal. The fitness decomposition in eq. (16) can be rearranged as

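$$\Psi^{s}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]=\Psi_{0}[\mathpzc{Y}_{t}]+i[\mathpzc{Y}_{t};\mathpzc{Z}_{t}]+\ln\frac{\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\mathbb{P}_{F}[\mathpzc{X}_{t}||\mathpzc{Z}_{t}]\mathbb{Q}[\mathpzc{Z}_{t}]}{\mathbb{P}_{B}^{s}[\mathpzc{X}_{t},\mathpzc{Y}_{t},\mathpzc{Z}_{t}]}=\Psi_{0}[\mathpzc{Y}_{t}]+i[\mathpzc{Y}_{t};\mathpzc{Z}_{t}]+\ln\frac{\mathbb{P}_{K,F}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]}{\mathbb{Q}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]},$$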

where $i[\mathpzc{Y}_{t};\mathpzc{Z}_{t}]:=\ln\left(\mathbb{Q}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]/(\mathbb{Q}[\mathpzc{Y}_{t}]\mathbb{Q}[\mathpzc{Z}_{t}])\right)$ is the bare mutual information between $\mathpzc{Y}_{t}$ and $\mathpzc{Z}_{t}$, and we use eq. (18) to derive the last equality. From this, we can similarly obtain detailed, integral, and average FRs with information as follows:

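$$e^{-(\Psi_{0}[\mathpzc{Y}_{t}]+i[\mathpzc{Y}_{t};\mathpzc{Z}_{t}]-\Psi^{s}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}])}=\frac{\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\mathbb{P}_{F}[\mathpzc{X}_{t}||\mathpzc{Z}_{t}]\mathbb{Q}[\mathpzc{Z}_{t}]}{\mathbb{P}_{B}^{s}[\mathpzc{X}_{t},\mathpzc{Y}_{t},\mathpzc{Z}_{t}]}=\frac{\mathbb{P}_{K,F}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]}{\mathbb{Q}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]},\tag{35}$$
$$\left<e^{-(\Psi_{0}[\mathpzc{Y}_{t}]+i[\mathpzc{Y}_{t};\mathpzc{Z}_{t}]-\Psi^{s}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}])}\right>_{\mathbb{Q}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]}=1,\tag{36}$$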

and

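$$\left<\Psi_{0}\right>_{\mathbb{Q}}+\mathcal{I}^{\mathpzc{Y};\mathpzc{Z}}-\left<\Psi^{s}\right>_{\mathbb{Q}}=\mathcal{D}_{\mathrm{loss}}^{s},\tag{37}$$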

where $\mathcal{I}^{\mathpzc{Y};\mathpzc{Z}}:=\left<i[\mathpzc{Y}_{t};\mathpzc{Z}_{t}]\right>_{\mathbb{Q}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]}$ is the path-wise mutual information between $\mathpzc{Y}_{t}$ and $\mathpzc{Z}_{t}$ , and

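$$\mathcal{D}_{\mathrm{loss}}^{s}:=\mathcal{D}[\mathbb{P}_{B}^{s}[\mathpzc{X}_{t},\mathpzc{Y}_{t},\mathpzc{Z}_{t}]||\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\mathbb{P}_{F}[\mathpzc{X}_{t}||\mathpzc{Z}_{t}]\mathbb{Q}[\mathpzc{Z}_{t}]]=\mathcal{D}[\mathbb{Q}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]||\mathbb{P}_{K,F}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]\mathbb{Q}[\mathpzc{Z}_{t}]].$$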

The information terms enter eqs. (35)-(37) in the same way as they appear in the Sagawa-Ueda relations, where the Maxwell demon and feedback regulation are involved Sagawa (2012). Because of the non-negativity of $\mathcal{D}_{\mathrm{loss}}^{s}$, $\left<\Psi^{s}\right>_{\mathbb{Q}}$ is upper bounded by $\left<\Psi_{0}\right>_{\mathbb{Q}}+\mathcal{I}^{\mathpzc{Y};\mathpzc{Z}}$:

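$$\left<\Psi^{s}\right>_{\mathbb{Q}}\leq\left<\Psi_{0}\right>_{\mathbb{Q}}+\mathcal{I}^{\mathpzc{Y};\mathpzc{Z}}=\left<K_{\max}\right>_{\mathbb{Q}}-\mathcal{S}_{\mathpzc{Y}|\mathpzc{Z}}[\mathbb{Q}],$$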

where $\mathcal{S}_{\mathpzc{Y}|\mathpzc{Z}}[\mathbb{Q}]:=-\left<\ln\mathbb{Q}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]\right>_{\mathbb{Q}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]}$ is the conditional entropy of $\mathbb{Q}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]$. If the history of signal $\mathpzc{Z}_{t}$ has perfect information on the history of $\mathpzc{Y}_{t}$, the upper bound reaches $\left<K_{\max}\right>_{\mathbb{Q}}$. As in the bet-hedging situation, maximization of the average fitness $\left<\Psi^{s}\right>_{\mathbb{Q}}$ with sensing is also dual to the minimization of the relative entropy $\mathcal{D}_{\mathrm{loss}}^{s}$:

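$$\{\mathbb{T}_{F}^{*},\mathbb{T}_{K}^{*}\}:=\arg\max_{\mathbb{T}_{F},\mathbb{T}_{K}}\left<\Psi^{s}\right>_{\mathbb{Q}}=\arg\min_{\mathbb{T}_{F},\mathbb{T}_{K}}\mathcal{D}_{\mathrm{loss}}^{s},$$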

where we use $*$ to denote the optimal $\mathbb{T}_{F}$ and $\mathbb{T}_{K}$ with sensing. The duality indicates that $\left<\Psi^{s}\right>_{\mathbb{Q}}$ achieves the bound $\left<\Psi_{0}\right>_{\mathbb{Q}}+\mathcal{I}^{\mathpzc{Y};\mathpzc{Z}}$ only when $\mathbb{P}^{*}_{K,F}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]=\mathbb{Q}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]$ holds. As in the bet-hedging case, if $\mathbb{P}^{*}_{K,F}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]=\mathbb{Q}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]$ holds, the time-backward path probability $\mathbb{P}_{B}^{s,*}[\mathpzc{X}_{t},\mathpzc{Y}_{t},\mathpzc{Z}_{t}]$ equals the time-forward path probability:

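$$\mathbb{P}_{B}^{s,*}[\mathpzc{X}_{t},\mathpzc{Y}_{t},\mathpzc{Z}_{t}]=\mathbb{P}_{K}^{*}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\mathbb{P}_{F}^{*}[\mathpzc{X}_{t}||\mathpzc{Z}_{t}]\mathbb{Q}[\mathpzc{Z}_{t}].$$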

Marginalization of this equation leads to the consistency condition $\mathbb{P}_{F}^{*}[\mathpzc{X}_{t}||\mathpzc{Z}_{t}]=\mathbb{P}_{B}^{s,*}[\mathpzc{X}_{t}|\mathpzc{Z}_{t}](:=\sum_{\mathpzc{Y}_{t}}\mathbb{P}_{B}^{s,*}[\mathpzc{X}_{t}|\mathpzc{Y}_{t},\mathpzc{Z}_{t}]\mathbb{Q}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}])$ derived in reference Kobayashi and Sughiyama (2015). Moreover, $\mathbb{P}_{K,F}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]\mathbb{Q}[\mathpzc{Z}_{t}]$ is the conjugate environment and signal of a given pair of strategies $\{\mathbb{T}_{F},\mathbb{T}_{K}\}$ under which it achieves the bound. From the detailed FR in eq. (35), we also see that the fitness of a given strategy can exceed the bound by chance, as $\Psi^{s}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]>\Psi_{0}[\mathpzc{Y}_{t}]+i[\mathpzc{Y}_{t};\mathpzc{Z}_{t}]$, when the realized pair of environmental and signal histories $\{\mathpzc{Y}_{t},\mathpzc{Z}_{t}\}$ appears more frequently in the conjugate environment than in the actual environment, that is, when $\mathbb{P}_{K,F}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]\mathbb{Q}[\mathpzc{Z}_{t}]>\mathbb{Q}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]$.
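The FRs with information can be illustrated for a single time step, in which the phenotype is chosen after one signal is received. The sketch below rests on assumptions that are not in the original text: every parameter value is hypothetical, `T_S[y][z]` denotes a hypothetical sensing channel from environment to signal, and `T_F[z][x]` a hypothetical signal-dependent choice of phenotype.

```python
import math

# Hypothetical single-step setting; every number is an illustration value.
k_max = [1.0, 1.0, 0.2]                      # k_max(y)
q_y = [0.4, 0.4, 0.2]                        # environmental statistics Q(y)
T_S = [[0.8, 0.2], [0.2, 0.8], [0.5, 0.5]]   # T_S[y][z]: signal given environment
T_F = [[0.9, 0.1], [0.1, 0.9]]               # T_F[z][x]: phenotype given signal
T_K = [[0.7, 0.2, 0.1], [0.2, 0.7, 0.1]]     # T_K[x][y]: metabolic allocation

# Joint statistics Q(y, z) and the signal marginal Q(z).
Q = {(y, z): q_y[y] * T_S[y][z] for y in range(3) for z in range(2)}
q_z = [sum(Q[y, z] for y in range(3)) for z in range(2)]

def alloc(y, z):
    """Population-level allocation P_{K,F}(y|z) = sum_x T_K(y|x) T_F(x|z)."""
    return sum(T_K[x][y] * T_F[z][x] for x in range(2))

def psi_s(y, z):
    """Fitness with sensing: k_max(y) + ln P_{K,F}(y|z)."""
    return k_max[y] + math.log(alloc(y, z))

def psi_0(y):
    """Psi_0 = k_max(y) + ln Q(y)."""
    return k_max[y] + math.log(q_y[y])

def info(y, z):
    """Bare mutual information i(y;z) = ln Q(y,z) / (Q(y) Q(z))."""
    return math.log(Q[y, z] / (q_y[y] * q_z[z]))

integral = sum(p * math.exp(-(psi_0(y) + info(y, z) - psi_s(y, z)))
               for (y, z), p in Q.items())
gap = sum(p * (psi_0(y) + info(y, z) - psi_s(y, z)) for (y, z), p in Q.items())
mutual_info = sum(p * info(y, z) for (y, z), p in Q.items())

print("integral FR average:", integral)
print("loss D^s_loss      :", gap)
print("mutual information :", mutual_info)
```

In this setting the exponential average equals one up to rounding and the gap is non-negative, so the average fitness with sensing stays below $\left<\Psi_{0}\right>_{\mathbb{Q}}+\mathcal{I}^{\mathpzc{Y};\mathpzc{Z}}$.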

VI.1 Achievability of the bound and causality

The necessary and sufficient condition $\mathbb{P}^{*}_{K,F}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]=\mathbb{Q}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]$ for achieving the bound means that the optimal metabolic allocation and phenotype switching strategy together implement the Bayesian computation of the posterior distribution $\mathbb{Q}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]$ of $\mathpzc{Y}_{t}$ given the history of the sensed signal $\mathpzc{Z}_{t}$ . Under the constraint that $\mathbb{P}_{K,F}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]$ satisfies a causal relation as $\mathbb{P}_{K,F}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]=\sum_{\mathpzc{X}_{t}}\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\mathbb{P}_{F}[\mathpzc{X}_{t}||\mathpzc{Z}_{t}]$ , $\mathbb{P}^{*}_{K,F}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]=\mathbb{Q}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]$ cannot be realized in general because $\mathbb{Q}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]$ does not necessarily satisfy the causality relation between $\mathpzc{Y}_{t}$ and $\mathpzc{Z}_{t}$ . If, for example, the metabolic allocation strategy is of Kelly’s type as $\mathbb{T}_{K}(y|x)=\delta_{y,x}$ , the phenotypic switching strategy must satisfy $\mathbb{P}^{*}_{F}[\mathpzc{X}_{t}||\mathpzc{Z}_{t}]=\mathbb{Q}[\mathpzc{X}_{t}|\mathpzc{Z}_{t}]$ to achieve the bound where $\mathbb{Q}[\mathpzc{X}_{t}|\mathpzc{Z}_{t}]:=\left.\mathbb{Q}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]\right|_{\mathpzc{Y}_{t}=\mathpzc{X}_{t}}$ . By definition, $\mathbb{P}_{F}[\mathpzc{X}_{t}||\mathpzc{Z}_{t}]$ should satisfy the causal relation that $x(t)$ depends only on the past and/or current states of $z(t)$ . However, $\mathbb{Q}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]$ may not necessarily be causal because the conditioning by histories biases the past states of the conditioned history. For example, consider the case that the signal $\mathpzc{Z}_{t}$ is causally generated from the environment $\mathpzc{Y}_{t}$ and there is no feedback from $\mathpzc{Z}_{t}$ to $\mathpzc{Y}_{t}$ as in Fig.2(A) and (C). 
If we observe $\mathpzc{Z}_{t}=\{\cdots,z(t-2),z(t-1),z(t)\}=\{\cdots,\mathfrak{s}^{z}_{1},\mathfrak{s}^{z}_{1},\mathfrak{s}^{z}_{2}\}$ , we may infer that the environmental state changes from $\mathfrak{s}^{y}_{1}$ to $\mathfrak{s}^{y}_{2}$ at time $t$ . However, if we further observe $\{z(t+1),z(t+2)\}=\{\mathfrak{s}^{z}_{1},\mathfrak{s}^{z}_{1}\}$ , then we may change our prediction such that $y(t)=\mathfrak{s}^{y}_{1}$ and $z(t)=\mathfrak{s}^{z}_{2}$ was simply an error of the signal. This intuitive observation illustrates that the inferred state of $y(t)$ can be affected by the future observation of the signal, e.g., $z(t+1)$ and $z(t+2)$ . For exactly the same reason, in $\mathbb{Q}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]$ , the past state in the environmental history $\mathpzc{Y}_{t}$ is affected by the future observations of the signal in $\mathpzc{Z}_{t}$ . Because of the dependency on future state of the signal, $\mathbb{Q}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]$ cannot be represented causally in general.
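The effect of future signals on the inferred past can be made concrete with a small hidden Markov model. The following sketch is an illustration only: it assumes two environmental and two signal states (rather than the three environmental states of Fig. 2), hypothetical transition and sensing matrices, and brute-force enumeration of hidden paths instead of a forward-backward recursion.

```python
import itertools

# Hypothetical two-state environment and noisy two-state signal.
T_E = [[0.8, 0.2], [0.2, 0.8]]   # T_E[y_prev][y_next]
T_S = [[0.9, 0.1], [0.1, 0.9]]   # T_S[y][z]: signal reports y correctly with prob. 0.9
prior = [0.5, 0.5]               # distribution of the first environmental state

def joint(ys, zs):
    """Joint probability of a hidden environmental path ys and observed signals zs."""
    p = prior[ys[0]] * T_S[ys[0]][zs[0]]
    for prev, cur, z in zip(ys, ys[1:], zs[1:]):
        p *= T_E[prev][cur] * T_S[cur][z]
    return p

def posterior(zs, index, state):
    """P(y[index] = state | zs), summing over all hidden paths of the same length."""
    paths = list(itertools.product(range(2), repeat=len(zs)))
    total = sum(joint(ys, zs) for ys in paths)
    hit = sum(joint(ys, zs) for ys in paths if ys[index] == state)
    return hit / total

# Signals up to time t: two reports of state 0 followed by one report of state 1.
filtered = posterior([0, 0, 1], index=2, state=1)
# The same signals followed by two further reports of state 0.
smoothed = posterior([0, 0, 1, 0, 0], index=2, state=1)

print("P(y(t) = state 1 | signals up to t)       :", filtered)
print("P(y(t) = state 1 | signals up to t + 2)   :", smoothed)
```

With these hypothetical parameters the first probability is above one half and the second is below one half: the two later signals revise the estimate of $y(t)$, which a causal strategy acting at time $t$ cannot do.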

An exception is the case in which the environmental history is causally and memorylessly generated from the signal as $\mathbb{Q}[\mathpzc{Y}_{t}||\mathpzc{Z}_{t}]=\prod_{\tau=0}^{t}\mathbb{T}_{E}(y_{\tau}|z_{\tau})$, which is not realistic because we usually expect the signal to be generated from the environment and not vice versa. Thus,

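$$\min_{\mathbb{T}_{F},\mathbb{T}_{K}}\mathcal{D}_{\mathrm{loss}}^{s}=\mathcal{D}[\mathbb{Q}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]||\mathbb{P}^{*}_{K,F}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]\mathbb{Q}[\mathpzc{Z}_{t}]]$$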

is not generally zero, and $\mathcal{D}_{\mathrm{loss}}^{s}$ contains not only the loss due to suboptimality of strategies, but also the loss from the causal constraints in exploiting the information of $\mathpzc{Z}_{t}$ for phenotype switching.

VII Directed information and Causal FRs with sensing

By taking account of the causality relation between $\mathpzc{Y}_{t}$ and $\mathpzc{Z}_{t}$ in more detail, we can obtain tighter FRs, which illustrate the problem of the causality more explicitly. Let us additionally assume that $\mathbb{Q}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]$ has a causal and Markov relation as depicted in Fig. 4 (A). The relation is represented as

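$$\mathbb{Q}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]=\mathbb{Q}_{F}[\mathpzc{Z}_{t}||\mathpzc{Y}_{t}]\mathbb{Q}_{F}[\mathpzc{Y}_{t}||\mathpzc{Z}_{t-1}],$$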

where $\mathbb{Q}_{F}[\mathpzc{Z}_{t}||\mathpzc{Y}_{t}]:=\prod_{\tau=0}^{t-1}\mathbb{T}_{E}^{F}(z_{\tau+1}|y_{\tau+1},z_{\tau})q(z_{0}|y_{0})$ and $\mathbb{Q}_{F}[\mathpzc{Y}_{t}||\mathpzc{Z}_{t-1}]:=\prod_{\tau=0}^{t-1}\mathbb{T}_{E}^{F}(y_{\tau+1}|y_{\tau},z_{\tau})q(y_{0})$. Here, the signal $z(t)$ and environment $y(t)$ are Markovian and dependent on the past states of the environment and the signal. By applying Bayes’ theorem, $\mathbb{Q}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]$ admits another causal decomposition (Fig. 4 (B)) with reversed causality between $\mathpzc{Y}_{t}$ and $\mathpzc{Z}_{t}$:

[TABLE]

where

[TABLE]

$\mathbb{Q}_{B}[\mathpzc{Y}_{t}||\mathpzc{Z}_{t}]:=\prod_{\tau=0}^{t-1}\mathbb{T}_{E}^{B}(y_{\tau+1}|z_{\tau+1},y_{\tau},z_{\tau})q(y_{0}|z_{0})$, and $\mathbb{Q}_{B}[\mathpzc{Z}_{t}||\mathpzc{Y}_{t-1}]:=\prod_{\tau=0}^{t-1}\mathbb{T}_{E}^{B}(z_{\tau+1}|y_{\tau},z_{\tau})q(z_{0})$. By Bayes’ theorem, $\mathbb{T}_{E}^{B}(y_{t+1}|z_{t+1},y_{t},z_{t})$ is interpreted as the posterior distribution of $y_{t+1}$ inferred from the observations $z_{t+1}$, $y_{t}$, and $z_{t}$. Thus, $\mathbb{Q}_{B}[\mathpzc{Y}_{t}||\mathpzc{Z}_{t}]$ is equivalent to the posterior path probability of the environment inferred by the sequential Bayesian inference of $y_{t+1}$ given the observation of the signal $z_{t}$ and the past state of the environment $y_{t}$. Note that $\mathbb{T}_{E}^{B}(y_{t+1}|z_{t+1},y_{t},z_{t})$ is slightly different from the usual sequential Bayesian inference, which estimates the hidden state $y(t)$ only from the observation $\mathpzc{Z}_{t}$ and therefore cannot use the past environmental state $y_{t}$. Moreover, $\mathbb{Q}_{B}[\mathpzc{Y}_{t}||\mathpzc{Z}_{t}]$ is also different from $\mathbb{Q}_{B}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]$ because the posterior path probability $\mathbb{Q}_{B}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]$ is computed after observing the whole history of $\mathpzc{Z}_{t}$ rather than by conducting Bayesian inference sequentially as in $\mathbb{Q}_{B}[\mathpzc{Y}_{t}||\mathpzc{Z}_{t}]$.
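The Bayesian reversal of the one-step kernels can be written down concretely. The sketch below is one possible implementation, assuming the forward kernels are stored as arrays `Ty[y, z, y2]` $=\mathbb{T}_{E}^{F}(y_{t+1}|y_{t},z_{t})$ and `Tz[y2, z, z2]` $=\mathbb{T}_{E}^{F}(z_{t+1}|y_{t+1},z_{t})$ with strictly positive entries. The function name `bayes_reverse`, the array layout, and the numerical values of `Ty0` are hypothetical.

```python
import numpy as np

def bayes_reverse(Ty, Tz):
    """Reverse the causal order of the one-step kernels with Bayes' theorem.

    Ty[y, z, y2] = T_E^F(y2 | y, z),  Tz[y2, z, z2] = T_E^F(z2 | y2, z).
    Returns TBz[y, z, z2] = T_E^B(z2 | y, z) and
            TBy[y, z, z2, y2] = T_E^B(y2 | z2, y, z).
    """
    # joint[y, z, y2, z2] = T_E^F(z2 | y2, z) * T_E^F(y2 | y, z)
    joint = np.einsum('yza,azb->yzab', Ty, Tz)
    TBz = joint.sum(axis=2)                              # marginalize y2
    TBy = joint.transpose(0, 1, 3, 2) / TBz[..., None]   # posterior of y2
    return TBz, TBy

# Hypothetical example: 3 environmental states, 2 signal states, memoryless signal.
Ty0 = np.array([[0.6, 0.2, 0.2],
                [0.2, 0.6, 0.2],
                [0.3, 0.3, 0.4]])   # assumed values, not those of Fig. 2 (A)
Tz0 = np.array([[0.9, 0.1],
                [0.1, 0.9],
                [0.5, 0.5]])        # signal accuracy as described for Fig. 2 (C)
Ty = np.repeat(Ty0[:, None, :], 2, axis=1)   # environment ignores z in this example
Tz = np.repeat(Tz0[:, None, :], 2, axis=1)   # signal ignores its own past
TBz, TBy = bayes_reverse(Ty, Tz)
```

In this memoryless special case `TBz[y, z, :]` does not depend on `z` and reduces to the matrix product `Ty0 @ Tz0`.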

Equation (44) yields a tighter decomposition of the fitness:

[TABLE]

where $i[\mathpzc{Z}_{t}\to\mathpzc{Y}_{t}]:=\ln\mathbb{Q}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]/\mathbb{Q}_{B}[\mathpzc{Z}_{t}||\mathpzc{Y}_{t-1}]\mathbb{Q}[\mathpzc{Y}_{t}]$ is the bare directed information from $\mathpzc{Z}_{t}$ to $\mathpzc{Y}_{t}$ Permuter, Kim, and Weissman (2011). This decomposition leads to the following detailed FR:

[TABLE]

Because $\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\mathbb{P}_{F}[\mathpzc{X}_{t}||\mathpzc{Z}_{t}]\mathbb{Q}_{B}[\mathpzc{Z}_{t}||\mathpzc{Y}_{t-1}]$ forms a joint path probability among $\mathpzc{X}_{t}$ , $\mathpzc{Y}_{t}$ , and $\mathpzc{Z}_{t}$ , which satisfies $\sum_{\mathpzc{X}_{t},\mathpzc{Y}_{t},\mathpzc{Z}_{t}}\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\mathbb{P}_{F}[\mathpzc{X}_{t}||\mathpzc{Z}_{t}]\mathbb{Q}_{B}[\mathpzc{Z}_{t}||\mathpzc{Y}_{t-1}]=1$ , we obtain the integral and average FRs as

[TABLE]

and

[TABLE]

where

[TABLE]

and $\mathcal{I}^{\mathpzc{Z}_{t}\to\mathpzc{Y}_{t}}:=\mathcal{D}\Big{[}\mathbb{Q}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]\Big{|}\Big{|}\mathbb{Q}_{B}[\mathpzc{Z}_{t}||\mathpzc{Y}_{t-1}]\mathbb{Q}[\mathpzc{Y}_{t}]\Big{]}$ . $\mathcal{I}^{\mathpzc{Z}_{t}\to\mathpzc{Y}_{t}}$ is the directed information that quantifies the amount of information $\mathpzc{Z}_{t}$ has for inferring $\mathpzc{Y}_{t}$ Permuter, Kim, and Weissman (2011). If $\mathbb{Q}_{B}[\mathpzc{Y}_{t}||\mathpzc{Z}_{t}]=\mathbb{Q}[\mathpzc{Y}_{t}]$ , $\mathcal{I}^{\mathpzc{Z}_{t}\to\mathpzc{Y}_{t}}=0$ as well as $i[\mathpzc{Z}_{t}\to\mathpzc{Y}_{t}]=0$ hold, which means that $\mathpzc{Z}_{t}$ is useless for inferring $\mathpzc{Y}_{t}$ causally. $\left<\Psi_{0}\right>_{\mathbb{Q}}+\mathcal{I}^{\mathpzc{Z}_{t}\to\mathpzc{Y}_{t}}$ in eq. (50) is a tighter bound for the average fitness with sensing because

[TABLE]

holds, where $\mathcal{I}^{\mathpzc{Y}_{t}\to\mathpzc{Z}_{t}}:=\mathcal{D}\Big{[}\mathbb{Q}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]\Big{|}\Big{|}\mathbb{Q}_{B}[\mathpzc{Y}_{t}||\mathpzc{Z}_{t}]\mathbb{Q}[\mathpzc{Z}_{t}]\Big{]}$ is the directed information from $\mathpzc{Y}_{t}$ to $\mathpzc{Z}_{t}$ . Moreover, $\mathcal{I}^{\mathpzc{Y}_{t}\to\mathpzc{Z}_{t}}$ is the residual information in $\mathcal{I}^{\mathpzc{Y};\mathpzc{Z}}$ that has no fitness value under the causal constraints.
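The relation among the mutual information and the two directed informations can be checked by brute-force enumeration of short paths. The sketch below assumes a hypothetical two-state Markov environment with a memorylessly generated signal (the values of `T`, `E`, and `q_y0` are illustrative). It uses only the definitions quoted above, together with the standard path-wise mutual information $\mathcal{D}[\mathbb{Q}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]||\mathbb{Q}[\mathpzc{Y}_{t}]\mathbb{Q}[\mathpzc{Z}_{t}]]$, and exploits $\mathbb{Q}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]=\mathbb{Q}_{B}[\mathpzc{Y}_{t}||\mathpzc{Z}_{t}]\mathbb{Q}_{B}[\mathpzc{Z}_{t}||\mathpzc{Y}_{t-1}]$.

```python
import itertools
import numpy as np

T = np.array([[0.8, 0.2], [0.3, 0.7]])   # T[y, y2] = P(y2 | y), hypothetical
E = np.array([[0.9, 0.1], [0.2, 0.8]])   # E[y, z]  = P(z | y),  hypothetical
q_y0 = np.array([0.6, 0.4])
steps = 3                                 # number of time points in a path

q_z0 = q_y0 @ E                           # q(z_0)
TB_z = T @ E                              # TB_z[y, z2] = T_E^B(z2 | y)

def path_probs():
    """Return Q[(Y, Z)] and QBz[(Y, Z)] = Q_B[Z || Y_{t-1}] for all paths."""
    Q, QBz = {}, {}
    for Y in itertools.product(range(2), repeat=steps):
        for Z in itertools.product(range(2), repeat=steps):
            p = q_y0[Y[0]] * E[Y[0], Z[0]]
            b = q_z0[Z[0]]
            for t in range(1, steps):
                p *= T[Y[t - 1], Y[t]] * E[Y[t], Z[t]]
                b *= TB_z[Y[t - 1], Z[t]]
            Q[(Y, Z)], QBz[(Y, Z)] = p, b
    return Q, QBz

Q, QBz = path_probs()
QY, QZ = {}, {}
for (Y, Z), p in Q.items():               # marginal path probabilities
    QY[Y] = QY.get(Y, 0.0) + p
    QZ[Z] = QZ.get(Z, 0.0) + p

# Mutual information and the two directed informations as KL divergences.
I_mutual = sum(p * np.log(p / (QY[Y] * QZ[Z])) for (Y, Z), p in Q.items())
I_Z_to_Y = sum(p * np.log(p / (QBz[(Y, Z)] * QY[Y])) for (Y, Z), p in Q.items())
# Q / Q_B[Y || Z] equals Q_B[Z || Y_{t-1}], which simplifies the second divergence.
I_Y_to_Z = sum(p * np.log(QBz[(Y, Z)] / QZ[Z]) for (Y, Z), p in Q.items())
```

For this toy model the three numbers satisfy `I_mutual == I_Z_to_Y + I_Y_to_Z` up to rounding, and both directed terms are non-negative, so `I_Z_to_Y` never exceeds `I_mutual`.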

VII.1 Numerical verification of FRs with sensing

To demonstrate the validity of the FRs in Eqs. (48)-(50), we conduct a modified numerical simulation of the population dynamics that incorporates sensing (see also Appendix B). The stochastic laws of environmental dynamics $\mathbb{T}_{E}^{F}(y_{t+1}|y_{t},z_{t})$ (Fig. 2 (A)) and the metabolic allocation strategy $\mathbb{T}_{K}(y_{t}|x_{t})$ (Fig. 2 (B)) are the same as those of the simulations in Fig. 3. The sensing signal is assumed to have two states, $\mathfrak{s}^{z}_{1}$ and $\mathfrak{s}^{z}_{2}$ (Fig. 2 (C)). If the environment is either in $\mathfrak{s}^{y}_{1}$ or in $\mathfrak{s}^{y}_{2}$, the signal gives the corresponding $\mathfrak{s}^{z}_{1}$ or $\mathfrak{s}^{z}_{2}$ with $90\%$ accuracy. If the environment is in $\mathfrak{s}^{y}_{3}$, the signal produces either $\mathfrak{s}^{z}_{1}$ or $\mathfrak{s}^{z}_{2}$ with equal probability (Fig. 2 (C)). Because the signal is memorylessly generated from the environment at every time point, the environment and signal together form a Markov relation as depicted in Fig. 4 (C). By employing the sensed signal, the organisms switch to the phenotypic state $\mathfrak{s}^{x}_{1}$ with a $95\%$ chance when the sensed signal is $\mathfrak{s}^{z}_{1}$ and to $\mathfrak{s}^{x}_{2}$ with the same chance when the signal is $\mathfrak{s}^{z}_{2}$, as shown in Fig. 2 (D).
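The sensing and switching rules described above can be encoded as two stochastic matrices. The following sketch assumes two phenotypic states, so that the remaining $5\%$ of the switching probability goes to the other phenotype. The names `T_signal`, `T_switch`, and `sample_signal` are hypothetical, and the environmental transition matrix of Fig. 2 (A) is not reproduced here.

```python
import numpy as np

# Rows: environmental states s^y_1, s^y_2, s^y_3; columns: signals s^z_1, s^z_2.
T_signal = np.array([[0.9, 0.1],
                     [0.1, 0.9],
                     [0.5, 0.5]])
# Rows: signals s^z_1, s^z_2; columns: phenotypes s^x_1, s^x_2 (two phenotypes assumed).
T_switch = np.array([[0.95, 0.05],
                     [0.05, 0.95]])

# Probability of each phenotype given the true environmental state,
# obtained by marginalizing over the noisy signal.
P_x_given_y = T_signal @ T_switch

def sample_signal(y_path, rng):
    """Draw one signal per time point, memorylessly from the environment."""
    return np.array([rng.choice(2, p=T_signal[y]) for y in y_path])

rng = np.random.default_rng(0)
z_path = sample_signal([0, 0, 1, 2, 2, 1], rng)   # an arbitrary environmental path
```

Under these assumptions an organism ends up in the phenotype matching $\mathfrak{s}^{y}_{1}$ or $\mathfrak{s}^{y}_{2}$ with probability $0.86$, and in either phenotype with probability $0.5$ when the environment is in $\mathfrak{s}^{y}_{3}$.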

Figures 5 (A) and (B) show two trajectories of the population size under the same realizations of the environment as in Figs. 3 (A) and (B), respectively. In Fig. 5 (A), the total population size $\mathcal{N}_{t}^{\mathpzc{Y},\mathpzc{Z}}=e^{\Psi^{s}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]}$ (dashed line) is less than $e^{\Psi_{0}[\mathpzc{Y}_{t}]+i[\mathpzc{Z}_{t}\to\mathpzc{Y}_{t}]}$ (red line), whereas $\mathcal{N}_{t}^{\mathpzc{Y},\mathpzc{Z}}$ frequently becomes greater than $e^{\Psi_{0}[\mathpzc{Y}_{t}]+i[\mathpzc{Z}_{t}\to\mathpzc{Y}_{t}]}$ in Fig. 5 (B). Even with sensing, both $\Psi^{s}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]$ and $\Psi_{0}[\mathpzc{Y}_{t}]+i[\mathpzc{Z}_{t}\to\mathpzc{Y}_{t}]$ fluctuate substantially, as shown in Figs. 5 (C) and (D), depending on the realization of the environment and signal. Owing to this fluctuation, $\Psi^{s}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]$ sometimes becomes greater than $\Psi_{0}[\mathpzc{Y}_{t}]+i[\mathpzc{Z}_{t}\to\mathpzc{Y}_{t}]$, as in Fig. 5 (E), so that the integral FR (49) is satisfied, as demonstrated in Fig. 5 (F).
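For very short paths, the integral FR can also be checked exactly by enumerating all environmental and signal histories instead of sampling them. The sketch below is not the simulation behind Fig. 5. It assumes hypothetical values for `T`, `T_K`, and `k_max`, and phenotype switching that depends only on the current signal. It also assumes the conventions $e^{k(x,y)}=e^{k_{\max}(y)}\mathbb{T}_{K}(y|x)$ and $\Psi_{0}[\mathpzc{Y}_{t}]=\sum_{\tau}k_{\max}(y_{\tau})+\ln\mathbb{Q}[\mathpzc{Y}_{t}]$, and reads the integral FR as $\left<e^{\Psi^{s}-\Psi_{0}-i}\right>_{\mathbb{Q}}=1$, which is what the normalization of the joint path probability stated above implies.

```python
import itertools
import numpy as np

T = np.array([[0.6, 0.2, 0.2],       # T[y, y2] = T_E^F(y2 | y)   (hypothetical values)
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])
E = np.array([[0.9, 0.1],            # E[y, z] = T_E^F(z | y), as described for Fig. 2 (C)
              [0.1, 0.9],
              [0.5, 0.5]])
T_F = np.array([[0.95, 0.05],        # T_F[z, x]: phenotype chosen from the sensed signal
                [0.05, 0.95]])
T_K = np.array([[0.7, 0.1, 0.2],     # T_K[x, y] = T_K(y | x)     (hypothetical values)
                [0.1, 0.7, 0.2]])
k_max = np.log(np.array([3.0, 3.0, 2.0]))   # hypothetical maximum replication rates
k = k_max[None, :] + np.log(T_K)            # k(x, y) = k_max(y) + ln T_K(y | x)
q_y0 = np.array([1 / 3, 1 / 3, 1 / 3])
q_z0, TB_z = q_y0 @ E, T @ E                # q(z_0) and T_E^B(z2 | y)

def fitness_with_sensing(Y, Z):
    """Psi^s[Y, Z]: log of the total population size along one (Y, Z) history."""
    N = np.exp(k[:, Y[0]]) * T_F[Z[0]]
    for t in range(1, len(Y)):
        # Switching ignores the previous phenotype here, so only N.sum() matters.
        N = np.exp(k[:, Y[t]]) * T_F[Z[t]] * N.sum()
    return np.log(N.sum())

def integral_fr(steps=3):
    """Exact average of exp(Psi^s - Psi_0 - i) over all (Y, Z) paths."""
    total = 0.0
    for Y in itertools.product(range(3), repeat=steps):
        for Z in itertools.product(range(2), repeat=steps):
            q_Y, q_Z_given_Y, q_B = q_y0[Y[0]], E[Y[0], Z[0]], q_z0[Z[0]]
            for t in range(1, steps):
                q_Y *= T[Y[t - 1], Y[t]]
                q_Z_given_Y *= E[Y[t], Z[t]]
                q_B *= TB_z[Y[t - 1], Z[t]]
            psi0 = k_max[list(Y)].sum() + np.log(q_Y)
            i_dir = np.log(q_Z_given_Y / q_B)          # i[Z -> Y] for this path
            psi_s = fitness_with_sensing(Y, Z)
            total += q_Y * q_Z_given_Y * np.exp(psi_s - psi0 - i_dir)
    return total

fr_value = integral_fr(3)
```

Under these assumptions `fr_value` equals one up to rounding for any choice of the strategies, even though individual paths can have $\Psi^{s}$ above or below $\Psi_{0}+i$.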

VII.2 Achievability of the bound

The bound in eq. (50) can be achieved when $\mathbb{P}_{K,F}^{*}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]=\mathbb{Q}_{B}[\mathpzc{Y}_{t}||\mathpzc{Z}_{t}]$ holds, which also means that $\mathbb{Q}^{*}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]:=\mathbb{Q}^{*B}[\mathpzc{Y}_{t}||\mathpzc{Z}_{t}]\mathbb{Q}_{B}[\mathpzc{Z}_{t}||\mathpzc{Y}_{t-1}]$ is the conjugate environment for a given $\{\mathbb{T}_{F},\mathbb{T}_{K}\}$, where $\mathbb{Q}^{*B}[\mathpzc{Y}_{t}||\mathpzc{Z}_{t}]=\mathbb{P}_{K,F}[\mathpzc{Y}_{t}|\mathpzc{Z}_{t}]$. Because $\mathbb{Q}_{B}[\mathpzc{Y}_{t}||\mathpzc{Z}_{t}]$ is the path probability obtained by a type of sequential Bayesian inference $\mathbb{T}_{E}^{B}(y_{t+1}|z_{t+1},y_{t},z_{t})$, the bound is attained if phenotype switching with sensing and metabolic allocation, as a whole, implement the same Bayesian computation as $\mathbb{T}_{E}^{B}(y_{t+1}|z_{t+1},y_{t},z_{t})$. However, to estimate $y_{t+1}$, the Bayesian inference represented by $\mathbb{T}_{E}^{B}(y_{t+1}|z_{t+1},y_{t},z_{t})$ uses not only $z_{t+1}$ and $z_{t}$ but also error-free information on the past environmental state $y_{t}$. If organisms have no way to obtain perfect information on the past environment by sensing, as is the case in biologically realistic situations, the metabolic allocation strategy should be of Kelly’s type, $\mathbb{T}_{K}^{*}(y_{t}|x_{t})=\delta_{y_{t},x_{t}}$, to achieve the bound without additional constraints on the environment. With Kelly’s strategy, the organisms can effectively obtain information on the past environment because the past phenotypic state $x(t)$ of the surviving organisms is the same as the past environmental state $y(t)$ in this situation. The optimal phenotype switching with Kelly’s strategy for metabolic allocation should be

[TABLE]

to attain the bound. Another situation in which the bound can be achieved is when the environmental state $y_{t+1}$ depends not on $y_{t}$ but only on $z_{t}$ as $\mathbb{T}_{E}(y_{t+1}|y_{t},z_{t})=\mathbb{T}_{E}(y_{t+1}|z_{t})$ . We then have

[TABLE]

where $\mathbb{T}_{E}^{B}(y_{t+1}|z_{t+1},y_{t},z_{t})$ is reduced to be independent of $y_{t}$ as $\mathbb{T}_{E}^{B}(y_{t+1}|z_{t+1},y_{t},z_{t})=\mathbb{T}_{E}^{B}(y_{t+1}|z_{t+1},z_{t})$ . Thus, the bound can be attained if the phenotypic switching and metabolic allocation strategies satisfy

[TABLE]

Note that the phenotypic switching need not be dependent on the past phenotypic state in this situation because $z(t+1)$ and $z(t)$ contain all the relevant information on the future state of $y(t+1)$ .

For both cases, the achievability of the bound is directly linked to the accessibility of the organisms to the past information that directly drives the current environmental state. In addition, even when the bound is not achieved, $\mathcal{D}_{\mathrm{loss}}^{s,d^{*}}$ explicitly represents the loss of fitness associated with the inaccessibility.

VII.3 Losses due to limited capacity and suboptimality

For general metabolic allocation strategies other than Kelly’s type, the phenotypic states of the surviving organisms contain only imperfect information on the past environmental history. In addition, the environment and signal may not always form the circular dependency $\dots\to y_{t}\to z_{t}\to y_{t+1}\to z_{t+1}\to\dots$. In such situations, the loss is most generally represented by $\mathcal{D}_{\mathrm{loss}}^{s}$, which quantifies the difference between the time-backward path probability $\mathbb{P}_{B}[\mathpzc{X}_{t},\mathpzc{Y}_{t},\mathpzc{Z}_{t}]$ and the time-forward path probability $\mathbb{P}_{K}[\mathpzc{Y}_{t}||\mathpzc{X}_{t}]\mathbb{P}_{F}[\mathpzc{X}_{t}||\mathpzc{Z}_{t}]\mathbb{Q}[\mathpzc{Z}_{t}]$. The loss from different sources can be represented by using the following dissection of the fitness:

[TABLE]

where $i_{B}^{s}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}|\mathpzc{X}_{t}]:=\ln\mathbb{P}_{B}^{s}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}|\mathpzc{X}_{t}]/\mathbb{P}_{B}^{s}[\mathpzc{Y}_{t}|\mathpzc{X}_{t}]\mathbb{P}_{B}^{s}[\mathpzc{Z}_{t}|\mathpzc{X}_{t}]$ is the bare conditional backward mutual information. Although this representation does not immediately admit detailed and integral FRs, we can obtain the following decomposition of $\mathcal{D}_{\mathrm{loss}}^{s}$ into three terms:

[TABLE]

where

[TABLE]

$\mathcal{I}_{B}^{\mathpzc{Y};\mathpzc{Z}|\mathpzc{X}}$ is the residual information on the environment that the signal still has even if we know the retrospective history of the phenotype. Because the retrospective correlation between the phenotype and environment is induced by the selection, this residual information represents the information that is not used in the selection. If the signal contains information on environmental fluctuations that has nothing to do with the replication of the organisms, for example, information on the existence of unmetabolizable artificial molecules, such information cannot be exploited in phenotypic switching to choose a better phenotype for the current environmental state. Thus, $\mathcal{I}_{B}^{\mathpzc{Y};\mathpzc{Z}|\mathpzc{X}}$ measures the amount of such information conveyed by the sensing signal that is useless for fitness. $\mathcal{D}_{\mathrm{loss}}^{s,K}$ accounts for the imperfectness of the metabolic allocation strategy, and $\mathcal{D}_{\mathrm{loss}}^{s,F}$ quantifies the imperfectness of phenotype switching due to the causal constraint and suboptimality.

For example, in the case of the causally optimal strategy under Kelly’s metabolic allocation in eq. (53) with the causal structure in $\mathbb{Q}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]$ (eq. (43)), these quantities are reduced to

[TABLE]

The first equation, $\mathcal{I}_{B}^{\mathpzc{Y};\mathpzc{Z}|\mathpzc{X}}=0$, holds because organisms with the metabolic allocation strategy of Kelly’s type can survive only when their phenotypic history is identical to the actual environmental history, so that the retrospective phenotypic history $\mathpzc{X}_{t}$ contains perfect information on $\mathpzc{Y}_{t}$. Thus, the residual $\mathcal{I}_{B}^{\mathpzc{Y};\mathpzc{Z}|\mathpzc{X}}$ becomes $0$. Similarly, $\mathcal{D}_{\mathrm{loss}}^{s,K}=0$ because only organisms whose phenotypic history is identical to their environmental history can survive, and $\mathbb{P}_{B}^{s}[\mathpzc{Y}_{t}|\mathpzc{X}_{t}]=\delta_{\mathpzc{X}_{t},\mathpzc{Y}_{t}}$ holds for Kelly’s strategy. In contrast, $\mathcal{D}_{\mathrm{loss}}^{s,F}$ cannot be zero because the phenotypic switching strategy is causally constrained. The minimum loss due to this constraint is the directed information $\mathcal{I}^{\mathpzc{Y}_{t}\to\mathpzc{Z}_{t}}$, which measures the causally useless information in $\mathcal{I}^{\mathpzc{Y};\mathpzc{Z}}$.

In a general situation, these three quantities are mutually related. For example, the amount of useless information for fitness $\mathcal{I}_{B}^{\mathpzc{Y};\mathpzc{Z}|\mathpzc{X}}$ depends on the choice of strategies. If phenotype switching does not use any information obtained from $z(t)$ , then $\mathcal{I}_{B}^{\mathpzc{Y};\mathpzc{Z}|\mathpzc{X}}$ becomes $\mathcal{I}_{B}^{\mathpzc{Y};\mathpzc{Z}}$ . They are appropriate information-theoretic quantities that account for the irrelevance of sensing and the imperfectness of metabolic allocation and phenotypic switching for exploiting the sensed information.

VIII Summary and Discussion

In this paper, we clarified the stochastic and information-thermodynamic structures in population dynamics with and without sensing, and derived the bound of fitness and the fitness gain by sensing in the form of FRs. The detailed and integral FRs substantially generalize the previous results on the average fitness value of information by showing that not only the average but also the fluctuation of fitness is generically constrained, as is the entropy production. Such constraints specify when, and with what probability, the fitness of organisms with suboptimal strategies can exceed that of the optimal one by chance owing to environmental fluctuations, just as there is a finite probability of observing a reversal of heat flow in a small thermal system. Such rare events can be regarded as a violation of ESS because they may induce the takeover of the optimal organism by a suboptimal one in a finite population. Nevertheless, these violations are still constrained to obey the integral and detailed FRs. Moreover, the directed information is shown to be a tighter measure of the fitness value of information, in which the causal structure of the problem is explicitly accounted for. The condition for achieving this bound is shown to be related to the ability of the organisms to conduct or implement a type of sequential Bayesian inference from the sensed information. Finally, we derived three quantities that measure the irrelevance of sensing and the imperfectness of metabolic allocation and phenotypic switching for exploiting the sensed information.

All these results and generalizations are derived by employing the path-wise and time-backward representation of population dynamics. The pivotal element is the duality between the maximization of the average fitness and the minimization of the difference between the time-forward and time-backward path probabilities (eqs. (28) and (40)). The minimum loss of the average fitness due to causality is clearly described in this dual problem as the deviation between the causal time-forward path probability and the non-causal time-backward path probability (eq. (38)). We believe that the path-wise and retrospective formulation of population dynamics will also play an indispensable role in addressing other biologically relevant problems.

VIII.1 Learning rules

In this paper, as in most previous works, the processes of attaining better strategies by mutation or learning are rarely considered directly and explicitly Beaumont et al. (2009); Lambert and Kussell (2014), with one exception Xue and Leibler (2016). As we have demonstrated, the average optimal strategy may not dominate a population all the time in a fluctuating environment because of a rare event: the takeover by suboptimal strategies. Thus, the organisms and their corresponding strategies are expected to change non-stationarily over time. In such a situation, an advantage may be gained by organisms that acquire the ability to adaptively change and learn their strategies, as our brain and immune system do Mayer et al. (2016). However, the fitness to be increased by adaptation is a population-level quantity, whereas adaptation is conducted at the individual level. The duality might be used to resolve this problem. As Kelly’s strategy demonstrates in the most extreme fashion, the surviving organisms can retrospectively obtain certain information on the environment they experienced. As shown recently Xue and Leibler (2016), such information can be used to adjust the time-forward strategies for switching phenotype and allocating resources so as to reduce the discrepancy between time-forward and time-backward phenotypic histories, rather than directly increasing the population fitness. By employing our path-wise formulation, we may obtain more general results on the adaptive learning of strategies and its evolutionary advantages.

VIII.2 Individual sensing

Another problem that should be addressed is the type of sensing. In this paper, the sensing signal is treated as an extrinsic factor by assuming that all organisms receive the same sensing signal. Such an assumption is valid only when the sensing noise of the organisms is negligibly small. A more general and biologically realistic situation is individual sensing, in which each organism receives a different sensing signal owing to the sensing noise intrinsic to the organism Perkins and Swain (2009); Ben-Jacob and Schultz (2010); Kobayashi and Kamimura (2012); Brennan, Cheong, and Levchenko (2012). The fitness value of individual sensing has rarely been addressed, with the exception of a pioneering work, which shows that the fitness value of individual sensing can be greater than the mutual information under restricted situations Rivoire and Leibler (2011). A generalization of this result may be achieved by using our path-wise and retrospective formulation.

VIII.3 Thermodynamics and evolution

As we have clarified, thermodynamics and adaptation share the same fundamental mathematical structure. Nevertheless, most attempts to bridge thermodynamics and adaptation or evolution, including ours, are merely formal in the sense that the similarity shown is at the level of mathematics Lotka (1922a, b); Iwasa (1988); de Vladar and Barton (2011); Qian (2014). However, actual thermodynamics underlies the processes of replication and sensing of organisms England (2013, 2015); Pugatch (2015). The thermodynamics must constrain the rate of replication and the efficiency of sensing, and the latter has recently been investigated intensively in the context of stochastic and information thermodynamics Barato, Hartich, and Seifert (2014); Govern and ten Wolde (2014); Wolde et al. (2016). Our relation between fitness and information must be consistent with the constraints imposed by the thermodynamics of these processes. Integration of the relation between entropy and information with that between fitness and information would be an indispensable step toward establishing the real thermodynamics of adaptation and evolution.

Acknowledgements.

We acknowledge Yuichi Wakamoto, Takashi Nozoe, and Yoichiro Takahashi for useful discussions. This research is supported partially by a JST PRESTO program, JSPS KAKENHI 16H06155 and 16K17763, the Platform for Dynamic Approaches to Living System from MEXT and AMED, Japan, and the 2016 Inamori Research Grants Program, Japan.

Appendix A Decomposition of replication rate

Let us consider a given replication rate $k(x,y)$. $e^{k(x,y)}$ determines a $(\#\mathfrak{S}_{x}\times\#\mathfrak{S}_{y})$ matrix, and we define its row vectors by $\{\mathbf{v}_{x}\in(\mathbb{R}_{\geq 0})^{\#\mathfrak{S}_{y}}|v_{x,y}=e^{k(x,y)},\,x\in\mathfrak{S}_{x}\}$. When the number of environmental states is equal to or greater than the number of phenotypic states, $\#\mathfrak{S}_{y}\geq\#\mathfrak{S}_{x}$, $\{\mathbf{v}_{x}|x\in\mathfrak{S}_{x}\}$ can form a hyperplane, and we can find a vector $\mathbf{u}\in(\mathbb{R}_{\geq 0})^{\#\mathfrak{S}_{y}}$ that is orthogonal to the plane. Let $\mathbf{q}_{0}$ be the normalization of $\mathbf{u}$ as $\mathbf{q}_{0}=\mathbf{u}/\sum_{y}u(y)$. By definition, $\mathbf{q}_{0}$ satisfies $\mathbf{v}_{x}\cdot\mathbf{q}_{0}=\sum_{y}e^{k(x,y)}q_{0}(y)=\phi_{0}$ for all $x\in\mathfrak{S}_{x}$, where $\phi_{0}>0$ is a constant. This means that we can define a conditional probability $\mathbb{T}_{K}(y|x)$ as

[TABLE]

Thus, we have the decomposition of $e^{k(x,y)}$ in eq. (1) and its variant used in eq. (11) as

[TABLE]

where $e^{k_{\max}(y)}:=\phi_{0}/q_{0}(y)$ . Thus, when the environmental states are more complex than the phenotypic ones as $\#\mathfrak{S}_{y}\geq\#\mathfrak{S}_{x}$ , the decomposition of the replication rate, eq. (1), is general enough even if we do not explicitly assume the relation between metabolic allocation strategy and replication rate as in eq. (1).
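The construction can be tried numerically in the square case $\#\mathfrak{S}_{x}=\#\mathfrak{S}_{y}$ with linearly independent $\mathbf{v}_{x}$. The sketch below is a minimal illustration, assuming that the conditional probability takes the form $\mathbb{T}_{K}(y|x)=e^{k(x,y)}q_{0}(y)/\phi_{0}$, which is the form consistent with $e^{k_{\max}(y)}:=\phi_{0}/q_{0}(y)$. The function name `decompose` and the numerical values are hypothetical.

```python
import numpy as np

def decompose(k):
    """Split k[x, y] into k_max[y] and T_K[x, y] = T_K(y | x).

    Assumes e^{k} is square and invertible, so the solution is unique.
    """
    V = np.exp(k)                                  # rows are the vectors v_x
    u = np.linalg.solve(V, np.ones(V.shape[0]))    # V u = 1: all v_x . u are equal
    if np.any(u <= 0):
        raise ValueError("no strictly positive solution for this k(x, y)")
    q0 = u / u.sum()                               # normalized vector q_0
    phi0 = 1.0 / u.sum()                           # v_x . q_0 = phi_0 for every x
    T_K = V * q0 / phi0                            # e^{k(x,y)} q_0(y) / phi_0
    k_max = np.log(phi0 / q0)                      # e^{k_max(y)} = phi_0 / q_0(y)
    return k_max, T_K

# Build k(x, y) from a known pair (hypothetical values) and recover the pair.
k_max_true = np.log(np.array([3.0, 2.0]))
T_K_true = np.array([[0.7, 0.3],
                     [0.2, 0.8]])
k = k_max_true[None, :] + np.log(T_K_true)
k_max_rec, T_K_rec = decompose(k)
```

Each row of `T_K_rec` sums to one, and `k_max_rec[None, :] + np.log(T_K_rec)` reproduces `k`. When $\#\mathfrak{S}_{y}>\#\mathfrak{S}_{x}$ the linear system is underdetermined and a positive solution would have to be selected by other means, which this sketch does not attempt.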

If $\#\mathfrak{S}_{y}<\#\mathfrak{S}_{x}$ , the decomposition does not necessarily exist. Such a situation may occur when an organism has redundant phenotypic states, a fraction of which can be effectively realized by linear combinations of the others. While the redundant phenotypic states can appear in the process of mutation, $\#\mathfrak{S}_{y}<\#\mathfrak{S}_{x}$ is not biologically realistic because the environment is generally more complex than phenotype.

Finally, we note that the decomposition, eq. (59), is not unique except when $\#\mathfrak{S}_{x}=\#\mathfrak{S}_{y}$ and $\{\mathbf{v}_{x}|x\in\mathfrak{S}_{x}\}$ are linearly independent of each other. Nevertheless, our FRs hold irrespective of the choice of the decomposition. Achievability of the maximum average fitness, eq. (24), is affected by the choice because $\Phi_{0}$ depends on the decomposition and because $\mathbb{T}_{K}(y|x)$ must be changed so that it satisfies eq. (59) for a given $k(x,y)$. While we apparently have the freedom to choose the metabolic allocation strategy, $\mathbb{T}_{K}(y|x)$, for the maximization of $\left<\Psi^{b}\right>_{\mathbb{Q}}$ in eq. (24), eq. (24) is virtually reduced to the problem of maximizing the average fitness by changing only the phenotypic switching strategy $\mathbb{P}_{F}$ as

[TABLE]

because $\Psi^{b}$ is independent of the choice of the decomposition of $e^{k(x,y)}$. This is the problem addressed in our previous work Kobayashi and Sughiyama (2015). A tight upper bound of $\max_{\mathbb{T}_{F}}\left<\Psi^{b}\right>_{\mathbb{Q}}$ can be obtained by minimization of $\left<\Psi_{0}\right>_{\mathbb{Q}}$ as

[TABLE]

We again stress that our integral and detailed FRs generally hold for any given $\mathbb{T}_{F}$ and $\mathbb{T}_{K}$ .

Appendix B Numerical verification of FRs

For the numerical simulations shown in Figs. 3 and 5, we consider the case in which the environmental dynamics is Markovian as $\mathbb{T}_{E}^{F}(y_{t+1}|y_{t})$, and the signal is memorylessly generated from the environment as $\mathbb{T}_{E}^{F}(z_{t+1}|y_{t+1})$.

Realizations of the environmental history $\mathpzc{Y}_{t}$ and the sensing signal $\mathpzc{Z}_{t}$ are obtained by conducting a finite-state Markov transition following $\mathbb{T}_{E}^{F}(y_{t+1}|y_{t})$ and $\mathbb{T}_{E}^{F}(z_{t+1}|y_{t+1})$. For a given pair of realizations $\{\mathpzc{Y}_{t},\mathpzc{Z}_{t}\}$, the fitnesses $\Psi^{b}[\mathpzc{Y}_{t}]$ (without sensing) and $\Psi^{s}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]$ (with sensing) are calculated recursively as

[TABLE]

where $\mathcal{N}^{\mathpzc{Y}}_{t}$ and $\mathcal{N}^{\mathpzc{Y},\mathpzc{Z}}_{t}$ are obtained by solving eq. (2) recursively for the given $\{\mathpzc{Y}_{t},\mathpzc{Z}_{t}\}$ . Similarly, $\Psi_{0}[\mathpzc{Y}_{t}]$ is recursively computed as

[TABLE]

The point-wise directed information $i[\mathpzc{Z}_{t}\to\mathpzc{Y}_{t}]$ in eq. (48) and eq. (49) is recursively obtained as

[TABLE]

where $\mathbb{T}_{E}^{B}(z_{t+1}|y_{t})=\sum_{y_{t+1}}\mathbb{T}_{E}^{F}(z_{t+1}|y_{t+1})\mathbb{T}_{E}^{F}(y_{t+1}|y_{t})$ .
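For the memoryless signal considered here, the recursion for the point-wise directed information can be sketched as follows. This is one possible implementation and not necessarily the one used for Fig. 5. It assumes $q(z_{0}|y_{0})=\mathbb{T}_{E}^{F}(z_{0}|y_{0})$ and uses $\mathbb{Q}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]=\mathbb{Q}[\mathpzc{Y}_{t}]\prod_{\tau}\mathbb{T}_{E}^{F}(z_{\tau}|y_{\tau})$, so that each time step adds $\ln\mathbb{T}_{E}^{F}(z_{\tau+1}|y_{\tau+1})/\mathbb{T}_{E}^{B}(z_{\tau+1}|y_{\tau})$. The function name and the values of `T` and `q_y0` are hypothetical.

```python
import numpy as np

def pointwise_directed_info(Y, Z, T, E, q_y0):
    """i[Z -> Y] along one path, for a Markov environment and a memoryless signal.

    T[y, y2] = T_E^F(y2 | y),  E[y, z] = T_E^F(z | y),  q_y0[y] = q(y_0).
    """
    TB_z = T @ E                 # TB_z[y, z2] = T_E^B(z2 | y) = sum_y2 E[y2, z2] T[y, y2]
    q_z0 = q_y0 @ E              # marginal distribution of the first signal
    i = np.log(E[Y[0], Z[0]] / q_z0[Z[0]])
    for t in range(1, len(Y)):
        # Signal likelihood given the current state, relative to the
        # prediction of the signal from the previous environmental state.
        i += np.log(E[Y[t], Z[t]] / TB_z[Y[t - 1], Z[t]])
    return i

T = np.array([[0.6, 0.2, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])          # hypothetical environmental dynamics
E = np.array([[0.9, 0.1],
              [0.1, 0.9],
              [0.5, 0.5]])               # signal accuracy as described for Fig. 2 (C)
q_y0 = np.array([1 / 3, 1 / 3, 1 / 3])

i_example = pointwise_directed_info([0, 2, 1], [0, 1, 1], T, E, q_y0)
i_useless = pointwise_directed_info([0, 2, 1], [0, 1, 1], T, np.full((3, 2), 0.5), q_y0)
```

When the signal carries no information about the environment, as in `i_useless`, every term in the recursion vanishes and the point-wise directed information is zero for all paths.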

Note that, when a feedback relation exists between $y_{t}$ and $z_{t}$ , the computation of $\Psi_{0}[\mathpzc{Y}_{t}]$ and $i[\mathpzc{Z}_{t}\to\mathpzc{Y}_{t}]$ becomes more demanding because $\mathbb{Q}[\mathpzc{Y}_{t}]$ has to be obtained by marginalizing $\mathpzc{Z}_{t}$ in $\mathbb{Q}[\mathpzc{Y}_{t},\mathpzc{Z}_{t}]$ .

References

Chevin, Lande, and Mace (2010) L.-M. Chevin, R. Lande, and G. M. Mace, “Adaptation, plasticity, and extinction in a changing environment: Towards a predictive theory,” PLoS Biol 8, e1000357–8 (2010).
Frank (2011a) S. A. Frank, “Natural selection. I. Variable environments and uncertain returns on investment*,” J. Evol. Biol. 24, 2299–2309 (2011a).
Slatkin (1974) M. Slatkin, “Hedging one’s evolutionary bets,” Nature 250, 704–705 (1974).
Philippi and Seger (1989) T. Philippi and J. Seger, “Hedging one’s evolutionary bets, revisited.” Trends in Ecol Evol 4, 41–44 (1989).
Balaban et al. (2004) N. Q. Balaban, J. Merrin, R. Chait, L. Kowalik, and S. Leibler, “Bacterial persistence as a phenotypic switch.” Science 305, 1622–1625 (2004).
Wakamoto et al. (2013) Y. Wakamoto, N. Dhar, R. Chait, K. Schneider, F. Signorino-Gelo, S. Leibler, and J. D. McKinney, “Dynamic persistence of antibiotic-stressed mycobacteria,” Science 339, 91–95 (2013).
Cohen (1966) D. Cohen, “Optimizing reproduction in a randomly varying environment,” J. Theor. Biol. 12, 119–129 (1966).
Cohen (1971) D. Cohen, “Maximizing final yield when growth is limited by time or by limiting resources.” J. Theor. Biol. 33, 299–307 (1971).
Kobayashi and Sughiyama (2015) T. J. Kobayashi and Y. Sughiyama, “Fluctuation relations of fitness and information in population dynamics,” Phys. Rev. Lett. 115, 238102–238105 (2015).
Dhar and McKinney (2007) N. Dhar and J. D. McKinney, “Microbial phenotypic heterogeneity and antibiotic tolerance,” Curr Opin Microbiol 10, 30–38 (2007).
Beaumont et al. (2009) H. J. E. Beaumont, J. Gallie, C. Kost, G. C. Ferguson, and P. B. Rainey, “Experimental evolution of bet hedging,” Nature 461, 90–93 (2009).
Meacham and Morrison (2013) C. E. Meacham and S. J. Morrison, “Tumour heterogeneity and cancer cell plasticity,” Nature 501, 328–337 (2013).
Xue and Leibler (2016) B. Xue and S. Leibler, “Evolutionary learning of adaptation to varying environments through a transgenerational feedback,” Proc. Natl. Acad. Sci. U.S.A. 113, 11266–11271 (2016).
Perkins and Swain (2009) T. J. Perkins and P. S. Swain, “Strategies for cellular decision-making.” Mol Syst Biol 5, 326 (2009).
Ben-Jacob and Schultz (2010) E. Ben-Jacob and D. Schultz, “Bacteria determine fate by playing dice with controlled odds.” Proc. Natl. Acad. Sci. U.S.A. 107, 13197–13198 (2010).
Kobayashi and Kamimura (2012) T. J. Kobayashi and A. Kamimura, “Theoretical aspects of cellular decision-making and information-processing.” Adv. Exp. Med. Biol. 736, 275–291 (2012).
Brennan, Cheong, and Levchenko (2012) M. D. Brennan, R. Cheong, and A. Levchenko, “Systems biology. How information theory handles cell signaling and uncertainty.” Science 338, 334–335 (2012).
Lotka (1922a) A. J. Lotka, “Contribution to the energetics of evolution,” Proc. Natl. Acad. Sci. U.S.A. 8, 147–151 (1922a).
Lotka (1922b) A. J. Lotka, “Natural selection as a physical principle,” Proc. Natl. Acad. Sci. U.S.A. 8, 151–154 (1922b).
Iwasa (1988) Y. Iwasa, “Free fitness that always increases in evolution,” J. Theor. Biol. 135, 265–281 (1988).
de Vladar and Barton (2011) H. P. de Vladar and N. H. Barton, “The contribution of statistical physics to evolutionary biology,” Trends in Ecol Evol 26, 424–432 (2011).
Frank (2012a) S. A. Frank, “Natural selection. IV. The Price equation*,” J. Evol. Biol. 25, 1002–1019 (2012a).
Qian (2014) H. Qian, “Fitness and entropy production in a cell population dynamics with epigenetic phenotype switching,” Quant Biol 2, 47–53 (2014).
Mustonen and Lässig (2010) V. Mustonen and M. Lässig, “Fitness flux and ubiquity of adaptive evolution,” Proc. Natl. Acad. Sci. U.S.A. 107, 4248–4253 (2010).
Levins (1965) R. Levins, “Theory of fitness in a heterogeneous environment. V. Optimal genetic systems,” Genetics 52, 891–904 (1965).
Levins (1968) R. Levins, Evolution in Changing Environments, Some Theoretical Explorations (Princeton University Press, 1968).
Rivoire (2015) O. Rivoire, “Informations in models of evolutionary dynamics,” J Stat Phys 162, 1324–1352 (2015).
Haccou and Iwasa (1995) P. Haccou and Y. Iwasa, “Optimal mixed strategies in stochastic environments,” Theor Pop Biol 47, 212–243 (1995).
Bergstrom and Lachmann (2004) C. T. Bergstrom and M. Lachmann, “Shannon information and biological fitness,” IEEE Information Theory Workshop, 50–54 (2004).
Donaldson-Matasci, Bergstrom, and Lachmann (2010) M. C. Donaldson-Matasci, C. T. Bergstrom, and M. Lachmann, “The fitness value of information,” Oikos 119, 219–230 (2010).
Cover and Thomas (2012) T. M. Cover and J. A. Thomas, Elements of Information Theory (John Wiley & Sons, 2012).
Kussell and Leibler (2005) E. Kussell and S. Leibler, “Phenotypic diversity, population growth, and information in fluctuating environments,” Science 309, 2075–2078 (2005).
Frank (2012b) S. A. Frank, “Natural selection. V. How to read the fundamental equations of evolutionary change in terms of information theory,” J. Evol. Biol. 25, 2377–2396 (2012b).
Pugatch, Barkai, and Tlusty (2013) R. Pugatch, N. Barkai, and T. Tlusty, “Asymptotic Cellular growth rate as the effective information utilization rate,” arXiv , 1308.0623v3 (2013) .
(35) While the difference between phenotypic and environmental states is only $1$ , this does not affect our numerical verifications because our theory is proved for more general situations. However, if we consider more environmental states, we need more samples to verify the fluctuation relations, numerically.
Rivoire and Leibler (2011) O. Rivoire and S. Leibler, “The value of information for populations in varying environments,” J Stat Phys 142, 1124–1166 (2011).
Kelly (1956) J. L. Kelly, “A new interpretation of information rate,” Bell SysTech J 35, 917–926 (1956).
Rivoire and Leibler (2014) O. Rivoire and S. Leibler, “A model for the generation and transmission of variations in evolution.” Proc. Natl. Acad. Sci. U.S.A. 111, E1940–9 (2014).
Leibler and Kussell (2010) S. Leibler and E. Kussell, “Individual histories and selection in heterogeneous populations.” Proc. Natl. Acad. Sci. U.S.A. 107, 13183–13188 (2010).
Bianconi and Rahmede (2012) G. Bianconi and C. Rahmede, “Quantum mechanics formalism for biological evolution,” Chaos Solitons Fract 45, 555–560 (2012).
Wakamoto, Grosberg, and Kussell (2012) Y. Wakamoto, A. Y. Grosberg, and E. Kussell, “Optimal lineage principle for age-structured populations.” Evolution 66, 115–134 (2012).
Sughiyama et al. (2015) Y. Sughiyama, T. J. Kobayashi, K. Tsumura, and K. Aihara, “Pathwise thermodynamic structure in population dynamics,” Phys Rev E Stat Nonlin Soft Matter Phys 91, 032120 (2015).
Baake and Georgii (2006) E. Baake and H.-O. Georgii, “Mutation, selection, and ancestry in branching models: a variational approach,” J. Math. Biol. 54, 257–303 (2006).
Lambert and Kussell (2015) G. Lambert and E. Kussell, “Quantifying selective pressures driving bacterial evolution using lineage analysis,” Phys. Rev. X 5, 011016–10 (2015).
Demetrius and Gundlach (2014) L. Demetrius and V. Gundlach, “Directionality theory and the entropic principle of natural selection,” Entropy 16, 5428–5522 (2014).
Seifert (2012) U. Seifert, “Stochastic thermodynamics, fluctuation theorems and molecular machines,” Rep. Prog. Phys. 75, 126001 (2012).
Sagawa (2012) T. Sagawa, Thermodynamics of Information Processing in Small Systems (Springer, 2012).
King and Masel (2007) O. D. King and J. Masel, “The evolution of bet-hedging adaptations to rare scenarios,” Theoretical Population Biology 72, 560–575 (2007).
Giordano et al. (2016) N. Giordano, F. Mairet, J.-L. Gouzé, J. Geiselmann, and H. de Jong, “Dynamical allocation of cellular resources as an optimal control problem: Novel insights into microbial growth strategies,” PLoS Comput Biol 12, e1004802–28 (2016).
Kobayashi (2010) T. J. Kobayashi, “Implementation of dynamic Bayesian decision making by intracellular kinetics,” Phys. Rev. Lett. 104, 228104 (2010).
Barato, Hartich, and Seifert (2014) A. C. Barato, D. Hartich, and U. Seifert, “Nonequilibrium sensing and its analogy to kinetic proofreading,” New J. Phys. 17, 055026–19 (2014).
Kramer (1998) G. Kramer, Directed information for channels with feedback, Ph.D. thesis (1998).
Permuter, Kim, and Weissman (2011) H. H. Permuter, Y.-H. Kim, and T. Weissman, “Interpretations of directed information in portfolio theory, data compression, and hypothesis testing,” IEEE Trans. Inform. Theory 57, 3248–3259 (2011).
Kullback and Leibler (1951) S. Kullback and R. A. Leibler, “On information and sufficiency,” Ann Math Stat 22, 79–86 (1951).
Jarzynski and Wójcik (2004) C. Jarzynski and D. K. Wójcik, “Classical and quantum fluctuation theorems for heat exchange,” Phys. Rev. Lett. 92, 230602–4 (2004).
Lambert and Kussell (2014) G. Lambert and E. Kussell, “Memory and fitness optimization of bacteria under fluctuating environments,” PLoS Genet 10, e1004556–10 (2014).
Mayer et al. (2016) A. Mayer, T. Mora, O. Rivoire, and A. M. Walczak, “Diversity of immune strategies explained by adaptation to pathogen statistics,” Proc. Natl. Acad. Sci. U.S.A. 113, 8630–8635 (2016).
England (2013) J. L. England, “Statistical physics of self-replication,” J Chem Phys 139, 121923–9 (2013).
England (2015) J. L. England, “Dissipative adaptation in driven self-assembly,” Nat. Nanotechnol. 10, 919–923 (2015).
Pugatch (2015) R. Pugatch, “Greedy scheduling of cellular self-replication leads to optimal doubling times with a log-Frechet distribution,” Proc. Natl. Acad. Sci. U.S.A. 112, 2611–2616 (2015).
Govern and ten Wolde (2014) C. C. Govern and P. R. ten Wolde, “Optimal resource allocation in cellular sensing systems,” Proc. Natl. Acad. Sci. U.S.A. 111, 17486–17491 (2014).
ten Wolde et al. (2016) P. R. ten Wolde, N. B. Becker, T. E. Ouldridge, and A. Mugler, “Fundamental limits to cellular sensing,” J Stat Phys 162, 1395–1424 (2016).
