Fisher Waves: an individual based stochastic model

Bahram Houchmandzadeh; Marcel Vallade

arXiv:1703.02835·physics.bio-ph·August 2, 2017

Fisher Waves: an individual based stochastic model

Bahram Houchmandzadeh, Marcel Vallade

PDF

TL;DR

This paper introduces an individual-based stochastic model for the spread of beneficial mutations in spatial populations, revealing differences from classical models at high selection and linking parameters to dispersal behavior.

Contribution

It develops a stochastic model based on the spatial Moran process that explicitly treats fluctuations and connects model parameters to dispersal kernels.

Findings

01

At high selection, the model differs from classical FKPP predictions.

02

At low selection, front behavior resembles Brownian motion with drift.

03

Diffusion and noise are determined by dispersal kernel properties.

Abstract

The propagation of a beneficial mutation in a spatially extended population is usually studied using the phenomenological stochastic Fisher-Kolmogorov (SFKPP) equation. We derive here an individual based, stochastic model founded on the spatial Moran process where fluctuations are treated exactly. At high selection pressure, the results of this model are different from the classical FKPP. At small selection pressure, the front behavior can be mapped into a Brownian motion with drift, the properties of which can be derived from microscopic parameters of the Moran model. Finally, we show that the diffusion coefficient and the noise amplitude of SFKPP are not independent parameters but are both determined by the dispersal kernel of individuals.

Equations176

\frac{\partial u}{\partial t} = D \nabla^{2} u + a u (1 - u) + b u (1 - u) η (x, t)

\frac{\partial u}{\partial t} = D \nabla^{2} u + a u (1 - u) + b u (1 - u) η (x, t)

W_{i}^{+} (n)

W_{i}^{+} (n)

W_{i}^{-} (n)

n_{i}^{''} = (n_{i - 1} + n_{i + 1} - 2 n_{i}) .

n_{i}^{''} = (n_{i - 1} + n_{i + 1} - 2 n_{i}) .

\frac{d P ( n )}{d t}

\frac{d P ( n )}{d t}

n_{i} = N \mbox i f i \leq 0; = 0 \mbox o t h er w i se

n_{i} = N \mbox i f i \leq 0; = 0 \mbox o t h er w i se

\frac{d ⟨ n _{i} ⟩}{d t}

\frac{d ⟨ n _{i} ⟩}{d t}

\frac{\partial u}{\partial t} = D [1 + s (1 - u)] \frac{\partial ^{2} u}{\partial x ^{2}} + μ s u (1 - u)

\frac{\partial u}{\partial t} = D [1 + s (1 - u)] \frac{\partial ^{2} u}{\partial x ^{2}} + μ s u (1 - u)

c_{min} = 2 μ D s (1 + s)

c_{min} = 2 μ D s (1 + s)

2 D (1 + s) / μ s κ (s)

2 D (1 + s) / μ s κ (s)

\frac{d ⟨ n _{i} ⟩}{d t}

\frac{d ⟨ n _{i} ⟩}{d t}

\frac{d ⟨ n _{i} n _{j} ⟩}{d t}

U (t)

U (t)

B (t)

Z_{p} (t) = \frac{1}{N ^{2}} i \sum {⟨ n_{i} (t) n_{i + p} (t) ⟩ - n_{i} (0) n_{i + p} (0)}

Z_{p} (t) = \frac{1}{N ^{2}} i \sum {⟨ n_{i} (t) n_{i + p} (t) ⟩ - n_{i} (0) n_{i + p} (0)}

⟨ B (t) ⟩ = - Z_{0} (t) + ⟨ U (t) ⟩ + B (0)

⟨ B (t) ⟩ = - Z_{0} (t) + ⟨ U (t) ⟩ + B (0)

V (t)

V (t)

\frac{1}{μ} \frac{d ⟨ n _{i} ⟩}{d t}

\frac{1}{μ} \frac{d ⟨ n _{i} ⟩}{d t}

\frac{1}{μ} \frac{d ⟨ n _{i} n _{j} ⟩}{d t}

\frac{d}{d t} ⟨ U (t) ⟩ = \frac{m}{2} i \sum ⟨ n_{i}^{''} ⟩ = 0

\frac{d}{d t} ⟨ U (t) ⟩ = \frac{m}{2} i \sum ⟨ n_{i}^{''} ⟩ = 0

\frac{1}{m} \frac{d Z _{0}}{d t}

\frac{1}{m} \frac{d Z _{0}}{d t}

\frac{1}{m} \frac{d Z _{p}}{d t}

C_{p} = \frac{1}{N ^{2}} i \sum n_{i} (0) n_{p + i}^{''} (0)

C_{p} = \frac{1}{N ^{2}} i \sum n_{i} (0) n_{p + i}^{''} (0)

\hat{Z}_{p} (ω) = \frac{( 1 - β ) C _{0}}{ω} \frac{z}{z ^{2} + 2 α z + 2 β - 1} \frac{1}{z ^{p}}

\hat{Z}_{p} (ω) = \frac{( 1 - β ) C _{0}}{ω} \frac{z}{z ^{2} + 2 α z + 2 β - 1} \frac{1}{z ^{p}}

\hat{Z}_{p} (ω)

\hat{Z}_{p} (ω)

Z_{p} (t)

⟨ B (t) ⟩ = - Z_{0} (t) .

⟨ B (t) ⟩ = - Z_{0} (t) .

B_{\mbox e q} = m (N - 1) /2

B_{\mbox e q} = m (N - 1) /2

⟨ B ⟩ = i \sum ⟨ b_{i} ⟩ = \frac{1}{N ^{2}} i \sum ⟨ n_{i} (N - n_{i}) ⟩

⟨ B ⟩ = i \sum ⟨ b_{i} ⟩ = \frac{1}{N ^{2}} i \sum ⟨ n_{i} (N - n_{i}) ⟩

ν_{i}

ν_{i}

β_{i}

\frac{d V}{d t}

\frac{d V}{d t}

V (t) = m t + O (m t)

V (t) = m t + O (m t)

\frac{d ⟨ U ⟩}{d t}

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Full text

Fisher Waves: an individual based stochastic model.

B. Houchmandzadeh and M. Vallade

CNRS, LIPHY, F-38000 Grenoble, France

Univ. Grenoble Alpes, LIPHY, F-38000 Grenoble, France

Abstract

The propagation of a beneficial mutation in a spatially extended population is usually studied using the phenomenological stochastic Fisher-Kolmogorov (SFKPP) equation. We derive here an individual based, stochastic model founded on the spatial Moran process where fluctuations are treated exactly. At high selection pressure, the results of this model are different from the classical FKPP. At small selection pressure, the front behavior can be mapped into a Brownian motion with drift, the properties of which can be derived from microscopic parameters of the Moran model. Finally, we show that the diffusion coefficient and the noise amplitude of SFKPP are not independent parameters but are both determined by the dispersal kernel of individuals.

I Introduction.

One of the most fundamental questions in evolutionary biology is the spread of a mutant with fitness $1+s$ into a wild type population with fitness $1$ . In a non-structured population (i.e., for a population at dimension $d=0$ ), the answer to this question was found by KimuraKimura1962 nearly 50 years ago as a good approximate solution of the Fisher-Wright or the Moran model of population genetics, and better solutions of the Moran model have been proposed recentlyHouchmandzadeh2010 . For geographically structured populations however, the question is far from settled and only some specific information, such as the fixation probability, has received partial answers in a field that is now called evolutionary graph dynamicsLieberman2005 ; Houchmandzadeh2011b . For geographically structured populations where the main ingredients of the competing populations, *i.e., *the fitness, the carrying capacity and the diffusion of individuals, are independent of the space, the evolutionary dynamics has been mostly investigated through the stochastic Fisher Kolmogorov Petrovsky, Piscounov (SFKPP) equation

[TABLE]

where $u(x,t)$ is the local relative density of the mutant with respect to the local carrying capacity, $D$ is the diffusion coefficient of individuals, $a$ is proportional to the relative excess fitness of mutants; the last term is a noise term that captures the local genetic drift, where $b$ is related to the local carrying capacity and $\eta$ is a white noise. The problem that has attracted most attention is that of the front propagation : if at the initial time, one half of space is filled only with the mutant type and the other half only with the wild type, then the dynamics of the problem can be reduced to the dynamics of the front separating the two types.

The deterministic part of the equation (FKPP) was proposed by FisherFisher1937 and Kolmogorov, Petrovsky, PiscounovKolmogorov ; it has found applications in many areas of science ranging from ecology and epidemiologyMurray2007 to chemical kineticsBeuer1995 and particle physicsMunier2003 . The properties of the FKPP equation have been widely investigatedVansaarloos2003 . Specifically, this equation allows for traveling wave solutions and it is known that a stable solution of the FKPP is a wave front connecting the two regions $u=1$ and $u=0$ with velocity $c=(d/dt)\int_{\mathbb{R}}udx=2\sqrt{aD}$ and width $B=\int_{\mathbb{R}}u(1-u)dx=2\sqrt{D/a}$ .

The FKPP equation however is not well adapted to evolutionary population dynamics at small selection pressure, which is one of the relevant limits of population geneticsKingsolver2001 ; Nei2005 . The FKPP equation describes quantities (individuals, molecules,…) that at the fundamental level are discrete; the noise associated with this discreteness can play an important role in the dynamics of the front, specifically at small selection pressures. This problem was tackled phenomenologically by adding either a cutoff Brunet1997 or alternatively a noise term to the equation. The form of the noise in the SFKPP was proposed by Doering et al.Doering2003 . The SFKPP proposed by Doering et al. has now become a major mathematical tool for the investigation of Fisher Waves. It has specifically been used by Hallatschek and KorolevHallatschek2009 to investigate the properties of the front at small selection pressure, where they revealed the marked difference of the solutions with respect to the deterministic equation.

The SFKPP equation however is phenomenological and cannot be derived rigorously from a microscopic, individual based model of population genetics. Firstly, individual based models such as the Moran model are governed by discrete master equations and can be approximated by a Fokker-Planck equation, or their equivalent stochastic differential equation, only in the limit of large system size, i.e., large local carrying capacityEthier1977 ; Houchmandzadeh2010 . The local carrying capacity however does not appear explicitly in the SFKPP equation and it is difficult to assess the precision of the Focker-Planck approximation solely from this equation. Secondly, and more importantly, the noise term $\sqrt{bu(1-u)}\eta(x,t)$ in SFKPP is purely local. This noise term is rigorous only for a 0 dimensional system, where the equivalence between the Fokker-Planck approximation and the stochastic differential equation can be shown. For a spatially extended system, the noise term should also include fluctuations arising from adjacent lattice cells. To our knowledge, however, a rigorous derivation has not yet been achieved (see Mathematical Details V.1). The problem of noise arising from adjacent cells was also noted by Korolev et al.Korolev2010 . Finally, in an evolutionary model, both the diffusion coefficient and the noise amplitude are the result of the same phenomenon of individuals replacing each other randomly and they should be linked through an Einstein like relation.

The aim of the present article is to study the dynamics of the front between mutant and wild type individuals directly from the individual based, stochastic Moran model of population genetics. For this model, the Master equation can be stated without ambiguity or approximation. We show that the mean field approximation of the Master equation gives rise to a partial differential equation that differs from the FKPP equation and its predictions at high selection pressure. Going beyond mean field, we then derive the exact equations for the evolution of the various moments of the front for a one dimensional system and solve it at small selection pressure. In this approach, the noise term is not restricted to be only local. We show, in agreement with Hallatschek2009 that even for a neutral model (i.e., $s=0$ ), the front is well defined and the displacement of the front can be mapped into a Brownian motion at large times, the convergence time to this state is shown to be in $1/\sqrt{t}$ . The front drift and its velocity can then be derived at small selection pressure by a perturbatiion approach where we can show, in contrast to the FKPP predictions, that the speed of the front is linear in the excess relative fitness $s$ . Finally, we show that the *effective *local population size which controls the noise amplitude, and the diffusion coefficient are both determined by the dispersal kernel of individuals and cannot be chosen as independent parameters.

This article is organized as follow. Section II is devoted to the generalization of the Moran model to population geographically structured into demes/islands, where the dynamics of the front can be deduced from the internal population dynamics of the islands and their exchanges. We demonstrate in subsection II.1 how an FKPP-like equation emerges from the mean field approximation of the Master equation and show how it differs from the classical FKPP equation. The following subsections of section II are devoted to full stochastic treatments of the dynamics of the front. Section III goes beyond the island model and considers general migration kernels between individuals that are no longer grouped into demes. Solving the Master equation of the model shows how the island size and migration number between neighboring islands of section II are related through the dispersal kernel. The approach allows for the determination of the effective population size and therefore the noise amplitude. The final section is devoted to discussion and conclusions.

II The island model and mutants propagation in 1 dimension.

The fundamental model of population genetics for non structured populationsEwens2004 was formulated by Fisher and WrightFisher1999 . A continuous time version, which is also more mathematically tractable, was proposed by MoranMoran1962 . The extension of the Moran model to geographically extended populations was formulated by KimuraKimura1964a and MaruyamaMaruyama1974 and in more recent terminology is referred to as evolutionary dynamics on graphs Lieberman2005 . The model is also widely used in ecology, specifically in the framework of the neutral theory of biodiversityHubbel2001 ; Houchmandzadeh2003 ; Vallade2003 .

In this model, populations, formed of wild type individuals with fitness 1 and mutants with fitness $1+s$ are structured into cells (or demes or islands) each containing $N$ individuals. When an individual dies in one island, it is immediately replaced by the progeny of another, therefore keeping the number of individuals in each island always equal to $N$ . The replacement probability is weighted by the fitness of the individuals; moreover, the progeny stems from a local parent with probability $(1-m)$ or a parent from a neighboring island with probability $m$ (figure 1). These three parameters, $N$ , $s$ and $m$ are the only ingredients of this generic model.

Let us consider an infinitely extended one-dimensional collection of islands and call $n_{i}$ the number of mutant individuals on island $i$ . We use the vector $\mathbf{n}$ as a shorthand notation for the collection of these numbers $\mathbf{n}=\{...,n_{i},...\}$ . The transition probability densities for the number of mutants on island $i$ to increase/decrease by one individual isHouchmandzadeh2011b :

[TABLE]

where $\mu$ is the death rate of individuals and

[TABLE]

The rate of increase (2) for example is the probability density per unit of time that one wild type individual dies ( $\mu(N-n_{i})$ ) multiplied by the probability that it is replaced by a mutant, either from a local parent ( $(1-m)n_{i}/N$ ) or a neighboring parent ( $(m/2N)(n_{i-1}+n_{i+1})$ ), and multiplied by the fitness of the mutant $(1+s)$ . The fitness can be seen as an increase in the death rate of the wild type individuals, or a higher replacement probability/decreased death rate for the mutants. The probability $P(\mathbf{n},t)$ of observing the state $\mathbf{n}$ at time $t$ obeys the Master equation

[TABLE]

where $a_{i}^{\dagger}\mathbf{n}$ and $a_{i}\mathbf{n}$ are shorthand notations for states $\{...,n_{i}\pm 1,...\}$ . Without loss of generality (see Mathematical Details V.4), the initial condition we use throughout this article is that of an initial sharp front

[TABLE]

II.1 Mean field approximation.

The mean field approximation for $\left\langle n_{i}\right\rangle$ , the average number of individuals on island $i$ , is obtained by neglecting fluctuations (*i.e., *by setting $\left\langle n_{i}n_{k}\right\rangle=\left\langle n_{i}\right\rangle\left\langle n_{k}\right\rangle$ )

[TABLE]

Taking the space continuum limit by setting $x=\ell i$ , $u(x)=n_{i}/N$ , we obtain the partial differential equation

[TABLE]

where the length $\ell$ is the spatial extension of an island and $D=\mu m\ell^{2}/2$ is the diffusion coefficient. We observe that the mean field equation of the spatial Moran model is different from the FKPP equation in the diffusion term. Fisher himself, in his original articleFisher1937 , had stressed that using a simple diffusion term is an oversimplification of basic population genetics processes. The modification of the diffusion term has important consequences both on the speed of the propagation front and on its width. The minimum speed of the propagating wave in the equation (7) is now (see Mathematical details V.3)

[TABLE]

which scales as $s$ for high value of the excess fitness, in contrast to the scaling in $\sqrt{s}$ in the FKPP equation. Numerical resolutions of eq.(7) (Figure 2) show that $v_{min}$ computed above is an excellent estimator of the speed of the front.

Furthermore, the width of the front does not scale as $\sqrt{D/\mu s}$ as in the case of the FKPP equation, but is well approximated by

[TABLE]

where $\kappa(s)$ is a small correction (see Mathematical details V.3). Specifically, for large $s$ , the width converges to a constant $(15/8)\sqrt{D/\mu}$ .

A phenomenological argument can be used to understand these modifications to the FKPP equation. It is well knownPanja2003 that the dynamics of a pulled front is governed by the behavior at small $u$ . In these regions, the mean field equation (7) can indeed be approximated by an FKPP equation, with the effective diffusion coefficient $D_{eff}=D(1+s)$ .

We observe that the spatial Moran model differs significantly from the prediction of FKPP equation at high fitness $s$ . We will show below that the same is true at low fitness. This difference was first noted by Hallatschek and KorolevHallatschek2009 in their study of the SFKPP equation.

II.2 Stochastic characterization of the front.

Let us now come back to the full stochastic treatment of the propagating front. The temporal evolution of local population moments can be extracted from the Master equation (5) (see Appendix V.2):

[TABLE]

The most important global quantities are the front displacement $U(t)$ and its width $B(t)$ . These global quantities can be measured in terms of local populations by

[TABLE]

The width $B(t)$ weights the region where the mutant population is different from either [math] or $N$ , and in the continuous limit, can be expressed as $B=\int_{I}u(1-u)dx$ where $u(x)=n_{i}/N$ . Note that these quantities are always finite, as the sums involve only a finite number of non-zero terms. We restrict this paper to the computation of the first moments of these quantities, namely $\left\langle U(t)\right\rangle$ , $\mbox{Var}(U(t))$ and $\left\langle B(t)\right\rangle$ , where $\left\langle\right\rangle$ stands for ensemble average. The computation of these quantities implies the computation of the second moments

[TABLE]

The width of the front is then

[TABLE]

and the variance of its displacement is

[TABLE]

For the neutral front ( $s=0$ ), self consistent, exact equations without any moment closure approximation can be derived directly for the global quantities. At small selection pressures $Ns\ll 1$ , they can be recovered through a first order perturbation analysis.

II.3 Behavior of the neutral front $s=0$ .

For a one dimensional system where mutants and wild type have the same fitness ( $s=0$ ), we will show that the front separating these two populations can be envisioned as a well defined object that performs a Brownian motion and whose width fluctuates around an equilibrium value: $\left\langle U(t)\right\rangle=0$ and for large times, $V(t)=mt$ and $\left\langle B(t)\right\rangle=m(N-1)/2$ .

To obtain the above quantities, we sum over local fluctuations

[TABLE]

There are different contributions to $d\left\langle n_{i}n_{i}\right\rangle/dt$ : one is the local demographic noise $2\left\langle n_{i}(N-n_{i})\right\rangle/N$ , which appears in the SFKPP equation ; the other term, $m\left\langle n_{i}n^{\prime\prime}_{i}\right\rangle/N$ , is the demographic noise due to adjacent cells and cannot a priori be neglected*.* In the extreme case where $N=1$ and therefore $m=1$ , the local demographic noise is exactly zero, but the stochasticity of the system remains the same, as we will see below.

From now on, we will measure time in generation time units, i.e., set $t\leftarrow\mu t$ . By summing over the first moments, we find trivially that the mean front position stays at its initial value

[TABLE]

The second moments on the other hand obey a set of linear differential equations

[TABLE]

where $\alpha=(1-m)/(Nm)$ , $\beta=1/N$ and the coefficients $C_{p}$ depend on the initial conditions :

[TABLE]

The parameters $\alpha$ and $\beta$ measure the relative contribution to the demographic noise of local versus adjacent cells . The parameter $\beta$ can be neglected with respect to $\alpha$ only in the limit of small migration probability $m\ll 1$ .

For an initially sharp front (eq. 6), $C_{p}=-\delta_{p,0}$ . We stress that we can assume this condition without loss of generality (see Appendix V.4)

The above system (15,16) can be solvedHouchmandzadeh2003 exactly. In the Laplace space where $\hat{Z}_{p}(\omega)=\int_{\mathbb{R}^{+}}\exp(-\omega t)Z_{p}(t)dt$ , the solution is particularly simple,

[TABLE]

where $\omega=(z+1/z)-2$ . By taking the inverse Laplace transform the exact solution of $Z_{p}(t)$ can be found as a combination of modified Bessel functionsHouchmandzadeh2003 . In this article, we are mostly concerned with the large time limit, which can be deduced from the expansion of $\hat{Z}_{p}(\omega)$ around $\omega=0$ :

[TABLE]

where $K=m(N-1)+2$ . The above approximation is valid for $t\gg p^{2}$ ; a uniform large time approximation for all $p$ can also be found in terms of combinations of erf functionsHouchmandzadeh2003 , but is not needed here.

As $B(0)=0$ and $\left\langle U(t)\right\rangle=0$ , eq.(13) implies that

[TABLE]

The front therefore reaches a finite width

[TABLE]

and the equilibrium value is reached as $1/\sqrt{mt}$ . Figure 3a shows the perfect agreement of these results with numerical simulations. The above equilibrium value of the width is also in agreement with the value found from the SFKPP equationHallatschek2009 $4Db^{-1}$ if the amplitude of the noise term is interpreted as $b=4\mu\ell/N$ . As noted by Hallatschek and KorolevHallatschek2009 , genetic drift alone can maintain a finite front width at $s=0$ in one dimension. Moreover, numerical simulations of the discrete model show that the width distribution probability of the front has an exponential tail (figure 3b)

Note that the width $B$ defined above as

[TABLE]

weights the regions with populations $0<n_{i}<N$ , but contains no information about their spatial distribution. A spatially wide front composed for example of alternating $n=0$ and $n=N$ islands will have $B=0$ .

The shape of the front can be characterized more precisely by using the moving frame of the front as the reference frame and computing the mean relative mutant number $\nu_{i}$ and their weight $\beta_{i}$ in this frame

[TABLE]

where $[U]$ is the integer part of the front displacement given by the relation (10). These quantities are difficult analytically but are readily computed by numerical simulation, as shown in Figure 4. As it can be observed, the mean front shape $\beta_{i}$ , which is a function of $N$ and $m$ (Figure 4a) is spatially extended and decreases slowly as a function of $i$ , the distance to the center of the front (Fig. 4c). The width $B_{\mbox{eq}}$ computed above remains however a good indicator of the mean front shape, and all $\beta_{i}$ curves can be superimposed when the normalized index $i/B_{\mbox{eq}}$ is used(Fig. 4b).

The variance of the position of the front $V(t)=\left\langle U^{2}(t)\right\rangle-\left\langle U(t)\right\rangle^{2}$ can be extracted by similar methods from equation (14) :

[TABLE]

As $Z_{p}(t)$ can be computed exactly, the temporal evolution of the variance can also be computed exactly. The result is particularly simple for large times $t\gg 1$

[TABLE]

The surprising result is that for large times, the diffusion of the front is independent of its width. The figure 5 shows the agreement of this expression with numerical solutions.

II.4 Behavior of the front at small $s$ .

For a non-zero excess relative fitness, the moment closure does not hold and the front characteristics can no longer be derived exactly. It is however possible to derive the front speed to the first order of the perturbation $s$ .

For $s>0$ the position of the front is given by

[TABLE]

At small selection pressures $Ns\ll 1$ , on expanding the above expression to the first order of perturbation, we find, in the limit of large time

[TABLE]

Note that at small $s$ , the front speed scales as the selection pressure $Ns$ . Even for $N=1$ when the front width $B_{eq}=0$ , the front acquires a non-zero speed (figure 6).

The above computation of the position of the front at $s>0$ , which is a first order moment, requires the knowledge of second order moments $Z_{i}$ at $s=0$ . The same line of argument shows that computing the variance of the position and width of the front for $s>0$ , which are second order moments, necessitates the computation of third order statistical quantities. Even though computation of higher momenta is theoretically possible in the neutral case $s=0$ , their effective computation remains extremely tedious.

Figure 6 shows the result of stochastic based numerical simulations for a wide range of $s$ and the agreement with expression (22) at small selection pressure. It can be observed that the mean field approximation becomes correct only at very high excess relative fitness $s$ and local population size $N$ . Fluctuations modify significantly the prediction of the FKPP model.

III Microscopic model and general migration kernel.

In the SFKPP description of the mutant wave (eq.1) the diffusion coefficient $D$ and the amplitude of the noise $b$ are considered independent parameters. The same is true for the island model of the preceding section where the population size of the island $N$ and the migration probability $m$ between neighboring islands were considered independent. However, at the individual level of evolution, both migration and genetic drift are the result of the same phenomenon of individuals replacing each other. These two parameters must therefore be linked through an Einstein-like relation and cannot be independent.

There is a level of arbitrariness in the island model in the manner in which individuals are grouped together and* *deme size $N$ is chosen. As the amount of fluctuations is critically controlled by $N$ , the grouping process is crucial. This arbitrariness also impacts on the migration probability. The very existence of a unique migration rate between nearest neighbor islands can be brought into question. Consider for example the low migration limit ( $m\ll 1$ ) of the island model: two individuals physically far apart from each other but grouped into the same island will have a higher probability of replacing each other than two close individuals belonging to neighboring islands.

A more rigorous approach to this problem would be to consider a “microscopic” model where individuals are uniformly distributed in space and not arbitrarily grouped into demes/islands. In this microscopic model, migration/replacement is not restricted to nearest neighbors (figure 7a): when an individual dies at site $i$ , it has a probability $m_{i}^{j}$ of being replaced by the progeny of an individual at site $j$ . Solving exactly this strictly individual model then leads us to choose the *effective *population size of each deme, which is used in the stepping stone approach (figure 7b) and derive the exact relation between the diffusion coefficient and the noise amplitude, both of which are a function of the dispersal kernel $m_{i}^{j}$ .

In this evolutionary graph approach, each site contains exactly one individual (either wild type or mutant) $n_{i}=0,1$ ; the transition probability densities for the number of mutants on site $i$ to increase/decrease by one individual is a simple generalization of equations (2,3) . Thus:

[TABLE]

where $m_{i}^{j}$ is the probability that the progeny of an individual at site $j$ replaces an individual at site $i$ . In the literature of plants, the migration probability $m_{i}^{j}$ is known as the dispersal kernel and can be measured precisely in the fieldNathan2000 . In the following, we will consider dispersal kernels that depend only on the distance between two sites, $i.e.$ , $m_{i}^{j}=m(|j-i|)$ .

III.1 Mean field approximation.

Following the same steps as in subsection II.1, it is straightforward to deduce the mean field approximation of the corresponding master equation. For a migration probability that depends only on the distance between two sites, the mean field approximation is exactly the same as expression (7), where $\ell$ here is the inter-individual distance (lattice size), $\mu$ the death rate and the diffusion coefficient

[TABLE]

is given in terms of mean dispersal distance.

From now on and to avoid confusion, we will refer to all quantities derived in the island approximation (the macroscopic view) of section II by the super script ⋆ . The diffusion coefficient of the mean field approximation derived in subsection II.1 (relation 7), for example, is

[TABLE]

For a 1d system, the patch extension $\ell^{\star}=N^{\star}\ell$ (figure 7b) ; comparing expression (25) and (26) therefore leads to

[TABLE]

We observe here that the deme size $N^{\star}$ and the migration probability between demes $m^{\star}$ are indeed linked through equation (27).

III.2 Stochastic characterization of the neutral front at $s=0$ .

The relation (27) is not sufficient to determine the effective population size $N^{\star}$ of islands. To address this issue, we need to solve exactly the full stochastic model. We restrict the computation here to the neutral case $s=0$ , the derivations for $s\neq 0$ following precisely the steps developed previously.

The computational approach is similar to subsection II.2. As before, the displacement of the front is defined as

[TABLE]

and

[TABLE]

Therefore, for the neutral front $s=0$ , $d_{t}\left\langle U\right\rangle=0$ and $\left\langle U(t)\right\rangle=0$ .

The second order moments are also defined as before

[TABLE]

and we note that $Z_{p}=Z_{-p}.$ The equations governing $Z_{p}$ for the rescaled time $t\leftarrow\mu t$ are

[TABLE]

where the $C_{p}$ are defined by the initial condition

[TABLE]

Note that equation (30) implies that $Z_{0}(t)=0.$ This is due to the fact that $n_{i}^{2}=n_{i}$ and therefore $Z_{0}(t)=\left\langle U(t)\right\rangle$ .

For an initially sharp front

[TABLE]

which will be used here,

[TABLE]

For simplicity, we further restrict the solution of equations (30-31) to the generic geometric dispersal kernel

[TABLE]

where the parameter $\lambda$ controls the dispersal length $\kappa$ :

[TABLE]

The case $N^{\star}=1$ , $m^{\star}=1$ of the preceding section is obtained when $\lambda\rightarrow 0$ . Note that in the framework of the Moran model used here, a dead individual cannot replace itself, hence $m(0)=0$ . Moreover, for the geometric dispersal kernel, relation (33) becomes

[TABLE]

It is straightforward to check that again all $Z_{p}$ converge as $\propto 1/\sqrt{t}$ to the same value

[TABLE]

where $\kappa^{2}$ is defined in (35) and $\theta=2\sum_{k>0}k\,m(k)$$=1/(1-\lambda)$ . We observe that the stationary value of $-Z_{p}$ is given by a quantity similar to the variance of the dispersal kernel. Figure 8 shows the agreement between these results and individual based numerical simulation of the same system.

The width of the front can no longer be measured as in relation (11) by $B=\sum_{i}n_{i}(1-n_{i})$ which is always [math]. Other analog metrics such as

[TABLE]

can be used to characterize the front. It is straightforward to show that

[TABLE]

Figure 9 shows the excellent agreement between the theoretical results and the numerical simulations.

The variance of the front displacement $V=\left\langle U^{2}\right\rangle-\left\langle U\right\rangle^{2}$ can be computed by methods analogous to the previous section:

[TABLE]

where the $W_{i}^{\pm}$ are the transition rates (23,24) and $C_{0}$ is defined by relation 33). For the geometric kernel $m(k)$ (eq. 34), using the long term solution (36) leads to

[TABLE]

III.3 Grouping into islands.

We now group individuals virtually into islands of size $N^{\star}$ and establish the condition under which the results obtained by the SFKPP/islands model are valid.

A patch of population size $N^{\star}$ is a virtual packing of individuals into a deme, which we refer to by its index $q$ (Figure 7b). The number of mutants in patch $q$ is

[TABLE]

where $b_{q}=a_{q}+N^{\star}-1$ and $a_{q+1}=b_{q}+1$ . Note that we have $N^{\star}$ possible choices for grouping individuals, as we can set $a_{q}=N^{\star}q+r$ , where $r=0$ , 1, …, $N^{\star}-1$ . We define the displacement and the width of the (macroscopic) front as in subsection II.2 (definitions 10,11)

[TABLE]

We compute the statistical properties of these quantities by taking into account the detailed migration kernel (subsection III.2) and compare them to the results obtained in subsection II.2 where migrations were approximated by a single migration probability $m^{\star}$ between neighboring demes.

The macroscopic displacement $U^{\star}$ (relation 41) is easily related to the microscopic displacement $U$ (relation 28) :

[TABLE]

and therefore $V^{\star}=\text{Var}(U^{\star})=\text{Var}(U)/N^{\star 2}$ . The variance of the microscopic displacement $V$ is given by relation (39), and hence, for long times,

[TABLE]

Comparing this expression to the relation $V^{\star}=m^{\star}t$ of the island model (section II) where $m^{\star}$ is the migration probability between demes, we see that we must have

[TABLE]

The above relation is a confirmation of relation (27), which we obtained by a mean field approximation. We can also compute, at small selection pressure $Ns$ , the speed of the front that we find to be $c=s\kappa^{2}/2$ . Comparing this result to the speed $c^{\star}=c/N^{\star}=sN^{\star}m^{\star}/2$ (eq. 22) of the island model leads again to the same relation between $m^{\star}$ and $N^{\star}$ as relation (44).

The macroscopic width of the front (relation 42) can also be computed in terms of microscopic quantities (see Appendix V.5)

[TABLE]

where $Y_{p}$ are the microscopic front width (defined by relation 37). For large times, relations (38) and (36) lead to

[TABLE]

The above relation is in perfect agreement with numerical simulations. Comparing the above relation to the width of the front $B^{\star}=m^{\star}(N^{\star}-1)/2$ of the island model (relation 18) and using relation (44) for $m^{\star}$ , we find that we must have

[TABLE]

The above result determines the effective population size in terms of the dispersal kernel. More precisely,

[TABLE]

We note that $m^{\star}$ is weakly dependent on $\lambda$ over its whole range of variation [0, 1], whereas $N^{\star}$ diverges as the dispersion characteristic length when $\lambda\rightarrow 1$ .

IV Discussion and conclusion.

In this article, we have used the formalism of the spatial Moran model to study the propagation of mutants in a one dimensional geographically extended population. The propagation of the mutant wave has usually been studied in the framework of the SFKPP equation (1). The SFKPP equation however is a phenomenological approach and its derivation from the fundamental models of population genetics such as Wright-Fisher or Moran is not obvious. The deterministic Fisher equation for the dynamics of the proportion $u(t)$ of a mutant in a non-structured population is $d_{t}u=su(1-u)$ . For geographically structured populations, it seemed naturalFisher1937 to add a spatial diffusion term and generalize simply this equation to $\partial_{t}u=D\Delta u+su(1-u)$ , where $u(x,t)$ is the *local *proportion of the mutant. On the other hand, since the time of Fisher and Wright, it was obvious that genetic drift is an important factor at small selection pressure. For large non-structured ( $d=0)$ populations, Kimura tackled this problem by using a Fokker-Planck approximation of the Master equations governing the WF or Moran models. The stochastic differential equation associated with the Kimura equation is $d_{t}u=su(1-u)+\sqrt{u(1-u)/N}\eta(t)$ . It then seemed natural to unite the two approaches and propose the SFKPP equation (1).

We see here that many assumptions were made in this process : (i) the form of the diffusion term may be different ; (ii) $u(x,t)$ is a local relative density and the noise term of SFKPP would be a good approximation only if the number of individuals in each patch where $u$ has been computed is large enough ; (iii) the noise term $\sqrt{u(1-u)/N}$ itself was obtained for a non-structured population and it is far from obvious that it should be the same for an extended population and not involve the spatial derivative of $u$ .

The individual based approached that we develop in this article is intended to overcome these problems and to ground the SFKPP approach on a firmer basis. Using an explicit spatial island model, we have shown first that the diffusion term is indeed different from the FKPP equation (relation 7) and this difference has important consequences on the speed and width of the front for large selection pressures (relation 8,9).

For small selection pressures, we derive the parameters of the front (speed, diffusion coefficient and width) without any assumption on the size of the local population and without neglecting the non-local noise. These results are in agreement with the predictions of SFKPP equation at small selection pressure as developed by Hallatschek2011 .

Finally, by taking into account the explicit migration kernel, we establish the relation between the amplitude of the diffusion and that of the noise ; this approach also allows us to define the effective size of the local population $N$ , which is the crucial parameter controlling the noise as a function of the dispersal length.

Individual based models have the same level of complexity as their equivalent stochastic differential equation approach. We believe that the formalism developed in this article is a step forward in the search for a better understanding of natural populations and the dynamics of mutant waves.

Acknowledgements.

We are grateful to Erik Geissler for the critical reading of the manuscript and fruitful discussions.

V Mathematical details.

V.1 The noise term in SFKPP

The argument for the phenomenological noise term used by Doering et al. can be rephrased as follows in the framework of population genetics. For a non-structured population (a population at $d=0)$ , in an ecosystem with carrying capacity of $N$ individuals formed of wild type individuals with fitness $1$ and mutants with fitness $1+s$ , the transition rates for the one-step Moran process is Houchmandzadeh2010

[TABLE]

where $n$ is the number of mutants. The probability $P(n,t)$ of observing $n$ mutants at time $t$ is governed by the Master equation associated with these rates

[TABLE]

For a large ecosystem ( $N\gg 1$ ) at small selection pressure $(Ns\ll 1$ ), the above Master equation can be approximated by a Fokker-Plank equation called the Kimura equationKimura1962 ; Ewens2004 ; Houchmandzadeh2010

[TABLE]

where $u=n/N$ and time is measured in units of $1/\mu$ . The above diffusion equation is equivalent to a stochastic differential equation for the density $u$ Gardiner2004 :

[TABLE]

The origin of the noise is the genetic drift due to the size of the system.

For a spatially extended system, the Doering et al. phenomenological approach to derive the SFKPP consists in adding a (spatial) diffusion term $D\nabla^{2}u$ to the stochastic equation 47, but conserving the same local noise term, and neglecting fluctuations from adjacent cells.

V.2 Moment computation algebra.

The rules of moment computations are fairly standard (see for example Pechenik1999 ), but we give them here for self-consistency. Various moments can be extracted directly from the Master Equation

[TABLE]

by multiplying it by some operator and then making the change of variable $\mathbf{n}\rightarrow a_{k}\mathbf{n}$ or $a_{k}^{\dagger}\mathbf{n}$ . Consider for example

[TABLE]

After replacing $\partial_{t}P$ by its value from equation 48, the first term on the r.h.s. of the above equation reads:

[TABLE]

Changing the variable $\mathbf{n}\rightarrow a_{k}^{\dagger}\mathbf{n}$ implies $n_{i}\rightarrow n_{i}+\delta_{i,k}$ and $n_{j}\rightarrow n_{j}+\delta_{j,k}$ and

[TABLE]

Computing now the second term and grouping all the terms, we get

[TABLE]

For the neutral case $s=0$ we have

[TABLE]

and

[TABLE]

So finally

[TABLE]

V.3 Front speed and width in the mean field approximation.

The minimum speed of propagating front of the equation

[TABLE]

is obtained by following the original FisherFisher1937 approach. For a propagating front, on setting $\partial_{t}u=-v\partial_{x}u$ and then setting $g=-du/dx$ , we get the equation for $g(u)$ :

[TABLE]

Setting $p=dg/du|_{u=0}$ as the slope of the curve at the origin $u=0$ , the equation for $p$ is

[TABLE]

which has a solution only for

[TABLE]

The width $B=\int_{\mathbb{R}}u(1-u)dx$ of the front can be computed following the same approach. Setting $v=\partial_{t}\int_{\mathbb{R}}udx$ , exchanging the derivation on $t$ and integration on $x$ , and performing integration by parts on the propagating front, we get

[TABLE]

The shape of the front $g(u)$ is not known. However, $g(u)$ is a smooth function, $g(0)=g(1)=0$ and its slope at both ends, $p=dg/du|_{u=0}$ and $q=dg/du|_{u=1}$ are known and determined from equation (50). Approximating then $g(u)$ by a third order polynomial that respects these constraints

[TABLE]

leads to:

[TABLE]

where

[TABLE]

V.4 Choice of initial conditions.

The definition of various moments we use in this article such as relations (10-12) ensures that the infinite sums over sites contain only a finite number of non-zero terms; it avoids the problem of spurious effects due to manipulation of divergent series. However, the initial front may not need to be sharp, but only finite.

Consider the discrete function

[TABLE]

where the position [math] corresponds the to middle of the initial front. We can redefine the moments as

[TABLE]

The differential equations we derived throughout this article remain invariant under this definition of the moments, the only difference being that the initial values of these moments are non-zero.

V.5 Relation between microscopic and macroscopic front width.

By definition,

[TABLE]

where $n^{\star}$ is defined by relation (40), $b_{q}=a_{q}+N^{\star}-1$ and $a_{q+1}=b_{q}+1$ . Note that we have $N^{\star}$ possible choices for grouping individuals, as we can set $a_{q}=N^{\star}q+r$ , where $r=0$ , 1, …, $N^{\star}-1$ . As $n_{i}=0,1,$ the first term in the above sum is zero and $n_{i}^{2}=n_{i}$ . Rearranging the indices in the second term, we have

[TABLE]

Note that for an unrestricted sum

[TABLE]

However, the problem with expression (52) is that we are missing the terms $n_{i}n_{i+k}$ , where $i$ is in one cell and $i+k$ in another one. In fact, for each cell, we are missing $k$ terms of the form $n_{i}n_{i+k}$ connecting two neighboring cells. We now use our freedom to choose the grouping $r$ : we use $N^{\star}$ different choices of $r$ and sum all of them. A term missing in one choice of $r$ will be recovered in another. As the result must not depend on the choice of $r$ , we have

[TABLE]

The index manipulation is clearer when performed manually on a few simple examples such as $N^{\star}=2$ or 3.

V.6 Numerical simulations.

All numerical simulations are written in C++, and data analysis is performed by the high level language JuliaBezanson2014 . Numerical simulation of the island model (section II) is performed by a Gillespie algorithm by computing the jump probabilities from the transition rates (2,3). For the generalized migration kernel of section III, the Gillespie approach is too cumbersome and a direct approach has been used: the index of an individual is chosen at random and it is replaced by the value of another individual chosen according to the kernel $m(k)$ .

Bibliography31

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1[1] M Kimura. On the probability of fixation of mutant genes in a population. Genetics , 47:713–719, 1962.
2[2] B Houchmandzadeh and M Vallade. Alternative to the diffusion equation in population genetics. Phys Rev E Stat Nonlin Soft Matter Phys , 82(5 Pt 1):51913, 2010.
3[3] Erez Lieberman, Christoph Hauert, and Martin A Nowak. Evolutionary dynamics on graphs. Nature , 433(7023):312–316, 2005.
4[4] Bahram Houchmandzadeh and Marcel Vallade. The fixation probability of a beneficial mutation in a geographically structured population. New Journal of Physics , 13(7):073020, jul 2011.
5[5] R.A. Fisher. The wave of advance of advantageous genes. Annals of Eugenics , 7:355–369, 1937.
6[6] A. N. Kolmogorov. A study of diffusion equation with increase in the amount of substance. In Selected work of N.A. Kolmogorov , chapter 38, pages 242–271. Kluwer Academic Publishers, 1991.
7[7] James D. Murray. Mathematical Biology: I. An Introduction (Interdisciplinary Applied Mathematics) (Pt. 1) . Springer, 2007.
8[8] H.P. Beuer, W. Huber, and F. Petruccione. The Macroscopic Limit in a Stochastic Reaction-Diffusion Process. Europhysics Letter , 30(April):69–74, 1995.

TL;DR

Contribution

Findings

Abstract

Peer Reviews

Videos

Fisher Waves: an individual based stochastic model.

Abstract

I Introduction.

II The island model and mutants propagation in 1 dimension.

II.1 Mean field approximation.

II.2 Stochastic characterization of the front.

II.3 Behavior of the neutral front s=0s=0s=0.

II.4 Behavior of the front at small sss.

III Microscopic model and general migration kernel.

III.1 Mean field approximation.

III.2 Stochastic characterization of the neutral front at s=0s=0s=0.

III.3 Grouping into islands.

IV Discussion and conclusion.

Acknowledgements.

V Mathematical details.

V.1 The noise term in SFKPP

V.2 Moment computation algebra.

V.3 Front speed and width in the mean field approximation.

V.4 Choice of initial conditions.

V.5 Relation between microscopic and macroscopic front width.

V.6 Numerical simulations.

II.3 Behavior of the neutral front $s=0$ .

II.4 Behavior of the front at small $s$ .

III.2 Stochastic characterization of the neutral front at $s=0$ .