Well posedness and Maximum Entropy Approximation for the Dynamics of Quantitative Traits

Katarina Bodova; Jan Haskovec; Peter Markowich

arXiv:1704.08757·math.AP·August 1, 2018

TL;DR

This paper analyzes the well-posedness of a degenerate Fokker-Planck equation modeling quantitative trait dynamics, proves exponential convergence to equilibrium, and introduces a modified maximum entropy method for moment approximation with improved performance.

Contribution

It establishes well-posedness and spectral gap results for a degenerate Fokker-Planck equation and develops a modified maximum entropy method applicable across all parameters.

Findings

01

Proved existence and uniqueness of solutions for the Fokker-Planck equation.

02

Established exponential convergence to equilibrium under certain conditions.

03

Demonstrated improved approximation performance of the modified DynMaxEnt method.

Abstract

We study the Fokker-Planck equation derived in the large system limit of the Markovian process describing the dynamics of quantitative traits. The Fokker-Planck equation is posed on a bounded domain and its transport and diffusion coefficients vanish on the domain's boundary. We first argue that, despite this degeneracy, the standard no-flux boundary condition is valid. We derive the weak formulation of the problem and prove the existence and uniqueness of its solutions by constructing the corresponding contraction semigroup on a suitable function space. Then, we prove that for the parameter regime with high enough mutation rate the problem exhibits a positive spectral gap, which implies exponential convergence to equilibrium. Next, we provide a simple derivation of the so-called Dynamic Maximum Entropy (DynMaxEnt) method for approximation of moments of the Fokker-Planck solution,…

Tables

Table 1. Relative errors of the approximation (6.1) of the moment $\left\langle\ln(\xi)\right\rangle_{u(t)}$ by the original DynMaxEnt method (5.3), first row, and its modified version (5.5), second row. The corresponding plots are given in Fig. 2.

	$α = 1.1$	$α = 1.5$	$α = 2.5$	$α = 3.0$
method (5.3)	$1.45 \times 10^{- 2}$	$4.09 \times 10^{- 3}$	$1.42 \times 10^{- 3}$	$1.37 \times 10^{- 3}$
method (5.5)	$3.78 \times 10^{- 3}$	$1.30 \times 10^{- 3}$	$3.65 \times 10^{- 4}$	$3.41 \times 10^{- 4}$

Table 2. Relative errors of the approximation (6.1) of the moment $\left\langle\ln(\xi)\right\rangle_{u(t)}$ by the modified DynMaxEnt method (5.5). The corresponding plots are given in Fig. 3.

	$α = 0.7$	$α = 0.5$	$α = 0.3$	$α = 0.2$
method (5.5)	$1.42 \times 10^{- 2}$	$2.81 \times 10^{- 2}$	$5.79 \times 10^{- 2}$	$8.88 \times 10^{- 2}$

Table 3. Relative errors of the approximation (6.1) of the moments $\left\langle\ln(\xi)\right\rangle$, $\left\langle\xi\right\rangle$ and $\left\langle\xi^{\prime}\right\rangle$ by the original DynMaxEnt method (5.2), first row, and its modified version (6.2), second row. The initial and target parameters are $4\mu_{0}=2$, $\eta_{0}=-1$, $\gamma_{0}=2$, $4\mu=1.1$, $\eta=1$, $\gamma=0$. The corresponding plots are given in Fig. 4.

	$⟨ \ln (ξ) ⟩$	$⟨ ξ ⟩$	$⟨ ξ^{'} ⟩$
method (5.2)	$9.24 \times 10^{- 3}$	$1.30 \times 10^{- 2}$	$5.03 \times 10^{- 2}$
method (6.2)	$6.79 \times 10^{- 3}$	$1.01 \times 10^{- 2}$	$4.14 \times 10^{- 2}$

Table 4. Relative errors of the approximation (6.1) of the moments $\left\langle\ln(\xi)\right\rangle$, $\left\langle\xi\right\rangle$ and $\left\langle\xi^{\prime}\right\rangle$ by the modified DynMaxEnt method (6.2). The initial and target parameters are $4\mu_{0}=2$, $\eta_{0}=-1$, $\gamma_{0}=2$, $4\mu=0.5$, $\eta=1$, $\gamma=0$. The corresponding plots are given in Fig. 5.

	$⟨ \ln (ξ) ⟩$	$⟨ ξ ⟩$	$⟨ ξ^{'} ⟩$
method (6.2)	$2.45 \times 10^{- 2}$	$2.55 \times 10^{- 2}$	$1.24 \times 10^{- 1}$

Equations

$$\frac{\partial u}{\partial t} = -\frac{1}{2}\sum_{i=1}^{L}\frac{\partial}{\partial x_{i}}\left(\xi_{i}\frac{\partial({\bm{\alpha}}\cdot{\bf A})}{\partial x_{i}}u\right) + \frac{1}{4N}\sum_{i=1}^{L}\frac{\partial^{2}}{\partial x_{i}^{2}}\left(\xi_{i}u\right),$$

$${\bf A} = (\xi_{1}^{\prime},\dots,\xi_{L}^{\prime},\xi_{1},\dots,\xi_{L},\ln\xi_{1},\dots,\ln\xi_{L})$$

$${\bm{\alpha}}\cdot{\bf A} = -\beta\sum_{i=1}^{L}\gamma_{i}\xi_{i}^{\prime} + 2h\sum_{i=1}^{L}\eta_{i}\xi_{i} + 2\mu\sum_{i=1}^{L}\ln\xi_{i},$$

$${\bm{\alpha}} = (-\gamma_{1},\dots,-\gamma_{L},2\eta_{1},\dots,2\eta_{L},2\mu,\dots,2\mu)\in\mathbb{R}^{3L}.$$

$$u_{\bm{\alpha}} = \frac{1}{\mathbb{Z}_{\bm{\alpha}}}\frac{\exp(2N{\bm{\alpha}}\cdot{\bf A})}{\prod_{i=1}^{L}\xi_{i}},$$

$$\mathbb{Z}_{\bm{\alpha}} := \int_{\Omega_{\bf x}}\frac{\exp(2N{\bm{\alpha}}\cdot{\bf A})}{\prod_{i=1}^{L}\xi_{i}}\,\mathrm{d}{\bf x}.$$

$$\frac{\partial u}{\partial t} = \nabla_{\bf x}\cdot\left(Du_{\bm{\alpha}}\nabla_{\bf x}\left(\frac{u}{u_{\bm{\alpha}}}\right)\right)$$

$$\partial_{x}\left(Du_{\bm{\alpha}}\partial_{x}\left(\frac{u}{u_{\bm{\alpha}}}\right)\right) = f$$

$$u_{\bm{\alpha}} = \mathbb{Z}_{\bm{\alpha}}^{-1}\xi^{-1}\exp(2N{\bm{\alpha}}\cdot{\bf A}) = \mathbb{Z}_{\bm{\alpha}}^{-1}\xi^{4N\mu-1}\exp(2N\gamma\xi^{\prime}+4N\eta\xi),$$

$$Du_{\bm{\alpha}}\partial_{x}\left(\frac{u}{u_{\bm{\alpha}}}\right) = 0\quad\mbox{for } x\in\{0,1\}.$$

$$Du_{\bm{\alpha}}\partial_{x}\left(\frac{u}{u_{\bm{\alpha}}}\right) = \int_{1/2}^{x}f(s)\,\mathrm{d}s + C_{1},$$

$$C_{1} = \int_{0}^{1/2}f(s)\,\mathrm{d}s.$$

$$u = C_{2}u_{\bm{\alpha}} + C_{1}u_{\bm{\alpha}}\int_{1/2}^{x}\frac{\mathrm{d}s}{D(s)u_{\bm{\alpha}}(s)} + u_{\bm{\alpha}}\int_{1/2}^{x}\frac{F(s)\,\mathrm{d}s}{D(s)u_{\bm{\alpha}}(s)},$$

$$\int_{1/2}^{x}\frac{\mathrm{d}s}{D(s)u_{\bm{\alpha}}(s)}\approx\xi^{-4N\mu+1}\quad\mbox{close to } x\in\{0,1\},$$

$$Du_{\bm{\alpha}}\nabla_{\bf x}\left(\frac{u}{u_{\bm{\alpha}}}\right)\cdot\nu = 0\quad\mbox{a.e. on }\partial\Omega_{\bf x},$$

$$u(t=0) = u_{0}\quad\mbox{on }\Omega_{\bf x}.$$

$$\mathcal{B}({\bf x}) := \sum_{i=1}^{L}\ln\xi(x_{i}) - 2N{\bm{\alpha}}\cdot{\bf A}({\bf x}),$$

$$\frac{\partial u}{\partial t}=\nabla_{\bf x}\cdot\bigl(D(\nabla_{\bf x}u+u\nabla_{\bf x}\mathcal{B})\bigr)$$

$$y_{i} := y(x_{i}) := 2\sqrt{N}\int_{0}^{x_{i}}\frac{\mathrm{d}s}{\sqrt{\xi(s)}} = 4\sqrt{N}\arcsin\sqrt{x_{i}},$$

$$\overline{u}({\bf y}) := J({\bf x}({\bf y}))\,u({\bf x}({\bf y})),\qquad J({\bf x}) := (2\sqrt{N})^{-L}\prod_{i=1}^{L}\xi^{1/2}(x_{i})$$

$$\frac{\partial\overline{u}}{\partial t} = \nabla_{\bf y}\cdot\left(\nabla_{\bf y}\overline{u}+\overline{u}\nabla_{\bf y}(\overline{\mathcal{B}}-\ln\overline{J})\right),$$

$$\frac{1}{2\sqrt{N}}\sum_{i=1}^{L}\overline{J}^{-1}\sqrt{\xi_{i}}\left(\partial_{y_{i}}\overline{u}+\overline{u}\,\partial_{y_{i}}(\overline{\mathcal{B}}-\ln\overline{J})\right)\nu_{i} = 0\quad\mbox{a.e. on }\partial\Omega_{\bf y},$$

$$\left(\nabla_{\bf y}\overline{u}+\overline{u}\nabla_{\bf y}(\overline{\mathcal{B}}-\ln\overline{J})\right)\cdot\nu = 0\quad\mbox{a.e. on }\partial\Omega_{\bf y},$$

$$\overline{u}_{\bm{\alpha}} := \overline{\mathbb{Z}}_{\bm{\alpha}}^{-1}\exp\bigl(-(\overline{\mathcal{B}}-\ln\overline{J})\bigr),\qquad\overline{\mathbb{Z}}_{\bm{\alpha}} := \int_{\Omega_{\bf y}}\exp\bigl(-(\overline{\mathcal{B}}-\ln\overline{J})\bigr)\,\mathrm{d}{\bf y}.$$

$$z({\bf y}) := \overline{u}({\bf y})/\sqrt{\overline{u}_{\bm{\alpha}}({\bf y})},$$

$$\frac{\partial z}{\partial t} = \Delta_{\bf y}z - V({\bf y})z,$$

$$V({\bf y}) = \frac{\Delta_{\bf y}\sqrt{\overline{u}_{\bm{\alpha}}}}{\sqrt{\overline{u}_{\bm{\alpha}}}} = \frac{1}{2}\frac{\Delta_{\bf y}\overline{u}_{\bm{\alpha}}}{\overline{u}_{\bm{\alpha}}} - \frac{1}{4}\frac{|\nabla_{\bf y}\overline{u}_{\bm{\alpha}}|^{2}}{\overline{u}_{\bm{\alpha}}^{2}},$$

$$V({\bf y}) = -\frac{1}{2}\Delta_{\bf y}(\overline{\mathcal{B}}-\ln\overline{J}) + \frac{1}{4}\left|\nabla_{\bf y}(\overline{\mathcal{B}}-\ln\overline{J})\right|^{2}.$$

$$\sqrt{\overline{u}_{\bm{\alpha}}}\left(\nabla_{\bf y}z+\frac{1}{2}z\nabla_{\bf y}(\overline{\mathcal{B}}-\ln\overline{J})\right)\cdot\nu = 0\quad\mbox{a.e. on }\partial\Omega_{\bf y}.$$

$$V = \frac{1}{16N}\left(4N\mu-\frac{1}{2}\right)\left(4N\mu-\frac{3}{2}\right)\sum_{i=1}^{L}\frac{(\xi_{i}^{\prime})^{2}}{\xi_{i}} + \mbox{(bounded terms)},$$

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Full text

Well posedness and Maximum Entropy Approximation for the Dynamics of Quantitative Traits

Katarína Boďová

Institute of Science and Technology Austria (IST Austria), Klosterneuburg A-3400, Austria

Jan Haskovec

Computer, Electrical and Mathematical Sciences & Engineering

King Abdullah University of Science and Technology, 23955 Thuwal, KSA

Peter Markowich

Computer, Electrical and Mathematical Sciences & Engineering

King Abdullah University of Science and Technology, 23955 Thuwal, KSA

Abstract.

We study the Fokker-Planck equation derived in the large system limit of the Markovian process describing the dynamics of quantitative traits. The Fokker-Planck equation is posed on a bounded domain and its transport and diffusion coefficients vanish on the domain’s boundary. We first argue that, despite this degeneracy, the standard no-flux boundary condition is valid. We derive the weak formulation of the problem and prove the existence and uniqueness of its solutions by constructing the corresponding contraction semigroup on a suitable function space. Then, we prove that for the parameter regime with high enough mutation rate the problem exhibits a positive spectral gap, which implies exponential convergence to equilibrium.

Next, we provide a simple derivation of the so-called Dynamic Maximum Entropy (DynMaxEnt) method for approximation of moments of the Fokker-Planck solution, which can be interpreted as a nonlinear Galerkin approximation. The limited applicability of the DynMaxEnt method inspires us to introduce its modified version that is valid for the whole range of admissible parameters. Finally, we present several numerical experiments to demonstrate the performance of both the original and modified DynMaxEnt methods. We observe that in the parameter regimes where both methods are valid, the modified one exhibits slightly better approximation properties compared to the original one.

1 Introduction
2 Boundary conditions for the stationary problem
3 Existence and uniqueness of solutions
3.1 Formal calculations
3.2 Construction of solutions for the case $4N\mu\geq 1/2$
4 Spectral gap - exponential convergence to equilibrium
5 The Dynamical Maximum Entropy Approximation
5.1 Constrained entropy maximization
5.2 Derivation of the DynMaxEnt method
5.3 Scalar case
5.3.1 Solvability of the moment equation
6 Numerical experiments
6.1 Scalar case
6.2 Vector case

1. Introduction

The dynamics of allele frequencies ${\bf x}=(x_{1},\dots,x_{L})$ , where $L$ is the number of loci that contribute to the trait, can be described by a diffusion process using a deterministic forward Kolmogorov equation. The evolution of the joint probability density $u=u(t,{\bf x})$ of allele frequencies for a population of $N$ diploid individuals satisfies the linear Fokker-Planck equation

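$$\frac{\partial u}{\partial t} = -\frac{1}{2}\sum_{i=1}^{L}\frac{\partial}{\partial x_{i}}\left(\xi_{i}\frac{\partial({\bm{\alpha}}\cdot{\bf A})}{\partial x_{i}}u\right) + \frac{1}{4N}\sum_{i=1}^{L}\frac{\partial^{2}}{\partial x_{i}^{2}}\left(\xi_{i}u\right), \tag{1.1}$$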

on $\Omega_{\bf x}:=(0,1)^{L}$ , where we denoted $\xi_{i}:=\xi(x_{i})=x_{i}(1-x_{i})$ for $i=1,\dots,L$ . The diffusion term captures the stochasticity of the allele frequencies arising from random sampling. Here we assume that linkage disequilibria are negligible, otherwise this term would be of cross-diffusion type, reflecting correlations between loci [6]. The drift term captures deterministic effects on allele frequencies that are described by a vector of coefficients ${\bm{\alpha}}$ and a vector of complementary quantities ${\bf A}$ . We consider directional selection and dominance with symmetrical mutation, which, using the notation of [6], corresponds to the choice

$${\bf A} = (\xi_{1}^{\prime},\dots,\xi_{L}^{\prime},\xi_{1},\dots,\xi_{L},\ln\xi_{1},\dots,\ln\xi_{L})$$

and

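$${\bm{\alpha}}\cdot{\bf A} = -\beta\sum_{i=1}^{L}\gamma_{i}\xi_{i}^{\prime} + 2h\sum_{i=1}^{L}\eta_{i}\xi_{i} + 2\mu\sum_{i=1}^{L}\ln\xi_{i}, \tag{1.2}$$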

where the nondimensional parameters $\beta,h,\gamma_{i},\eta_{i}\in\mathbb{R}$ represent the effects of loci on the traits, $\mu>0$ is the mutation rate, and $\xi_{i}^{\prime}:=\xi^{\prime}(x_{i})=1-2x_{i}$ . For notational simplicity and without loss of generality, we set $\beta=h=1$ in the sequel, so that

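$${\bm{\alpha}} = (-\gamma_{1},\dots,-\gamma_{L},2\eta_{1},\dots,2\eta_{L},2\mu,\dots,2\mu)\in\mathbb{R}^{3L}.$$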

This drift-diffusion process (1.1) is known to be an accurate continuous-time approximation to a wide range of specific population genetics models [15, 16, 11, 10]. In order to represent the population in terms of allele frequencies, we must assume that linkage disequilibria are negligible, which will be accurate if recombination is sufficiently fast. For simplicity, we also assume two alleles per locus.

The main difficulty for analysis of the Fokker-Planck equation (1.1) is the degeneracy of the diffusion coefficients $\xi_{i}=x_{i}(1-x_{i})$ at the boundary of $\Omega_{\bf x}$ . Consequently, the task of prescribing boundary conditions that lead to a well-posed problem is far from obvious; see also [7, 8] for related issues in population genetics problems. As noted above, we aim at interpreting the solution $u$ as a time-dependent probability density, which calls for a no-flux boundary condition. In Section 2 we argue that the standard no-flux boundary condition is indeed appropriate for (1.1). In Section 3 we derive the weak formulation of (1.1) subject to the no-flux boundary condition and prove the existence and uniqueness of its solutions by constructing the corresponding contraction semigroup. Then, in Section 4 we prove that for the parameter regime with high enough mutation rate the problem exhibits a positive spectral gap, which implies exponential convergence to equilibrium.

In typical applications in quantitative genetics the solution of the Fokker-Planck equation (1.1) is not the main object of interest. One is rather interested in the evolution of certain moments of the solution that correspond to the macroscopic dynamics of observable quantitative traits. Therefore, Section 5 is devoted to the study of the so-called Dynamic Maximum Entropy (DynMaxEnt) method for approximation of moments of the Fokker-Planck solution. We first show in Section 5.1 that a related constrained entropy maximization is equivalent to a moment-matching problem, which we solve in a simple case. Then, in Section 5.2 we provide a simple and straightforward derivation of the DynMaxEnt method by adopting a quasi-stationary approximation, which results in a nonlinear system of ordinary differential equations. It can be interpreted as a nonlinear Galerkin approximation of the Fokker-Planck equation (1.1). However, this "original" DynMaxEnt method cannot be applied in the regime of small mutations, i.e., when $4N\mu\leq 1$. This inspires us to introduce, in Section 5.3, a modified version that is valid for the whole range of admissible parameters. Finally, in Section 6 we present several numerical experiments to demonstrate the performance of both the original and modified DynMaxEnt methods. We observe that in the parameter regimes where both methods are valid, the modified one exhibits slightly better approximation properties compared to the original one.

The surprisingly good approximation properties of the DynMaxEnt method, as documented by the numerical results in [6] and Section 6 of this paper, suggest that the infinite-dimensional dynamics of the Fokker-Planck equation (1.1) can be well approximated by suitable finite-dimensional dynamical systems. This is reminiscent of the recent series of works of E. Titi and collaborators [18, 12, 2, 17, 1] where a data assimilation (downscaling) approach to fluid flow problems is developed, inspired by ideas applied for designing finite-parameter feedback control for dissipative systems. The goal of a data assimilation algorithm is to obtain a (numerical) approximation of a solution of an infinite-dimensional dynamical system corresponding to given measurements of a finite number of observables. In particular, in [18], it has been shown that solutions of the two-dimensional Navier-Stokes equations can be well reconstructed from a relatively low number of low Fourier modes or local averages over finite volume elements. In [12], a continuous data assimilation (CPA) algorithm was proposed and analyzed for a two-dimensional Bénard convection problem, where the observables were incorporated as a feedback (nudging) term in the evolution equation of the horizontal velocity. In [2] CPA was applied for downscaling a coarse resolution configuration of the 2D Bénard convection equations into a finer grid, while in [17] the CPA method is studied for a three-dimensional Brinkman-Forchheimer-extended Darcy model of porous media, and in [1] for the three-dimensional Navier-Stokes-$\alpha$ model. Finally, in [13] the numerical performance of the CPA algorithm in the context of the two-dimensional incompressible Navier-Stokes equations was studied. It was shown that the numerical method is computationally efficient and performs far better than the analytical estimates suggest. This is similar to our numerical observations showing very good approximation properties of the DynMaxEnt method applied to the Fokker-Planck equation (1.1).

2. Boundary conditions for the stationary problem

The stationary solution of the Fokker-Planck equation (1.1) is of the form

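$$u_{\bm{\alpha}} = \frac{1}{\mathbb{Z}_{\bm{\alpha}}}\frac{\exp(2N{\bm{\alpha}}\cdot{\bf A})}{\prod_{i=1}^{L}\xi_{i}}, \tag{2.1}$$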

where $\mathbb{Z}_{\bm{\alpha}}$ is a normalization constant (partition function). We aim at interpreting the solution $u_{\bm{\alpha}}$ as a probability density; therefore, we set

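$$\mathbb{Z}_{\bm{\alpha}} := \int_{\Omega_{\bf x}}\frac{\exp(2N{\bm{\alpha}}\cdot{\bf A})}{\prod_{i=1}^{L}\xi_{i}}\,\mathrm{d}{\bf x}. \tag{2.2}$$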

Observe that the above integral is finite for $\mu>0$ , which we assumed. Let us rewrite (1.1) in the form

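$$\frac{\partial u}{\partial t} = \nabla_{\bf x}\cdot\left(Du_{\bm{\alpha}}\nabla_{\bf x}\left(\frac{u}{u_{\bm{\alpha}}}\right)\right) \tag{2.3}$$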

with $u_{\bm{\alpha}}$ defined in (2.1), and the diagonal diffusion matrix $D=D({\bf x})$ , $D_{ij}=\frac{1}{4N}\xi_{i}\delta_{ij}$ for $i,j=1,\dots,L$ .
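As a sanity check on the normalization, the partition function can be evaluated numerically in the simplest setting. The sketch below assumes the scalar case $L=1$ with $\gamma=\eta=0$ (an illustrative simplification, not a case singled out in the text), where the unnormalized stationary density reduces to $\xi^{a-1}$ with $a:=4N\mu$ and the integral is the Beta function $B(a,a)$. The function names and the substitution $x=\sin^{2}\theta$ are our own choices.

```python
import math

def partition_function(a, n=20000):
    """Approximate Z = int_0^1 xi(x)^(a-1) dx with xi = x(1-x), a = 4*N*mu > 0.

    Assumes L = 1 and gamma = eta = 0 (illustrative simplification).
    The substitution x = sin(theta)^2 turns the integral into
    Z = 2 * int_0^{pi/2} (sin(theta) cos(theta))^(2a-1) d(theta),
    which is evaluated here with the midpoint rule.
    """
    h = (math.pi / 2) / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        total += (math.sin(theta) * math.cos(theta)) ** (2 * a - 1)
    return 2.0 * h * total

def partition_function_exact(a):
    """Closed form for the same special case: B(a, a) = Gamma(a)^2 / Gamma(2a)."""
    return math.gamma(a) ** 2 / math.gamma(2 * a)
```

For $a\geq 1/2$ the transformed integrand is bounded and the midpoint rule converges quickly; for $0<a<1/2$ the integral is still finite, in line with the remark that $\mu>0$ suffices, but the quadrature converges more slowly because of the endpoint singularity.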

To provide an insight into the problem of prescribing valid boundary conditions for (2.3), we consider the related stationary problem in the spatially one-dimensional setting,

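$$\partial_{x}\left(Du_{\bm{\alpha}}\partial_{x}\left(\frac{u}{u_{\bm{\alpha}}}\right)\right) = f \tag{2.4}$$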

for $x\in(0,1)$ , where $f\in L^{1}(0,1)$ is a prescribed function with $\int_{0}^{1}f(s)\,\mathrm{d}s=0$ , $\xi=\xi(x)=x(1-x)$ , $D=\frac{1}{4N}\xi$ and

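$$u_{\bm{\alpha}} = \mathbb{Z}_{\bm{\alpha}}^{-1}\xi^{-1}\exp(2N{\bm{\alpha}}\cdot{\bf A}) = \mathbb{Z}_{\bm{\alpha}}^{-1}\xi^{4N\mu-1}\exp(2N\gamma\xi^{\prime}+4N\eta\xi),$$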

with $\mathbb{Z}_{\bm{\alpha}}$ defined in (2.2). We recall that $\mathbb{Z}_{\bm{\alpha}}$ is finite and $u_{\bm{\alpha}}$ is integrable for the relevant range of parameters. Moreover, note that the product $Du_{\bm{\alpha}}$ behaves like $\xi^{4N\mu}$ close to $x=0$ and $x=1$ , so that it vanishes at the boundary and leads to a degeneracy in the formal no-flux boundary condition

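$$Du_{\bm{\alpha}}\partial_{x}\left(\frac{u}{u_{\bm{\alpha}}}\right) = 0\quad\mbox{for } x\in\{0,1\}. \tag{2.5}$$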

To avoid possible difficulties due to this degeneracy, we integrate (2.4) for a fixed $x\in(0,1)$ on the interval $(1/2,x)$ ,

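$$Du_{\bm{\alpha}}\partial_{x}\left(\frac{u}{u_{\bm{\alpha}}}\right) = \int_{1/2}^{x}f(s)\,\mathrm{d}s + C_{1},$$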

where $C_{1}$ is an integration constant. We see that imposing the formal no-flux boundary condition (2.5) at, say, $x=0$ is equivalent to setting $C_{1}$ to the particular value

$$C_{1} = \int_{0}^{1/2}f(s)\,\mathrm{d}s.$$

The assumption $\int_{0}^{1}f(s)\,\mathrm{d}s=0$ then implies that (2.5) is verified at $x=1$ . Integrating once again yields

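$$u = C_{2}u_{\bm{\alpha}} + C_{1}u_{\bm{\alpha}}\int_{1/2}^{x}\frac{\mathrm{d}s}{D(s)u_{\bm{\alpha}}(s)} + u_{\bm{\alpha}}\int_{1/2}^{x}\frac{F(s)\,\mathrm{d}s}{D(s)u_{\bm{\alpha}}(s)}, \tag{2.6}$$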

with $F(s):=\int_{1/2}^{s}f(r)\,\mathrm{d}r$ . Observe that

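$$\int_{1/2}^{x}\frac{\mathrm{d}s}{D(s)u_{\bm{\alpha}}(s)}\approx\xi^{-4N\mu+1}\quad\mbox{close to } x\in\{0,1\},$$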

so that the second term in (2.6) is bounded on $[0,1]$ and thus integrable. Due to the boundedness of $F(s)$ , the same holds also for the third term in (2.6). Consequently, the solution $u$ constructed in (2.6) is integrable on $(0,1)$ .

We conclude that, for the aforementioned range of parameter values, the Fokker-Planck equation (2.3) has to be supplemented with the standard no-flux boundary condition (2.5) regardless of the degeneracy of $Du_{\bm{\alpha}}$ at the boundary. Although the above argument only applies to the spatially one-dimensional setting, it provides a strong heuristic hint that the conclusion also holds in the multidimensional case.
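The boundedness of the term $u_{\bm{\alpha}}\int_{1/2}^{x}\mathrm{d}s/(Du_{\bm{\alpha}})$ near the boundary can be illustrated numerically. The following is a minimal sketch under our own assumptions: scalar case, $\gamma=\eta=0$, and the illustrative value $4N\mu=2$, for which the normalization constant cancels and the integral has an elementary antiderivative. The helper names, the logarithmic substitution $s=e^{t}$ and the Simpson rule are choices made here, not part of the paper.

```python
import math

def xi(x):
    return x * (1.0 - x)

def second_term(x, a, N=1.0, panels=4000):
    """u_alpha(x) * int_{1/2}^{x} ds / (D(s) u_alpha(s)) for 0 < x <= 1/2.

    Assumes gamma = eta = 0, so u_alpha is proportional to xi^(a-1) with
    a = 4*N*mu, and D = xi / (4N); the normalization constant cancels.
    The integral is computed with Simpson's rule in the variable t = ln(s),
    which resolves the steep growth of the integrand near s = 0.
    """
    t0, t1 = math.log(0.5), math.log(x)
    h = (t1 - t0) / panels  # negative for x < 1/2, which orients the integral

    def g(t):
        s = math.exp(t)
        return xi(s) ** (-a) * s  # integrand xi(s)^(-a) times ds/dt

    total = g(t0) + g(t1)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * g(t0 + k * h)
    integral = total * h / 3.0
    return 4.0 * N * xi(x) ** (a - 1) * integral

def second_term_exact_a2(x, N=1.0):
    """Closed form for a = 2, using the antiderivative of 1 / (s(1-s))^2."""
    integral = -1.0 / x + 2.0 * math.log(x / (1.0 - x)) + 1.0 / (1.0 - x)
    return 4.0 * N * xi(x) * integral
```

In this special case the term tends to $-4N$ as $x\to 0$ rather than blowing up, which matches the scaling $\xi^{4N\mu-1}\cdot\xi^{-4N\mu+1}$ used in the argument above.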

3. Existence and uniqueness of solutions

In this section we construct solutions of the Fokker-Planck equation (2.3), supplemented with the boundary condition

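$$Du_{\bm{\alpha}}\nabla_{\bf x}\left(\frac{u}{u_{\bm{\alpha}}}\right)\cdot\nu = 0\quad\mbox{a.e. on }\partial\Omega_{\bf x}, \tag{3.1}$$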

where $\nu=\nu(x)$ denotes the unit normal vector to the boundary of $\Omega_{\bf x}$ . Moreover, we prescribe the initial condition

$$u(t=0) = u_{0}\quad\mbox{on }\Omega_{\bf x}. \tag{3.2}$$

Our strategy is to convert the problem to the Hamiltonian form $(-\Delta+V)$ for a suitable potential $V$ and construct the corresponding semigroup. In order to obtain some intuition, we first carry out the transform formally.

3.1. Formal calculations

Setting

$$\mathcal{B}({\bf x}) := \sum_{i=1}^{L}\ln\xi(x_{i}) - 2N{\bm{\alpha}}\cdot{\bf A}({\bf x}),$$

(2.3) is written in the form

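$$\frac{\partial u}{\partial t}=\nabla_{\bf x}\cdot\bigl(D(\nabla_{\bf x}u+u\nabla_{\bf x}\mathcal{B})\bigr)$$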

with the boundary condition $D(\nabla_{\bf x}u+u\nabla_{\bf x}\mathcal{B})\cdot\nu=0$. For $i=1,\dots,L$ we introduce the coordinate transform

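$$y_{i} := y(x_{i}) := 2\sqrt{N}\int_{0}^{x_{i}}\frac{\mathrm{d}s}{\sqrt{\xi(s)}} = 4\sqrt{N}\arcsin\sqrt{x_{i}}, \tag{3.3}$$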

and denote ${\bf y}:=(y_{1},\dots,y_{L})$ . Note that ${\bf x}\mapsto{\bf y}$ maps $\Omega_{\bf x}=(0,1)^{L}$ onto $\Omega_{\bf y}:=(0,Y_{N})^{L}$ with $Y_{N}:=2\pi\sqrt{N}$ . Introducing the new variable

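$$\overline{u}({\bf y}) := J({\bf x}({\bf y}))\,u({\bf x}({\bf y})),\qquad J({\bf x}) := (2\sqrt{N})^{-L}\prod_{i=1}^{L}\xi^{1/2}(x_{i}) \tag{3.4}$$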

transforms (2.3) to the form

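$$\frac{\partial\overline{u}}{\partial t} = \nabla_{\bf y}\cdot\left(\nabla_{\bf y}\overline{u}+\overline{u}\nabla_{\bf y}(\overline{\mathcal{B}}-\ln\overline{J})\right), \tag{3.5}$$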

with $\overline{\mathcal{B}}({\bf y}):=\mathcal{B}({\bf x}({\bf y}))$ and $\overline{J}({\bf y}):=J({\bf x}({\bf y}))$ . By ${\bf x}({\bf y})$ we denote the componentwise inverse transform $x_{i}=x(y_{i})$ . The no-flux boundary condition (3.1) transforms as

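$$\frac{1}{2\sqrt{N}}\sum_{i=1}^{L}\overline{J}^{-1}\sqrt{\xi_{i}}\left(\partial_{y_{i}}\overline{u}+\overline{u}\,\partial_{y_{i}}(\overline{\mathcal{B}}-\ln\overline{J})\right)\nu_{i} = 0\quad\mbox{a.e. on }\partial\Omega_{\bf y},$$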

where we use the shorthand notation $\xi_{i}=\xi(x_{i}(y_{i}))$. Note that the product $\overline{J}^{-1}\sqrt{\xi_{i}}$ is constant in $y_{i}$ and positive on the set $\bigl\{{\bf y}\in\partial\Omega_{\bf y};\,y_{i}\in\{0,Y_{N}\},\,0<y_{j}<Y_{N}\mbox{ for }j\neq i\bigr\}$. Consequently, the transformed boundary condition is equivalent to the nondegenerate expression

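$$\left(\nabla_{\bf y}\overline{u}+\overline{u}\nabla_{\bf y}(\overline{\mathcal{B}}-\ln\overline{J})\right)\cdot\nu = 0\quad\mbox{a.e. on }\partial\Omega_{\bf y}, \tag{3.6}$$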

which can be also written as $\nu\cdot\nabla_{\bf y}\ln(\overline{u}/\overline{u}_{\bm{\alpha}})=0$ a.e. on $\partial\Omega_{\bf y}$ . The steady state for (3.5)–(3.6) is

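$$\overline{u}_{\bm{\alpha}} := \overline{\mathbb{Z}}_{\bm{\alpha}}^{-1}\exp\bigl(-(\overline{\mathcal{B}}-\ln\overline{J})\bigr),\qquad\overline{\mathbb{Z}}_{\bm{\alpha}} := \int_{\Omega_{\bf y}}\exp\bigl(-(\overline{\mathcal{B}}-\ln\overline{J})\bigr)\,\mathrm{d}{\bf y}.$$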

Finally, setting

$$z({\bf y}) := \overline{u}({\bf y})/\sqrt{\overline{u}_{\bm{\alpha}}({\bf y})},$$

the Fokker-Planck equation (2.3) transforms to the Hamiltonian form

$$\frac{\partial z}{\partial t} = \Delta_{\bf y}z - V({\bf y})z, \tag{3.7}$$

with

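$$V({\bf y}) = \frac{\Delta_{\bf y}\sqrt{\overline{u}_{\bm{\alpha}}}}{\sqrt{\overline{u}_{\bm{\alpha}}}} = \frac{1}{2}\frac{\Delta_{\bf y}\overline{u}_{\bm{\alpha}}}{\overline{u}_{\bm{\alpha}}} - \frac{1}{4}\frac{|\nabla_{\bf y}\overline{u}_{\bm{\alpha}}|^{2}}{\overline{u}_{\bm{\alpha}}^{2}},$$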

which can be further expressed as

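$$V({\bf y}) = -\frac{1}{2}\Delta_{\bf y}(\overline{\mathcal{B}}-\ln\overline{J}) + \frac{1}{4}\left|\nabla_{\bf y}(\overline{\mathcal{B}}-\ln\overline{J})\right|^{2}. \tag{3.8}$$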

The boundary condition (3.6) transforms to

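$$\sqrt{\overline{u}_{\bm{\alpha}}}\left(\nabla_{\bf y}z+\frac{1}{2}z\nabla_{\bf y}(\overline{\mathcal{B}}-\ln\overline{J})\right)\cdot\nu = 0\quad\mbox{a.e. on }\partial\Omega_{\bf y}. \tag{3.9}$$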

Let us remark that with (1.2), $\overline{u}_{\bm{\alpha}}$ behaves like $\left(\prod_{i=1}^{L}\xi_{i}\right)^{4N\mu-1/2}$ close to the boundary, so that for $4N\mu-1/2>0$ the boundary condition (3.9) is degenerate.
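The change of variables used in this subsection can be sanity-checked numerically. The sketch below assumes the reading $y(x)=2\sqrt{N}\int_{0}^{x}\mathrm{d}s/\sqrt{\xi(s)}=4\sqrt{N}\arcsin\sqrt{x}$, which is the form consistent with the stated image interval $(0,Y_{N})$, $Y_{N}=2\pi\sqrt{N}$. The function names and the substitution $s=r^{2}$ in the quadrature are our own.

```python
import math

def y_of_x(x, N):
    """Closed form of the transformed coordinate, assuming y = 4*sqrt(N)*arcsin(sqrt(x))."""
    return 4.0 * math.sqrt(N) * math.asin(math.sqrt(x))

def x_of_y(y, N):
    """Inverse transform x = sin^2(y / (4*sqrt(N)))."""
    return math.sin(y / (4.0 * math.sqrt(N))) ** 2

def y_by_quadrature(x, N, n=20000):
    """2*sqrt(N) * int_0^x ds / sqrt(s(1-s)), computed by the midpoint rule.

    The substitution s = r^2 removes the singularity at s = 0:
    ds / sqrt(s(1-s)) = 2 dr / sqrt(1 - r^2), with r in (0, sqrt(x)).
    Intended for x < 1, where the transformed integrand is smooth.
    """
    r_max = math.sqrt(x)
    h = r_max / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * h
        total += 2.0 / math.sqrt(1.0 - r * r)
    return 2.0 * math.sqrt(N) * h * total
```

With these conventions `y_of_x(1.0, N)` returns $2\pi\sqrt{N}$, so the unit interval is mapped onto $(0,Y_{N})$ as stated in the text.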

Inserting the expression (1.2) for ${\bm{\alpha}}\cdot{\bf A}$ into (3.8) gives the explicit expression for the potential

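$$V = \frac{1}{16N}\left(4N\mu-\frac{1}{2}\right)\left(4N\mu-\frac{3}{2}\right)\sum_{i=1}^{L}\frac{(\xi_{i}^{\prime})^{2}}{\xi_{i}} + \mbox{(bounded terms)}, \tag{3.10}$$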

where (bounded terms) are expressions involving

[TABLE]

that are uniformly bounded on $\overline{\Omega}_{\bf y}$ . The unbounded term in $V$ is

[TABLE]

so for the potential to be bounded below, we need $4N\mu\geq 3/2$ .
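The coefficient of the unbounded term can be checked against a direct finite-difference evaluation of $V=\Delta_{\bf y}\sqrt{\overline{u}_{\bm{\alpha}}}/\sqrt{\overline{u}_{\bm{\alpha}}}$. The sketch below is a hedged illustration under our own assumptions: scalar case $L=1$, $\gamma=\eta=0$, $a:=4N\mu$, so that $\overline{u}_{\bm{\alpha}}\propto\xi^{a-1/2}$ and, with $\varphi:=y/(2\sqrt{N})$, $\xi=\tfrac14\sin^{2}\varphi$ and $\xi^{\prime}=\cos\varphi$. In this special case a short calculation gives the bounded term as the constant $-(a-\tfrac12)/(4N)$; that constant and all function names are ours.

```python
import math

def sqrt_u_bar(y, a, N):
    """Unnormalized sqrt of the transformed steady state (L = 1, gamma = eta = 0)."""
    phi = y / (2.0 * math.sqrt(N))
    xi = 0.25 * math.sin(phi) ** 2
    return xi ** (0.5 * (a - 0.5))

def potential_numeric(y, a, N, h=1e-4):
    """V = (sqrt(u))'' / sqrt(u), with a central second difference in y."""
    f0 = sqrt_u_bar(y, a, N)
    fp = sqrt_u_bar(y + h, a, N)
    fm = sqrt_u_bar(y - h, a, N)
    return (fp - 2.0 * f0 + fm) / (h * h * f0)

def potential_formula(y, a, N):
    """Unbounded term with coefficient (a - 1/2)(a - 3/2)/(16 N), plus the
    constant bounded term that arises in this special case."""
    phi = y / (2.0 * math.sqrt(N))
    xi = 0.25 * math.sin(phi) ** 2
    xi_prime = math.cos(phi)
    unbounded = (a - 0.5) * (a - 1.5) / (16.0 * N) * xi_prime ** 2 / xi
    bounded = -(a - 0.5) / (4.0 * N)
    return unbounded + bounded
```

Evaluating `potential_formula` close to $y=0$ shows the sign change of the singular term: it is large and positive for $a=2.5$ and large and negative for $a=1$, which is the mechanism behind the threshold $4N\mu\geq 3/2$.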

3.2. Construction of solutions for the case $4N\mu\geq 1/2$

In this section we shall construct weak solutions of the Fokker-Planck equation (2.3) with $4N\mu\geq 1/2$, subject to the no-flux boundary condition (3.1) and the initial datum (3.2). However, since the equivalent form (3.7) is more suitable for studying the asymptotic behavior of the solution for large times, we shall work with this formulation. Due to the issues caused by the degeneracy of the boundary condition, we shall start from a weak formulation of (2.3) and carry out the coordinate transform as in the previous section in order to arrive at a weak formulation of (3.7).

To obtain a symmetric form, we multiply (2.3) by $\varphi/u_{\bm{\alpha}}$ , with a test function $\varphi\in C^{\infty}(\overline{\Omega}_{\bf x})$ , and integrate by parts, taking into account the no-flux boundary condition (3.1). We arrive at

[TABLE]

Carrying out the coordinate transform ${\bf x}\mapsto{\bf y}$ (3.3), with the Jacobian $J$ given by (3.4), yields

[TABLE]

with $\overline{u}$ given by (3.4), $\overline{\varphi}({\bf y}):=J({\bf x}({\bf y}))\varphi({\bf x}({\bf y}))$ and $\,\mathrm{d}\overline{u}_{\bm{\alpha}}({\bf y}):=\overline{u}_{\bm{\alpha}}\,\mathrm{d}{\bf y}$ . Finally, defining $z:=\overline{u}/\sqrt{\overline{u}_{\bm{\alpha}}}$ and $\psi:=\overline{\varphi}/\sqrt{\overline{u}_{\bm{\alpha}}}$ , we arrive at

[TABLE]

We thus define the space

[TABLE]

with the scalar product

[TABLE]

and the induced norm $\left\|z\right\|_{\mathcal{H}_{\bf y}}^{2}:=(z,z)_{\mathcal{H}_{\bf y}}$ . Central for our analysis is the following result.

Lemma 1.

Let $4N\mu\geq 1/2$. Then for every $z\in\mathcal{H}_{\bf y}$ the following inequality holds:

[TABLE]

with $V$ defined in (3.8).

**Proof:** We have

[TABLE]

We integrate by parts in the last term of the right-hand side,

[TABLE]

With (1.2) we have

[TABLE]

Since $\xi_{i}$ vanishes for $y_{i}\in\{0,Y_{N}\}$ and $\xi_{i}^{\prime}$ is bounded on $[0,Y_{N}]$ , we have

[TABLE]

We write the boundary of the hypercube $\Omega_{\bf y}$ as a union of the pairs of faces,

[TABLE]

then we have

[TABLE]

where $\,\mathrm{d}S_{F_{i}}$ denotes the $(L-1)$-dimensional Lebesgue measure on $F_{i}$. Since $\xi_{i}^{\prime}(x(Y_{N}))=\xi_{i}^{\prime}(1)=-1$ and $\xi_{i}^{\prime}(x(0))=\xi_{i}^{\prime}(0)=1$, we have

[TABLE]

Therefore, if $4N\mu-\frac{1}{2}\geq 0$ ,

[TABLE]

Consequently,

[TABLE]

Finally, the above formal calculations are made rigorous by replacing $\overline{u}_{\bm{\alpha}}$ by $\overline{u}_{\bm{\alpha}}^{\varepsilon}:=\overline{u}_{\bm{\alpha}}+\varepsilon$ for $\varepsilon>0$ and subsequently passing to the limit $\varepsilon\to 0$.

Lemma 2.

Let $4N\mu\geq 1/2$ . Then the space $\mathcal{H}_{\bf y}$ defined in (3.12) with the scalar product $(\cdot,\cdot)_{\mathcal{H}_{\bf y}}$ is a Hilbert space, and is densely embedded into $L^{2}(\Omega_{\bf y})$ .

**Proof:** Completeness follows from the fact that if $z_{k}$ is a Cauchy sequence in $\mathcal{H}_{\bf y}$, then due to Lemma 1 it is also a Cauchy sequence in $L^{2}$. The density of the embedding into $L^{2}(\Omega_{\bf y})$ is due to the fact that the set of smooth functions with compact support is dense in $\mathcal{H}_{\bf y}$.

Definition 1.

We call $z\in L^{2}((0,T);\mathcal{H}_{\bf y})\cap C([0,T];L^{2}(\Omega_{\bf y}))$ a weak solution of (3.7) on $[0,T)$ subject to the boundary condition (3.9) if (3.11) holds for every $\psi\in\mathcal{H}_{\bf y}$ and almost all $t\in(0,T)$ , and the initial condition is satisfied by continuity in $C([0,T];L^{2}(\Omega_{\bf y}))$ .

We remark that a formal integration by parts in the right-hand side of (3.11) gives

[TABLE]

This justifies the interpretation of (3.11) as the weak formulation of (3.7) subject to the boundary condition (3.9).

We now define the operator $\mathcal{L}:D(\mathcal{L})\subset\mathcal{H}_{\bf y}\to L^{2}(\Omega_{\bf y})$ by its action

[TABLE]

for all $z$ , $\psi\in\mathcal{H}_{\bf y}$ . We shall prove that the closure $\overline{\mathcal{L}}$ of $\mathcal{L}$ generates a contraction semigroup on $L^{2}(\Omega_{\bf y})$ . For this sake, we study the resolvent problem

[TABLE]

for (some) $\lambda>0$ and $f\in L^{2}(\Omega_{\bf y})$ .

Lemma 3.

Let $4N\mu\geq 1/2$ . Then for every $f\in L^{2}(\Omega_{\bf y})$ the resolvent problem (3.14) has a unique solution $z\in\mathcal{H}_{\bf y}$ .

**Proof:** For a fixed $\lambda>0$ we define the bilinear form $a_{\lambda}:\mathcal{H}_{\bf y}\times\mathcal{H}_{\bf y}\to\mathbb{R}$,

[TABLE]

where $(z,\psi)$ denotes the standard scalar product on $L^{2}(\Omega_{\bf y})$ . The resolvent problem (3.14) with the no-flux boundary conditions is equivalent to

[TABLE]

A straightforward application of the Hölder inequality gives the continuity of $a_{\lambda}$ ,

[TABLE]

for a suitable constant $C>0$ ; coercivity is straightforward. Finally, the mapping $\psi\mapsto(f,\psi)$ with $f\in L^{2}(\Omega_{\bf y})$ is an element of the dual space $(\mathcal{H}_{\bf y})^{\prime}$ . Consequently, an application of the Lax-Milgram theorem yields the existence and uniqueness of the solution $z\in\mathcal{H}_{\bf y}$ .
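A finite-difference analogue makes the unique solvability of the resolvent problem concrete: discretizing $(-\Delta_{\bf y}+V+\lambda)z=f$ in one dimension gives a symmetric tridiagonal system that is strictly diagonally dominant whenever $V+\lambda>0$ on the grid. The sketch below is an illustration under our own assumptions, not the construction used in the proof: scalar case, $\gamma=\eta=0$, $a:=4N\mu>3/2$, the special-case potential $V=\frac{b(b-1)}{4N}\cot^{2}\varphi-\frac{b}{4N}$ with $b=a-\tfrac12$ and $\varphi=y/(2\sqrt{N})$, and homogeneous values at the two endpoints (a simplification motivated by $V\to+\infty$ there). All names are hypothetical.

```python
import math

def potential(y, a, N):
    """Scalar-case potential for gamma = eta = 0 (illustrative assumption)."""
    phi = y / (2.0 * math.sqrt(N))
    b = a - 0.5
    cot2 = (math.cos(phi) / math.sin(phi)) ** 2
    return b * (b - 1.0) * cot2 / (4.0 * N) - b / (4.0 * N)

def solve_resolvent(f, a, N, lam, M=400):
    """Solve the discrete analogue of (-d^2/dy^2 + V + lam) z = f on (0, Y_N).

    Interior grid points y_j = j*h, j = 1..M-1, with z = 0 at both ends.
    The tridiagonal system is solved with the Thomas algorithm.
    Returns the grid and the discrete solution.
    """
    Y = 2.0 * math.pi * math.sqrt(N)
    h = Y / M
    ys = [j * h for j in range(1, M)]
    off = -1.0 / (h * h)
    diag = [2.0 / (h * h) + potential(y, a, N) + lam for y in ys]
    rhs = [f(y) for y in ys]
    n = len(ys)
    # Forward elimination.
    c = [0.0] * n
    d = [0.0] * n
    c[0] = off / diag[0]
    d[0] = rhs[0] / diag[0]
    for j in range(1, n):
        denom = diag[j] - off * c[j - 1]
        c[j] = off / denom
        d[j] = (rhs[j] - off * d[j - 1]) / denom
    # Back substitution.
    z = [0.0] * n
    z[-1] = d[-1]
    for j in range(n - 2, -1, -1):
        z[j] = d[j] - c[j] * z[j + 1]
    return ys, z
```

For example, `solve_resolvent(lambda y: 1.0, 2.5, 1.0, 1.0)` uses $\lambda=1$, which exceeds the lower bound $-b/(4N)=-0.5$ of the potential in absolute value, so the discrete operator is positive definite, mirroring the coercivity used in the Lax-Milgram argument.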

Theorem 1.

Let $4N\mu\geq 1/2$ . Then the closure $\overline{\mathcal{L}}$ of $\mathcal{L}$ generates a contraction semigroup on $L^{2}(\Omega_{\bf y})$ .

**Proof:** Since $\mathcal{H}_{\bf y}$ is densely embedded into $L^{2}(\Omega_{\bf y})$, the operator $\mathcal{L}$ is densely defined and dissipative. Moreover, due to Lemma 3, the range of $-\mathcal{L}+\lambda$ is $L^{2}(\Omega_{\bf y})$ for all $\lambda>0$. The claim then follows by an application of the Lumer-Phillips theorem [20].

The contraction semigroup constructed in Theorem 1 provides the announced existence and uniqueness of weak solutions $z\in L^{2}((0,T);\mathcal{H}_{\bf y})\cap C([0,T];L^{2}(\Omega_{\bf y}))$ of (3.7) subject to the no-flux boundary condition (3.9) in the sense of Definition 1. The solutions are formally written as $z(t)=e^{\mathcal{L}t}z_{0}$ , where $z_{0}\in L^{2}(\Omega_{\bf y})$ is the initial datum; see, e.g., [20]. By the inverse coordinate transform to (3.3) we obtain weak solutions of the original Fokker-Planck equation (1.1) subject to the no-flux boundary condition (3.1).

4. Spectral gap - exponential convergence to equilibrium

In this Section we shall perform a spectral analysis of the operator $(-\mathcal{L})$ and prove that boundedness below of the potential $V$ (3.8) implies exponential convergence to equilibrium for (3.7). From the explicit expression (3.10) for $V$ we see that $V$ is bounded below if $4N\mu\geq 3/2$ .

Lemma 4.

Let $4N\mu\geq 3/2$ . Then the operator $(-\mathcal{L})$ defined in (3.13) has compact resolvent.

**Proof:** We need to show that for some $\lambda>0$ the operator $(-\mathcal{L}+\lambda)^{-1}$ is compact as a mapping from $L^{2}(\Omega_{\bf y})$ into itself. Let $f\in L^{2}(\Omega_{\bf y})$ and $z=(-\mathcal{L}+\lambda)^{-1}f$, constructed in Lemma 3. From Lemma 1 we have

[TABLE]

for some constant $C>0$ and $\lambda$ chosen such that $\min_{{\bf y}\in\Omega_{\bf y}}(V({\bf y})+\lambda)>0$. On the other hand, the Cauchy-Schwarz inequality gives

[TABLE]

so for sufficiently small $\varepsilon>0$ we conclude

[TABLE]

and the claim follows by the compact embedding of the Sobolev space $H^{1}$ into $L^{2}$ .

Together with the obvious self-adjointness of $(-\mathcal{L})$ , Lemma 4 implies that $(-\mathcal{L})$ has a discrete spectrum without finite accumulation points. Moreover, all its eigenvalues are nonnegative. This implies the existence of a positive spectral gap and, consequently, exponential convergence to equilibrium as $t\to\infty$ , see, e.g., [3].

5. The Dynamical Maximum Entropy Approximation

In typical applications in quantitative genetics the solution of the Fokker-Planck equation (1.1) is not the main object of interest. One is rather interested in the evolution of certain of its moments, which correspond to the macroscopic dynamics of observable quantitative traits. This naturally leads to the question whether one can derive a finite-dimensional system of differential equations that approximates the evolution of the moments of interest, avoiding the need to solve (1.1). This question has been studied previously by analogy with statistical mechanics: the allele frequency distribution is approximated by the stationary form, which maximizes the logarithmic relative entropy. Known as the Maximum Entropy Method, this approach has been applied to a broad spectrum of problems, including the statistics of neural spiking [24, 26], bird flocking [5], protein structure [27], and immunology [19]. For transient problems described by known dynamical equations (e.g., the Fokker-Planck equation), the Dynamical Maximum Entropy (DynMaxEnt) method assumes quasi-stationarity at each time point. It has been applied, e.g., to the modeling of cosmic ray transport [14], the general Fokker-Planck equation [21], the analysis of genetic algorithms [22], and population genetics [23, 4, 6]. In [6] it is observed that the "classical" DynMaxEnt method cannot be applied in the regime of small mutations, and the theory is extended for this regime to account for changes in mutation strength. Surprisingly, systematic numerical simulations document superb approximation properties of the method even far from the quasi-stationary regime. However, the derivation of analytic error estimates remains an open problem.

In this section we discuss several aspects of the DynMaxEnt method. First, in Section 5.1 we show that constrained maximization of a logarithmic entropy functional leads to a moment-matching condition. Then, in Section 5.2 we provide a simple and straightforward derivation of the DynMaxEnt method by adopting a quasi-stationary approximation. To the best of our knowledge, this derivation has not appeared before. Finally, in Section 5.3 we consider the scalar case and derive a modified version of the DynMaxEnt method, which is valid for the whole range of admissible parameters.

5.1. Constrained entropy maximization

We shall call the vector ${\bm{\alpha}}\in\mathbb{R}^{d}$ admissible if the corresponding normalization factor $\mathbb{Z}_{\bm{\alpha}}$ (2.2) is finite. For any integrable function $u\in L^{1}(\Omega_{\bf x})$ with $\int_{\Omega_{\bf x}}u({\bf x})\,\mathrm{d}{\bf x}=1$ and any admissible ${\bm{\alpha}}\in\mathbb{R}^{d}$ we define the logarithmic relative entropy

[TABLE]

where $u_{\bm{\alpha}}$ is the normalized stationary solution of the Fokker-Planck equation (1.1), given by formula (2.1). Note that this is a different approach compared with [6], where the logarithmic entropy is taken relative to the neutral distribution of allele frequencies in the absence of mutation or selection, $\prod_{i=1}^{L}\xi_{i}^{-1}$ , and the variational problem is complemented with normalization and moment constraints.

For a fixed $u\in L^{1}(\Omega_{\bf x})$ with finite ${\bf A}$ -moments, let us consider the maximization of the relative entropy (5.1) in terms of admissible ${\bm{\alpha}}\in\mathbb{R}^{d}$ , i.e., the task of maximizing the function ${\bm{\alpha}}\mapsto H(u|u_{\bm{\alpha}})$ . If a critical point exists, then for $i=1,\dots,d$ ,

[TABLE]

Consequently, if a maximizer ${\bm{\alpha}}^{*}$ exists, then the ${\bf A}$ -moments corresponding to $u_{{\bm{\alpha}}^{*}}$ must be matching the same moments of $u$ . This naturally leads to the question of solvability of the nonlinear system of equations

[TABLE]

in terms of the admissible parameter vector ${\bm{\alpha}}\in\mathbb{R}^{d}$ , for a given, normalized $u\in L^{1}(\Omega_{\bf x})$ with finite ${\bf A}$ -moments. To address this question seems to be a very difficult task that we leave open. We merely remark that the Hessian matrix of ${\bm{\alpha}}\mapsto H(u|u_{\bm{\alpha}})$ ,

[TABLE]

is equal to the covariance matrix of the random variables ${\bf A}$ with the probability density $u_{\bm{\alpha}}$ . Thus, the Hessian matrix is positive semidefinite. In the scalar case, solvability of the moment equation $\left\langle A\right\rangle_{u_{\alpha}}=\left\langle A\right\rangle_{u}$ can be studied for particular choices of $A$ . We shall give an example below in Section 5.3.1.

5.2. Derivation of the DynMaxEnt method

Let us consider $u=u(t)$ a solution of the Fokker-Planck equation (1.1) with admissible parameter vector ${\bm{\alpha}}$ , subject to the initial datum $u(t=0)=u_{{\bm{\alpha}}^{0}}$ for some admissible ${\bm{\alpha}}^{0}$ . The DynMaxEnt method is derived in two steps: First, we multiply the equation in its form (2.3) by the vector ${\bf A}$ and integrate,

[TABLE]

where we assumed that the boundary term in the integration by parts vanishes (note that, in general, this does not necessarily follow from (3.1)). In the second step, we substitute $u(t)$ in the above expression by $u_{{\bm{\alpha}}^{*}(t)}$ with some time-dependent parameter vector ${\bm{\alpha}}^{*}={\bm{\alpha}}^{*}(t)$ , which leads to

[TABLE]

where $\mathbf{R}$ is a vector-valued residuum term. We now introduce an approximation by neglecting the residuum $\mathbf{R}$ . Expanding the derivatives on both sides of the above equation leads then to

[TABLE]

where $\nabla_{\bf x}{\bf A}:\nabla_{\bf x}{\bf A}$ is the symmetric $d\times d$ matrix with the $(i,k)$ -component $\sum_{j=1}^{d}\partial_{x_{j}}A_{i}\partial_{x_{j}}A_{k}$ . The nonlinear ODE system for ${\bm{\alpha}}^{*}={\bm{\alpha}}^{*}(t)$ is called the DynMaxEnt method for approximation of the moments of (1.1). However, two comments have to be made: First, the matrix on the left-hand side,

[TABLE]

is positive semidefinite, since it is the covariance matrix of the observables ${\bf A}$ of the probability distribution $u_{{\bm{\alpha}}^{*}(t)}$. However, in order for (5.2) to be globally solvable, the covariance matrix must be uniformly (positive) definite, which in general may not be the case. Second, the matrix $\left\langle\xi\nabla_{\bf x}{\bf A}:\nabla_{\bf x}{\bf A}\right\rangle_{u_{{\bm{\alpha}}^{*}(t)}}$ may have infinite entries even for some admissible ${{\bm{\alpha}}^{*}(t)}$, and if this is the case, then again the ODE system is not solvable. Since these two issues are very hard to resolve in general, we shall below resort to a simple case where ${\bm{\alpha}}$ is a scalar.

5.3. Scalar case

To gain some more insight into the ODE (5.2), we consider the single locus case $x\in(0,1)$ with ${\bf A}$ being a scalar function $A=A(x)$ and $\alpha\in\mathbb{R}$ . The DynMaxEnt method (5.2) simplifies to the following ODE for $\alpha^{*}=\alpha^{*}(t)$ ,

[TABLE]

An application of the Cauchy-Schwarz inequality implies that

[TABLE]

and, moreover, equality holds if and only if $A$ is a constant function. Consequently, for every nonconstant $A$ the ODE (5.3) can be rewritten as

[TABLE]

However, the question of finiteness of the moment $\left\langle\xi(\partial_{x}A)^{2}\right\rangle_{u_{\alpha^{*}(t)}}$ can only be answered by making a particular choice for $A=A(x)$.

As a toy model, let us choose $A=A(x)$ to be the scalar function $\ln(\xi(x))$ . This corresponds to a population of individuals in a neutral environment ( $\beta=h=0$ in (1.2)) with the nonzero mutation rate $\alpha=2\mu$ . With the singularities at $x\in\{0,1\}$ , the function $A(x)=\ln(\xi(x))$ well represents the issues that one encounters with the generic choice (1.2). It is easily checked that the set of admissible values of $\alpha$ is the interval $(0,\infty)$ . Moreover, the moment $\left\langle\xi(\partial_{x}A)^{2}\right\rangle_{u_{\alpha^{*}}}$ is only finite for $\alpha^{*}>1$ . Consequently, the DynMaxEnt method (5.4) is only applicable if both the initial value $\alpha^{*}(0)=\alpha^{0}$ and $\alpha$ are strictly larger than $1$ . Then, since obviously the solution $\alpha^{*}(t)$ of (5.4) is a monotone function of time, it will stay strictly larger than $1$ for all $t\geq 0$ and asymptotically converge to $\alpha$ .
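The finiteness threshold $\alpha^{*}>1$ can be illustrated with a short computation. The sketch below rests on an assumption that is not spelled out in this section: for $A=\ln\xi$ the normalized stationary density is taken to be $u_{\alpha}=\xi^{\alpha-1}/B(\alpha,\alpha)$, with $B$ the Euler Beta function, which is consistent with the admissible range $(0,\infty)$ stated above. Since $\xi(\partial_{x}\ln\xi)^{2}=(1-2x)^{2}/\xi=(1-4\xi)/\xi$, the moment then equals $B(\alpha-1,\alpha-1)/B(\alpha,\alpha)-4=2/(\alpha-1)$ for $\alpha>1$ and diverges for $\alpha\leq 1$. The function names are illustrative only.

```python
import math


def log_beta(a, b):
    """Logarithm of the Euler Beta function B(a, b), valid for a, b > 0."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def singular_moment(alpha):
    """<xi (d/dx ln xi)^2> under the ASSUMED density xi^(alpha-1) / B(alpha, alpha).

    Uses (1 - 2x)^2 = 1 - 4*xi, so the moment is
    B(alpha-1, alpha-1) / B(alpha, alpha) - 4.
    Returns math.inf for alpha <= 1, where the integral diverges at x = 0 and x = 1.
    """
    if alpha <= 1.0:
        return math.inf
    return math.exp(log_beta(alpha - 1.0, alpha - 1.0) - log_beta(alpha, alpha)) - 4.0


def midpoint_moment(alpha, n=20_000):
    """Midpoint-rule cross-check of the same moment.

    Only meant for alpha >= 2, where the integrand is bounded on [0, 1].
    """
    h = 1.0 / n
    num = 0.0
    den = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        xi = x * (1.0 - x)
        den += xi ** (alpha - 1.0)
        num += xi ** (alpha - 2.0) * (1.0 - 2.0 * x) ** 2
    return num / den


if __name__ == "__main__":
    for alpha in (0.7, 1.1, 1.5, 2.5, 3.0):
        print(alpha, singular_moment(alpha))
```

Under this assumption the moment grows without bound as $\alpha^{*}\to 1^{+}$, so the original method becomes stiff close to the threshold even where it is formally applicable.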

The issue of non-finiteness of the term $\left\langle\xi(\partial_{x}A)^{2}\right\rangle_{u_{\alpha^{*}(t)}}$ was addressed in [6] by introducing a special treatment near the boundary (see Appendix E, equations E.10-E.13 of [6] for details of the derivation of the modified method). Here we propose an alternative way that treats the problem at least in the case $A(x):=\ln(\xi(x))$ . It is based on the idea of multiplying the Fokker-Planck equation by a suitable function $B=B(x)$ , instead of $A=A(x)$ , and integrating on $\Omega_{\bf x}$ . In the second step, one again approximates $u(t)$ by $u_{{\bm{\alpha}}^{*}(t)}$ and neglects the residuum. This leads, in the scalar case, to the ODE

[TABLE]

Choosing $B(x):=\xi(x)$ leads then to finite $\left\langle\xi(\partial_{x}B)^{2}\right\rangle_{u_{\alpha^{*}}}$ for all $\alpha^{*}>0$ , i.e., for all admissible values of $\alpha^{*}$ . Thus, our strategy is to obtain $\alpha^{*}(t)$ by solving (5.5) for $t\geq 0$ and then calculate the moment $\left\langle\ln(\xi)\right\rangle_{u_{\alpha^{*}(t)}}$ , which is expected to be a good approximation of the true moment $\left\langle\ln(\xi)\right\rangle_{u(t)}$ . Clearly, one can use this strategy to obtain an approximation of any other moment of $u(t)$ .

However, the method (5.5) suffers from a serious drawback, namely, it is only solvable if the covariance

[TABLE]

is nonvanishing for all $t\geq 0$ , which is not clear. Nonetheless, for the particular choice $A(x)=\ln(\xi(x))$ and $B(x)=\xi(x)$ this seems to be the case, as is documented by our numerical calculation in Fig. 1. Analytically we are only able to calculate the limits

[TABLE]

which is based on the following Lemma.

Lemma 5.

*For $\sigma>0$ and $x\in(0,1)$ denote*

[TABLE]

with $\xi(x)=x(1-x)$ . Then, in the sense of distributions,

[TABLE]

where $\delta(\cdot-x)$ denotes the Dirac-delta distribution concentrated at $x$ .

**Proof:** Obviously, $\nu_{\sigma}$ is a probability measure on $(0,1)$. Let $\varphi\in C_{c}^{\infty}(0,1)$ be any test function on the interval $(0,1)$. We shall show that

[TABLE]

The mean-value theorem gives

[TABLE]

Thus, our goal is to show that $\int_{0}^{1}|x-1/2|\,\mathrm{d}\nu_{\sigma}(x)$ vanishes as $\sigma\to\infty$ . For the numerator, we have

[TABLE]

and using the identity $1/2-x=\xi^{\prime}(x)/2$ , we calculate

[TABLE]

The denominator is estimated from below using the elementary inequalities

[TABLE]

which give

[TABLE]

Thus,

[TABLE]

which proves the first claim.

To calculate the limit $\sigma\to 0$ , due to the symmetry of $\xi(x)=x(1-x)$ with respect to $x=1/2$ , it is sufficient to prove that

[TABLE]

Again, picking a test function $\varphi\in C_{c}^{\infty}[0,1/2)$ and using the mean-value theorem, we have to show that

[TABLE]

tends to zero as $\sigma\to 0$ . However, this follows directly from the fact that the numerator is uniformly bounded for, say, $0\leq\sigma<1$ , and that, obviously, the denominator tends to $+\infty$ as $\sigma\to 0$ .

The statement (5.6) follows directly from the fact that for $A(x)=\ln(\xi(x))$ we readily have $u_{\alpha^{*}}=\nu_{\alpha^{*}}$ with $\nu_{\alpha^{*}}$ given by (5.7).

Consequently, the "modified" DynMaxEnt method (5.5) can be safely used with $A(x)=\ln(\xi(x))$ and $B(x)=\xi(x)$. It even seems to provide better approximation results than the "original" method (5.3), as is documented by our numerical experiments in Section 6.
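A rough numerical cross-check of the two ingredients used above, the concentration behavior of $\nu_{\sigma}$ and the non-vanishing covariance of $B=\xi$ and $A=\ln\xi$, can be sketched as follows. The sketch assumes that $\nu_{\sigma}$ is the symmetric Beta density $\xi^{\sigma-1}/B(\sigma,\sigma)$; this normalization is our assumption. It uses the exponential-family identity $\mathrm{Cov}(\xi,\ln\xi)=\mathrm{d}\left\langle\xi\right\rangle_{\nu_{\sigma}}/\mathrm{d}\sigma$, approximated by a central difference. The helper names are hypothetical.

```python
import math


def log_beta(a, b):
    """Logarithm of the Euler Beta function B(a, b), valid for a, b > 0."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def mean_xi(sigma):
    """<xi> under the ASSUMED density nu_sigma = xi^(sigma-1) / B(sigma, sigma)."""
    return math.exp(log_beta(sigma + 1.0, sigma + 1.0) - log_beta(sigma, sigma))


def cov_xi_logxi(sigma, rel_step=1e-4):
    """Cov(xi, ln xi) under nu_sigma.

    For a density proportional to exp(sigma * ln xi) / xi, differentiating
    <xi> with respect to sigma yields this covariance; the derivative is
    approximated here by a central difference with step rel_step * sigma.
    """
    h = rel_step * sigma
    return (mean_xi(sigma + h) - mean_xi(sigma - h)) / (2.0 * h)


if __name__ == "__main__":
    for sigma in (0.01, 0.2, 0.5, 1.0, 2.0, 10.0, 100.0):
        print(sigma, mean_xi(sigma), cov_xi_logxi(sigma))
```

For the uniform density ($\sigma=1$) the covariance can be computed by hand and equals $1/18$, which the routine reproduces. On the sampled values the covariance stays positive, while $\left\langle\xi\right\rangle_{\nu_{\sigma}}$ approaches $\xi(1/2)=1/4$ for large $\sigma$ and $\xi(0)=\xi(1)=0$ for small $\sigma$, in line with the limits of Lemma 5.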

5.3.1. Solvability of the moment equation

Finally, we study the solvability with respect to $\alpha>0$ of the moment equation

[TABLE]

with $A(x)=\ln(\xi(x))$ , assuming that the right-hand side is finite. First of all, we note that the mapping $\alpha\mapsto\left\langle A\right\rangle_{u_{\alpha}}$ is strictly increasing for $\alpha>0$ . Indeed,

[TABLE]

where the strict positivity follows as before by the Cauchy-Schwarz inequality. Consequently, if a solution to the moment equation (5.8) exists, it is unique. Next we claim that for any $u\geq 0$ with $\int_{0}^{1}u(x)\,\mathrm{d}x=1$ we have $\left\langle\ln(\xi)\right\rangle_{u}\in[-\infty,\ln(1/4))$. Indeed, since $\xi(x)<1/4$ on $(0,1)\setminus\{1/2\}$,

[TABLE]

Thus, it remains to prove that the range of $\alpha\mapsto\left\langle A\right\rangle_{u_{\alpha}}$ is the interval $(-\infty,\ln(1/4))$ . Since for $A(x)=\ln(\xi(x))$ we readily have $u_{\alpha}=\nu_{\alpha}$ with $\nu_{\alpha}$ given by (5.7), Lemma 5 gives

[TABLE]

Hence, by continuity and strict monotonicity, the range of the mapping $\alpha\mapsto\left\langle\ln(\xi)\right\rangle_{u_{\alpha}}$ is the interval $(-\infty,\ln(1/4))$ and, therefore, the moment equation (5.8) is uniquely solvable for every normalized $u\in L^{1}(0,1)$ with finite moment $\left\langle\ln(\xi)\right\rangle_{u}$.
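Because the mapping $\alpha\mapsto\left\langle\ln(\xi)\right\rangle_{u_{\alpha}}$ is strictly increasing, the moment equation can be solved numerically by bisection. The following is a minimal sketch, assuming that $u_{\alpha}=\xi^{\alpha-1}/B(\alpha,\alpha)$, so that $\left\langle\ln\xi\right\rangle_{u_{\alpha}}=\frac{\mathrm{d}}{\mathrm{d}\alpha}\ln B(\alpha,\alpha)$. The derivative is approximated by a central difference, and the bracket $[10^{-3},10^{3}]$ and the function names are arbitrary illustrative choices.

```python
import math


def log_beta(a, b):
    """Logarithm of the Euler Beta function B(a, b), valid for a, b > 0."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def mean_log_xi(alpha, rel_step=1e-5):
    """<ln xi> under the ASSUMED density xi^(alpha-1) / B(alpha, alpha).

    Equals d/d(alpha) ln B(alpha, alpha); approximated by a central difference.
    """
    h = rel_step * alpha
    upper = log_beta(alpha + h, alpha + h)
    lower = log_beta(alpha - h, alpha - h)
    return (upper - lower) / (2.0 * h)


def solve_moment_equation(target, lo=1e-3, hi=1e3, tol=1e-10):
    """Find alpha in [lo, hi] with <ln xi>_{u_alpha} = target by bisection.

    Relies on the strict monotonicity of alpha -> <ln xi>_{u_alpha}.
    """
    if not (mean_log_xi(lo) < target < mean_log_xi(hi)):
        raise ValueError("target moment is outside the bracketed range")
    while hi - lo > tol * max(1.0, lo):
        mid = 0.5 * (lo + hi)
        if mean_log_xi(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    m = mean_log_xi(1.7)             # moment of a known member of the family
    print(m, solve_moment_equation(m))
```

Targets at or above $\ln(1/4)$ are rejected by the bracket check, which mirrors the range restriction derived above.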

6. Numerical experiments

6.1. Scalar case

We present results of several numerical experiments that aim to demonstrate the performance of the original (5.3) and modified (5.5) DynMaxEnt methods for the scalar (single locus) case $A(x)=\ln(\xi(x))$ , as discussed in Section 5.3. Let us recall that this case corresponds to a population of individuals in a neutral environment ( $\beta=h=0$ in (1.2)) with the nonzero mutation rate $\alpha=2N\mu$ . For the modified method (5.5) we again choose $B(x)=\xi(x)$ .

In all simulations we set $N=1$ and start from the initial condition $\alpha^{*}(t=0)=\alpha^{0}:=2$ for the ODEs (5.3), (5.5), and the initial datum $u(t=0)=u_{\alpha^{0}}$ for the Fokker-Planck equation (2.3). The ODEs (5.3), (5.5) are solved with a simple forward Euler discretization on the time interval $[0,T]$ for different values of $T>0$. The Fokker-Planck equation is discretized in space using the Chang-Cooper scheme [9] and in time using the forward Euler method.
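To make the ODE part of this setup concrete, the following is a sketch of a forward Euler integration of the modified method with $B=\xi$ and $A=\ln\xi$. It is not the code used for the experiments and rests on two assumptions: the scalar ODE is taken in the form $\mathrm{Cov}(B,A)\,\dot{\alpha}^{*}=c\,(\alpha-\alpha^{*})\left\langle\xi\,\partial_{x}B\,\partial_{x}A\right\rangle_{u_{\alpha^{*}}}$ with a hypothetical time-scale constant $c>0$ (set to $1$ here), and the stationary density is taken to be $u_{\alpha}=\xi^{\alpha-1}/B(\alpha,\alpha)$. With these choices $\xi\,\partial_{x}B\,\partial_{x}A=(1-2x)^{2}=1-4\xi$, so only $\left\langle\xi\right\rangle$ and its derivative with respect to $\alpha^{*}$ are needed.

```python
import math


def log_beta(a, b):
    """Logarithm of the Euler Beta function B(a, b), valid for a, b > 0."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def mean_xi(a):
    """<xi> under the ASSUMED density xi^(a-1) / B(a, a)."""
    return math.exp(log_beta(a + 1.0, a + 1.0) - log_beta(a, a))


def cov_xi_logxi(a, rel_step=1e-4):
    """Cov(xi, ln xi) = d<xi>/da for this family, via a central difference."""
    h = rel_step * a
    return (mean_xi(a + h) - mean_xi(a - h)) / (2.0 * h)


def rhs_modified(a_star, alpha, c=1.0):
    """ASSUMED right-hand side of the modified scalar ODE (B = xi, A = ln xi).

    c is a hypothetical time-scale constant; it does not affect the limit.
    """
    grad_moment = 1.0 - 4.0 * mean_xi(a_star)  # <xi * B' * A'> = <(1 - 2x)^2>
    return c * (alpha - a_star) * grad_moment / cov_xi_logxi(a_star)


def euler_modified(alpha, a0=2.0, T=5.0, dt=1e-3, c=1.0):
    """Forward Euler path of alpha*(t) on [0, T], starting from alpha*(0) = a0."""
    steps = int(round(T / dt))
    path = [a0]
    for _ in range(steps):
        path.append(path[-1] + dt * rhs_modified(path[-1], alpha, c))
    return path


if __name__ == "__main__":
    for alpha in (0.2, 0.5, 1.5, 3.0):
        path = euler_modified(alpha)
        print(alpha, path[-1], mean_xi(path[-1]))
```

The step size `dt` has to be small relative to the relaxation rate, otherwise the explicit scheme can overshoot the target value $\alpha$. Any moment of interest is then evaluated from $u_{\alpha^{*}(t)}$ along the computed path.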

In Fig. 2 we plot the time evolution of the $\left\langle\ln(\xi)\right\rangle$ -moment of the Fokker-Planck solution $u(t)$ and its approximation obtained by the DynMaxEnt methods (5.3), (5.5) for the parameter values $\alpha\in\{1.1,1.5,2.5,3\}$ . Note that since $\alpha>1$ , both the methods (5.3), (5.5) are applicable. However, we observe that the modified method (5.5) gives better approximation results. To quantify the approximation error, we calculate the indicator

[TABLE]

where $\left\langle\ln(\xi)\right\rangle_{u_{\alpha^{*}(t)}}$ is the moment calculated by one of the DynMaxEnt methods (5.3), (5.5). The results for the values $\alpha\in\{1.1,1.5,2.5,3\}$ given in Table 1 indeed suggest that the modified method (5.5) provides better approximation of the moment $\left\langle\ln(\xi)\right\rangle_{u(t)}$ . Moreover, we observe that with increasing value of $\alpha$ the approximation properties of both methods seem to improve.

In Fig. 3 we plot the time evolution of the $\left\langle\ln(\xi)\right\rangle$-moment of the Fokker-Planck solution $u(t)$ and its approximation obtained by the modified DynMaxEnt method (5.5) for the parameter values $\alpha\in\{0.7,0.5,0.3,0.2\}$. Note that the original method (5.3) is no longer applicable since for $\alpha^{*}<1$ the term $\left\langle\xi(\partial_{x}\ln(\xi))^{2}\right\rangle_{u_{\alpha^{*}}}$ is not finite. Again, we calculate the approximation error (6.1) for the above-mentioned values of $\alpha$ in Table 2. We observe that the approximation worsens for smaller values of $\alpha$. This is presumably a consequence of the singularity of $u_{\alpha}$ at $x\in\{0,1\}$ becoming stronger when $\alpha$ approaches zero. In fact, the numerical solution of the Fokker-Planck equation (2.3) also becomes more difficult for small values of $\alpha$. For $\alpha<0.2$ our discrete scheme ceases to provide reliable results. That is why $\alpha=0.2$ is the smallest value that we take into account.

6.2. Vector case

Finally, we consider the more general case with the function ${\bf A}={\bf A}({\bf x})$ being vector-valued, ${\bf A}:\Omega_{\bf x}\to\mathbb{R}^{k}$ with some $k\in\mathbb{N}$ . It has been observed in [10, 6] that using more moments (i.e., higher $k$ ) in general improves the approximation properties of the DynMaxEnt method. Inspired by the success of the modified DynMaxEnt method (5.5) demonstrated in Section 6.1, we consider an analogous approach also in the vector case. For this, we employ the idea of deriving a modified DynMaxEnt method as in Section 5.3: We multiply the Fokker-Planck equation (2.3) by a vector-valued function ${\bf B}:\Omega_{\bf x}\to\mathbb{R}^{k}$ to be chosen later and integrate by parts, assuming the boundary terms to vanish. Then, we approximate $u(t)$ by $u_{{\bm{\alpha}}^{*}(t)}$ with the time-dependent vector ${\bm{\alpha}}^{*}={\bm{\alpha}}^{*}(t)$ and neglect the residual term. This gives

[TABLE]

where ${\bf B}\otimes{\bf A}$ is the $d\times d$ matrix with the $(i,k)$ -component $B_{i}A_{k}$ and $\nabla_{\bf x}{\bf B}:\nabla_{\bf x}{\bf A}$ is the $d\times d$ matrix with the $(i,k)$ -component $\sum_{j=1}^{d}\partial_{x_{j}}B_{i}\partial_{x_{j}}A_{k}$ . Clearly, uniform invertibility of the matrix

[TABLE]

is necessary for global solvability of the ODE system (6.2). This condition is satisfied in our numerical experiments below.

The goal of this Section is to illustrate the performance of the original (5.2) and modified (6.2) DynMaxEnt methods for the generic choice ${\bf A}=(\xi^{\prime},\xi,\ln\xi)$ . For simplicity, we shall still stick to the 1D (single locus) setting $x\in(0,1)$ . Choosing again $\beta=h=1$ in (1.2), we have

[TABLE]

with ${\bm{\alpha}}=(-\gamma,2\eta,2\mu)$. The parameters $\gamma,\eta\in\mathbb{R}$ represent the effects of loci on the traits and $\mu>0$ is the mutation rate. For the modified DynMaxEnt method (6.2) we choose ${\bf B}=(\xi^{\prime},\xi,\xi^{2})$. Note that this choice prevents the issue of non-finiteness of the moment $\left\langle\xi\nabla_{\bf x}{\bf A}\otimes\nabla_{\bf x}{\bf A}\right\rangle_{u_{{\bm{\alpha}}^{*}(t)}}$ for $4N\mu<1$.

We carry out two numerical experiments. In both simulations we set $N=1$ and the initial condition for the Fokker-Planck equation (2.3) to be the stationary distribution (2.1) with the parameters $4\mu_{0}=2$ , $\eta_{0}=-1$ , $\gamma_{0}=2$ . As before, the Fokker-Planck equation is discretized in space using the Chang-Cooper scheme [9] and forward Euler method in time. The ODE systems (5.2), (6.2) are discretized in time using the forward Euler scheme.

For the first experiment we use the parameter values $4\mu=1.1$ , $\eta=1$ , $\gamma=0$ . This corresponds to the abrupt change of parameters (evolutionary forces)

[TABLE]

In Fig. 4 we plot the time evolution of the moments $\left\langle\ln(\xi)\right\rangle$, $\left\langle\xi\right\rangle$ and $\left\langle\xi^{\prime}\right\rangle$ of the Fokker-Planck solution $u(t)$ and their approximations obtained by the original method (5.2) and the modified method (6.2). Note that in this case both methods (5.2), (6.2) are applicable since the moment $\left\langle\xi\nabla_{\bf x}{\bf A}\otimes\nabla_{\bf x}{\bf A}\right\rangle_{u_{{\bm{\alpha}}^{*}(t)}}$ is finite for all $t\geq 0$. Calculating the approximation error (6.1) for the three moments (Table 3), we observe that the modified method (6.2) provides slightly more accurate results.

For our second experiment we use the parameter values $4\mu=0.5$ , $\eta=1$ , $\gamma=0$ . This corresponds to the rapid change of evolutionary forces

[TABLE]

Note that in this case the original method (5.2) is no longer applicable since the moment $\left\langle\xi\nabla_{\bf x}{\bf A}\otimes\nabla_{\bf x}{\bf A}\right\rangle_{u_{{\bm{\alpha}}^{*}(t)}}$ is not defined for $4\mu<1$. In Fig. 5 we plot the time evolution of the moments $\left\langle\ln(\xi)\right\rangle$, $\left\langle\xi\right\rangle$ and $\left\langle\xi^{\prime}\right\rangle$ of the Fokker-Planck solution $u(t)$ and their approximations obtained by the modified DynMaxEnt method (6.2). In contrast to the original method, the modified DynMaxEnt method (6.2) remains applicable, and the results presented in Fig. 5 and Table 4 indicate that it provides a reasonably good approximation of the three moments.

Acknowledgments. We thank Nicholas Barton (IST Austria) for his useful comments and suggestions. JH and PM are funded by KAUST baseline funds and grant no. 1000000193.

Bibliography


[1] D. Albanez, H. Nussenzveig Lopes, and E. Titi: Continuous data assimilation for the three-dimensional Navier-Stokes-$\alpha$ model. Asymptotic Analysis 97 (2016), 139–164.
[2] M. Altaf, E. Titi, O. Knio, L. Zhao, M. Mc Cabe, and I. Hoteit: Downscaling the 2D Bénard Convection Equations Using Continuous Data Assimilation. Computational Geosciences (to appear, 2017).
[3] A. Arnold, P. A. Markowich, G. Toscani, and A. Unterreiter: On convex Sobolev inequalities and the rate of convergence to equilibrium for Fokker-Planck type equations. Comm. PDE 26 (2001), 43–100.
[4] N. Barton and H. de Vladar: Statistical mechanics and the evolution of polygenic quantitative traits. Genetics 181 (2009), 997–1011.
[5] W. Bialek, A. Cavagna, I. Giardina, T. Mora, E. Silvestri et al.: Statistical mechanics for natural flocks of birds. Proc. Natl. Acad. Sci. USA 109 (2012), 4786–4791.
[6] K. Bod’ová, G. Tkacik, and N. Barton: A General Approximation for the Dynamics of Quantitative Traits. Genetics 202 (2016), 1523–1548.
[7] F. Chalub and M. Souza: From discrete to continuous evolution models: A unifying approach to drift-diffusion and replicator dynamics. Theoretical Population Biology 76 (2009), 268–277.
[8] F. Chalub and M. Souza: A non-standard evolution problem arising in population genetics. Comm. Math. Sci. 7 (2009), 489–502.
