The Effect of Recombination on the Speed of Evolution

Nantawat Udomchatpitak

arXiv:1904.09922·math.PR·April 19, 2021

TL;DR

This paper models how recombination affects the speed of beneficial allele spread in a population, confirming that high recombination rates accelerate evolution, supporting Fisher and Muller's hypothesis.

Contribution

It provides an asymptotic analysis of how different recombination probabilities influence the time for beneficial alleles to fix in a population.

Findings

1. High recombination speeds up beneficial allele fixation.

2. Low recombination does not significantly affect fixation time.

3. Results support the advantage of sexual reproduction in evolution.

Equations

$$\lim_{N\to\infty}\frac{a_{N}}{b_{N}}=0.$$

$$s_{N}\ll 1,$$

$$1\ll N\mu_{N},$$

$$N\mu_{N}^{2}\ll s_{N},$$

$$r_{N}\ln_{+}(Nr_{N})\ll s_{N},$$

$$\mu_{N}\ll s_{N}.$$

$$t^{*}_{N}(r)=\frac{1}{s_{N}}\ln\bigg(\frac{Ns_{N}^{3}}{\mu_{N}\cdot\max\{N\mu_{N}^{2},r\ln_{+}(Nr)\}}\bigg).$$

$$\lim_{N\rightarrow\infty}P\big((1-\theta)t^{*}_{N}(r_{N})\leq T\leq(1+\theta)t^{*}_{N}(r_{N})\big)=1.$$

$$t^{*}_{N}(0)=\frac{1}{s_{N}}\ln\bigg(\frac{s_{N}^{3}}{\mu_{N}^{3}}\bigg).$$

$$t^{*}_{N}(0)=\frac{1}{s_{N}}\ln\bigg(\frac{s_{N}^{3}}{\mu_{N}^{3}}\bigg)\geq t^{*}_{N}(r_{N})>\frac{1}{s_{N}}\ln\bigg(\frac{Ns_{N}^{2}}{\mu_{N}}\bigg)=\frac{2}{3}\cdot\frac{1}{s_{N}}\ln\bigg(\frac{s_{N}^{3}}{\mu_{N}^{3}}\bigg)+\frac{1}{s_{N}}\ln(N\mu_{N})>\frac{2}{3}t^{*}_{N}(0).$$

$$N\mu\cdot s\cdot\frac{1}{s}\ln(N)=N\mu\ln(N).$$

$$N\mu_{N}^{2}\ll r_{N}\ln(Nr_{N})\ll s_{N}.$$

$$r_{N}\ln_{+}(Nr_{N})\leq CN\mu_{N}^{2}.$$

$$X_{i}(t)\approx\int_{0}^{t}N\mu\cdot e^{s(t-u)}\,du\approx\frac{N\mu}{s}e^{st}.$$

$$(1-\delta^{2})e^{-C_{1}}N\leq X_{i}(t_{1})\leq(1+\delta^{2})e^{-C_{1}}N$$

$$\frac{K^{-}_{1r}Nr\ln(Nr)}{s}\leq X_{3}(t_{1})\leq\frac{K^{+}_{1r}Nr\ln(Nr)}{s}.$$

$$\frac{K^{-}_{1m}N^{2}\mu^{2}}{s}\leq X_{3}(t_{1})\leq\frac{K^{+}_{1m}N^{2}\mu^{2}}{s}.$$

$$\tilde{X}_{i}(t)\approx\frac{1}{2}\bigg(\frac{1}{1+Be^{-s(t-t_{1})}}\bigg),$$

$$\Big(\frac{1}{2}-\frac{3\delta^{2}}{2}\Big)N\leq X_{i}(t_{2})\leq\Big(\frac{1}{2}-\frac{\delta^{4}}{4}\Big)N.$$

$$\frac{K^{-}_{2r}Nr\ln(Nr)}{s}\leq X_{3}(t_{2})\leq\frac{K^{+}_{2r}Nr\ln(Nr)}{s}.$$

$$\frac{K^{-}_{2m}N^{2}\mu^{2}}{s}\leq X_{3}(t_{2})\leq\frac{K^{+}_{2m}N^{2}\mu^{2}}{s}.$$

$$X_{0}(t_{3})<\begin{cases}\displaystyle{\delta e^{-(1-3\delta)(C_{3}-C_{2})}N\cdot\bigg(\frac{r\ln(Nr)}{s}\bigg)^{1-3\delta}}&\text{in the recombination dominating case}\\ \displaystyle{\delta e^{-(1-3\delta)(C_{3}-C_{2})}N\cdot\bigg(\frac{N\mu^{2}}{s}\bigg)^{1-3\delta}}&\text{in the mutation dominating case}.\end{cases}$$

$$K_{3}N\leq X_{3}(t_{3})\leq\delta^{2}N.$$

$$\bigg(1-\frac{5\delta^{2}}{4}\bigg)N\leq X_{3}(t_{4})\leq\bigg(1-\frac{3K_{3}}{4}\bigg)N,$$

$$X_{1}(t_{4})+X_{2}(t_{4})\geq\frac{K_{3}N}{2}.$$

$$1\ll Nr.$$

$$r\ll N\mu^{2}.$$

$$r\ll s,$$

$$\frac{r}{s}\ln(Ns)\ll 1,$$

$$\frac{r}{s}\ln\bigg(\frac{s}{\mu}\bigg)\ll 1.$$
Full text

Nantawat Udomchatpitak (the author is supported in part by NSF grant DMS-1707953)

Abstract

It has been a puzzling question why some organisms reproduce sexually. Fisher and Muller hypothesized that reproducing by sex can speed up evolution. They explained that in sexual reproduction, recombination can combine beneficial alleles that lie on different chromosomes, which shortens the time it takes for those beneficial alleles to spread to the entire population. We consider a population model of fixed size $N$, in which we focus on two loci on a chromosome. Each allele at each locus can mutate into a beneficial allele at rate $\mu_{N}$. The individuals with 0, 1, and 2 beneficial alleles die at rates $1$, $1-s_{N}$ and $1-2s_{N}$, respectively. When an individual dies, with probability $1-r_{N}$, the new individual inherits both alleles from one parent, chosen at random from the population, while with probability $r_{N}$, recombination occurs, and the new individual receives its two alleles from different parents. Under certain assumptions on the parameters $N$, $\mu_{N}$, $s_{N}$ and $r_{N}$, we obtain an asymptotic approximation for the time that both beneficial alleles spread to the entire population. When the recombination probability is small, we show that recombination does not shorten the time that the two beneficial alleles take to spread to the entire population, while when the recombination probability is large, we show that recombination decreases this time, which agrees with the Fisher-Muller hypothesis and confirms the advantage of reproducing by sex.

Keywords. beneficial mutations, evolution, fixation time, recombination, selection

AMS 2010 subject classifications. Primary 92D15; Secondary 60J27, 60J75, 60J85

1 Introduction

It has been a puzzle in evolutionary biology why many organisms reproduce sexually. Sexually reproducing parents transmit just half of their genes to the offspring, which means that the beneficial alleles a parent carries might not all be transmitted to the offspring. This does not happen to parents who reproduce asexually, since they transmit all their genes to the offspring. An advantage of sexual reproduction might come from recombination, which can combine portions of different chromosomes. Fisher [8] and Muller [10] hypothesized that sexual reproduction can speed up evolution. They explained that in an asexual population, for two beneficial mutations to survive, the second beneficial mutation has to occur in an individual that already has the first beneficial mutation, while in a sexually reproducing population, the two beneficial mutations might occur in different individuals and recombination can later combine them, which leads to an evolutionary advantage over asexual reproduction.

1.1 The model

We consider a population of fixed size $N$ consisting of $N$ chromosomes, which come from $N/2$ organisms of the same species. We are interested in two loci on the chromosome. One of the two loci contains either an $a$ or an $A$ allele, and the other locus contains either a $b$ or a $B$ allele. Both the $A$ and $B$ alleles are beneficial. At time 0, all individuals have $a$ and $b$ alleles. Independently, each $a$ allele mutates to $A$ at exponential rate $\mu_{N}$, and each $b$ allele mutates to $B$ at exponential rate $\mu_{N}$. Individuals with 0, 1 and 2 beneficial alleles die independently at exponential rates $1$, $1-s_{N}$ and $1-2s_{N}$, respectively. A new individual is created immediately to replace the individual who dies, in order to keep the population size fixed. With probability $1-r_{N}$, no recombination occurs, in which case the new individual receives both alleles from a randomly chosen individual in the population at that time. With probability $r_{N}$, recombination occurs, in which case the new individual receives the $a/A$ allele from a randomly chosen individual, and receives the $b/B$ allele from another independently randomly chosen individual. We will give an approximation for the first time that all individuals in the population have both beneficial alleles, when the population size is large. The result shows that this time is shorter when $r_{N}$ is large, consistent with the Fisher-Muller hypothesis.
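The dynamics described above can be sketched as a small Gillespie-style simulation. This is only an illustrative sketch and not code from the paper: the function name `simulate_fixation_time`, the integer encoding of the four types, and the choice to draw parents from the population including the dying individual are assumptions made here, and the toy parameters used with it are far from the asymptotic regime studied in the paper.

```python
import random


def simulate_fixation_time(N, mu, s, r, seed=0, max_events=5_000_000):
    """Simulate the two-locus model and return the first time at which all
    N individuals carry both beneficial alleles (None if max_events is hit)."""
    rng = random.Random(seed)
    # Hypothetical encoding: index = (1 if A present) + (2 if B present),
    # so 0 = ab, 1 = Ab, 2 = aB, 3 = AB.
    x = [N, 0, 0, 0]
    death = (1.0, 1.0 - s, 1.0 - s, 1.0 - 2.0 * s)
    mutations = ((0, 1), (0, 2), (1, 3), (2, 3))  # (source type, target type)
    t = 0.0
    for _ in range(max_events):
        if x[3] == N:
            return t
        # Each a (resp. b) allele mutates at rate mu; type i dies at rate death[i].
        rates = [mu * x[src] for src, _ in mutations]
        rates += [death[i] * x[i] for i in range(4)]
        t += rng.expovariate(sum(rates))
        event = rng.choices(range(8), weights=rates)[0]
        if event < 4:
            src, dst = mutations[event]
            x[src] -= 1
            x[dst] += 1
            continue
        dead = event - 4
        # Assumption: parents are sampled from the population at the moment of
        # death, including the individual that dies.
        if rng.random() < r:
            parent_a = rng.choices(range(4), weights=x)[0]  # donor of the a/A allele
            parent_b = rng.choices(range(4), weights=x)[0]  # donor of the b/B allele
            child = (parent_a & 1) | (parent_b & 2)
        else:
            child = rng.choices(range(4), weights=x)[0]
        x[dead] -= 1
        x[child] += 1
    return t if x[3] == N else None
```

For instance, `simulate_fixation_time(30, 0.05, 0.2, 0.1, seed=1)` runs one trajectory of a small toy population; averaging over many seeds would be needed before comparing against any asymptotic formula.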

1.2 Previous works

Takahata [14] considered a model of a population of finite size, where each individual consists of one chromosome. This model focuses on two loci on the chromosome. One locus contains either an $a$ or an $A$ allele, and the other locus contains either a $b$ or a $B$ allele. The fitnesses of individuals of types $ab$, $Ab$, $aB$ and $AB$ are assumed to be $1$, $1+s$, $1+s$ and $1+t$, respectively. The model also assumes recurrent mutations from $a$ to $A$ and from $b$ to $B$, which means that mutations will never be exhausted. In the beginning, the frequency of type $ab$ is assumed to be 1. Via simulation, the numerical fixation time of both $A$ and $B$ is given for some values of $s$ and $t$ in the following parameter regimes: 1) $t=s=0$, 2) $t=2s>0$, 3) $t=2s<0$, 4) $t>2s>0$, and 5) $t>0>s$.

Some non-rigorous works discuss the benefits of recombination. Crow and Kimura [4] argued that in large populations, sexual reproduction can incorporate more mutations due to recombination than asexual reproduction can. Several works sought the relation between the speed of adaptation and the recombination rate. Neher, Kessinger, and Shraiman [11] considered a linear chromosome model assuming a large mutation rate and a weak selective effect. They obtained that the rate of adaptation is proportional to the square root of the recombination rate. Weissman and Barton [15] considered the regime where the mutation rate is small, and they obtained that the rate of adaptation is proportional to the recombination rate. Weissman and Hallatschek [16] considered the intermediate mutation rate regime and obtained that the rate of adaptation is proportional to the recombination rate. Lastly, Neher, Shraiman, and Fisher [12] considered a population model in which a large number of loci was considered. The recombination mechanism in this model is different from that of the other works mentioned before. Under the assumptions that the selective advantage is weak and the recombination rate is much larger than the selective advantage, they obtained that in large populations, the rate of adaptation increases as the square of the recombination rate.

We will now discuss some rigorous results. Cuthbertson, Etheridge, and Yu [5] considered a two-locus model with finite population size $N$. Each individual can be one of the four possible types: $ab$, $Ab$, $aB$ and $AB$. Both $A$ and $B$ are considered to be beneficial, and they increase the fitness by $s_{1}$ and $s_{2}$ respectively, with the assumption that $s_{1}<s_{2}$. The mutation from $b$ to $B$ occurs at a random time during the time interval in which $Ab$ is spreading in the population, and it appears as a type $aB$. For both $A$ and $B$ to spread to the entire population, there are three requirements. First, the number of type $aB$ should become significant. Second, recombination between $A$ and $B$ must occur. Lastly, the number of type $AB$ should become significant, after which $AB$ is almost certain to fixate. The result shows that the fixation probability of $AB$ can be approximated by the solution to a specific system of ODEs.

Bossert and Pfaffelhuber [3] considered a diffusion model with 4 types: $ab,Ab,aB$ and $AB$ , where the fitnesses of $ab,Ab,aB$ and $AB$ are in increasing order. The frequencies of these four types evolve according to a system of SDEs. In the beginning, the frequencies of types $Ab$ and $aB$ are assumed to be small, and there is no type $AB$ yet. They obtain approximate formulas for the fixation probability and fixation time of type $AB$ .

Both Cuthbertson, Etheridge, and Yu [5] and Bossert and Pfaffelhuber [3] assume that at least one beneficial mutation is present at the beginning, and they do not allow an unlimited supply of new mutations. In the model studied in this paper, we assume that all individuals in the beginning do not have any beneficial mutations, and both beneficial mutations occur according to a Poisson process. This model is similar to the model given by Takahata in the case $t=2s>0$ , but with finite population size.

Lastly, we mention another work by Berestycki and Zhao [2]. In their model, which involves branching Brownian motion in two dimensions, they showed that the fitnesses on two loci are negatively correlated. They explained that recombination can reduce this negative correlation and lead to a fitter population.

1.3 Conditions of the parameters

There are four parameters in our model: $N$, $\mu_{N}$, $r_{N}$ and $s_{N}$. We assume that $\mu_{N}\in(0,1)$, $s_{N}\in(0,1/2]$ and $r_{N}\in[0,1)$. For any two sequences $a_{N}$ and $b_{N}$, we say that $a_{N}\ll b_{N}$ if

$$\lim_{N\to\infty}\frac{a_{N}}{b_{N}}=0.$$

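We will assume that $\mu_{N}$, $s_{N}$ and $r_{N}$ satisfy the following conditions:

$$s_{N}\ll 1, \tag{1}$$

$$1\ll N\mu_{N}, \tag{2}$$

$$N\mu_{N}^{2}\ll s_{N}, \tag{3}$$

and

$$r_{N}\ln_{+}(Nr_{N})\ll s_{N}, \tag{4}$$

where $\ln_{+}(x)$ is defined to be $\ln(x)$ if $x\in(1,\infty)$, and 0 if $x\in[0,1]$. Note that (2) and (3) imply that

$$\mu_{N}\ll s_{N}.$$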

1.4 Main theorem

Theorem 1.

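Let $T$ be the first time that all individuals in the population are type $AB$, which we also call the fixation time of $AB$. For every positive integer $N$ and $r\in[0,1]$, we define

$$t^{*}_{N}(r)=\frac{1}{s_{N}}\ln\bigg(\frac{Ns_{N}^{3}}{\mu_{N}\cdot\max\{N\mu_{N}^{2},r\ln_{+}(Nr)\}}\bigg). \tag{5}$$

Then, for every $\theta\in(0,1)$, we have that

$$\lim_{N\rightarrow\infty}P\big((1-\theta)t^{*}_{N}(r_{N})\leq T\leq(1+\theta)t^{*}_{N}(r_{N})\big)=1.$$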

This theorem suggests that the time that both beneficial alleles spread to the entire population is approximately $t^{*}_{N}(r_{N})$ when $N$ is large. From (5), when there is no recombination,

$$t^{*}_{N}(0)=\frac{1}{s_{N}}\ln\bigg(\frac{s_{N}^{3}}{\mu_{N}^{3}}\bigg).$$

When $r_{N}\ln_{+}(Nr_{N})>N\mu_{N}^{2}$, we observe that $t^{*}_{N}(r_{N})<t^{*}_{N}(0)$. This means that when $r_{N}$ is large enough, recombination decreases the fixation time of $AB$ compared with the case of no recombination. From (3) and (4), for sufficiently large $N$, we have that $\max\{N\mu_{N}^{2},r_{N}\ln_{+}(Nr_{N})\}<s_{N}$, and

$$t^{*}_{N}(0)=\frac{1}{s_{N}}\ln\bigg(\frac{s_{N}^{3}}{\mu_{N}^{3}}\bigg)\geq t^{*}_{N}(r_{N})>\frac{1}{s_{N}}\ln\bigg(\frac{Ns_{N}^{2}}{\mu_{N}}\bigg)=\frac{2}{3}\cdot\frac{1}{s_{N}}\ln\bigg(\frac{s_{N}^{3}}{\mu_{N}^{3}}\bigg)+\frac{1}{s_{N}}\ln(N\mu_{N})>\frac{2}{3}t^{*}_{N}(0).$$

This implies that under our assumptions, which require small recombination rates, recombination in large populations can shorten the fixation time of $AB$ by less than one third of its value without recombination.

Lastly, we will show that these assumptions on the parameters are attainable. We consider when $\mu_{N}=N^{-a},r_{N}=N^{-b}$ and $s_{N}=N^{-c}$ for some positive numbers $a,b$ and $c$ . One can check that (1), (2), (3) and (4) are equivalent to $0<c<b$ and $(1+c)/2<a<1$ .
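As a rough numerical illustration of the definition of $t^{*}_{N}(r)$ and of the power-law example, the sketch below evaluates the formula for one concrete choice of exponents. The helper names `ln_plus`, `t_star` and `power_law_conditions_hold`, and the values $N=10^{12}$, $a=0.8$, $b=0.5$, $c=0.3$, are illustrative assumptions made here and do not come from the paper; a finite-$N$ evaluation only hints at the asymptotic statement.

```python
import math


def ln_plus(x):
    """ln_+(x): ln(x) for x > 1 and 0 for x in [0, 1]."""
    return math.log(x) if x > 1.0 else 0.0


def t_star(N, mu, s, r):
    """Evaluate t*_N(r) = (1/s) ln( N s^3 / (mu * max{N mu^2, r ln_+(N r)}) )."""
    denom = mu * max(N * mu**2, r * ln_plus(N * r))
    return math.log(N * s**3 / denom) / s


def power_law_conditions_hold(a, b, c):
    """Exponent conditions stated for mu = N^-a, r = N^-b, s = N^-c."""
    return 0 < c < b and (1 + c) / 2 < a < 1


# Hypothetical example values (not taken from the paper).
N = 1e12
a, b, c = 0.8, 0.5, 0.3
mu, r, s = N**-a, N**-b, N**-c

t_with_recombination = t_star(N, mu, s, r)
t_without_recombination = t_star(N, mu, s, 0.0)
```

With these exponents $r\ln_{+}(Nr)$ exceeds $N\mu^{2}$, so the value with recombination falls strictly between two thirds of the value without recombination and that value itself, in line with the chain of inequalities following Theorem 1.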

2 Overview of the proof

From now on, we will refer to an individual with $ab$ , $Ab$ , $aB$ , and $AB$ as type 0, 1, 2, and 3 respectively, and we will omit writing the subscript $N$ in $\mu_{N},s_{N}$ and $r_{N}$ . For $i=0,1,2,3$ and $t\geq 0$ , we define $X_{i}(t)$ as the number of type $i$ individuals at time $t$ and define $\tilde{X}_{i}(t)=X_{i}(t)/N$ , which is the fraction of type $i$ individuals in the population at time $t$ .

Before we consider the behavior of the process $((X_{0}(t),X_{1}(t),X_{2}(t),X_{3}(t)),t\geq 0)$, we will first look at the condition $1\ll N\mu$. Intuitively, we do not want mutations to occur so slowly that one beneficial mutation spreads to the entire population before any other mutation takes hold. The process by which a beneficial allele spreads to the entire population is also known as a selective sweep. Suppose that a mutation from $a$ to $A$ is the first to occur, and assume that it doesn’t go extinct. It will take time about $\frac{2}{s}\ln(N)$ to complete its selective sweep (see section 6.1 of [7]). During this time, mutations from $b$ to $B$ occur at total rate about $N\mu$. The number of descendants of one of these new mutations can be approximated by an asymmetric random walk, so the chance that each of these mutations survives is about $s$. Hence, up to a constant factor, the number of mutations to $B$ that survive during the selective sweep of $A$ is approximately

$$N\mu\cdot s\cdot\frac{1}{s}\ln(N)=N\mu\ln(N).$$

So, if $N\mu\ln(N)\ll 1$ , then there is no $B$ that survives during the sweep of $A$ . Hence, we will see $A$ spread to the entire population first, before $B$ appears and spreads. In this case, recombination does not speed up the time needed for the type $AB$ to take hold in the population. So, we should consider when $N\mu\ln(N)\gg 1$ . Here, we make a slightly stronger assumption that $N\mu\gg 1$ .

Now, we will consider our process $((X_{0}(t),X_{1}(t),X_{2}(t),X_{3}(t)),t\geq 0)$. The behavior of our process is essentially reduced to two cases. For the first case, which we will call the recombination dominating case, we assume that

$$N\mu_{N}^{2}\ll r_{N}\ln(Nr_{N})\ll s_{N}. \tag{6}$$

For the second case, which we will call the mutation dominating case, we assume that there is a positive constant $C$ such that for sufficiently large $N$,

$$r_{N}\ln_{+}(Nr_{N})\leq CN\mu_{N}^{2}. \tag{7}$$

The reason for these names is that in the recombination dominating case, type 3 individuals start to appear from recombination between $A$ alleles from type 1 individuals and $B$ alleles from type 2 individuals, while in the mutation dominating case, the type 3 individuals start to appear from mutations from type 1 and type 2 individuals.

In the following table, we define times when we see significant changes in the behavior of the process.

| Time | Recombination dominating | Mutation dominating |
| --- | --- | --- |
| $t_{0}$ | $\frac{1}{s}\ln\big(\frac{s}{\mu\sqrt{Nr}}\big)-\frac{C_{0,r}}{s}$ | $\frac{1}{s}\ln\big(\frac{s}{N\mu^{2}}\big)-\frac{C_{0,m}}{s}$ |
| $t_{1}$ | $\frac{1}{s}\ln\big(\frac{s}{\mu}\big)-\frac{C_{1}}{s}$ | $\frac{1}{s}\ln\big(\frac{s}{\mu}\big)-\frac{C_{1}}{s}$ |
| $t_{2}$ | $\frac{1}{s}\ln\big(\frac{s}{\mu}\big)+\frac{C_{2}}{s}$ | $\frac{1}{s}\ln\big(\frac{s}{\mu}\big)+\frac{C_{2}}{s}$ |
| $t_{3}$ | $\frac{1}{s}\ln\big(\frac{s^{2}}{\mu r\ln(Nr)}\big)+\frac{C_{3}}{s}$ | $\frac{1}{s}\ln\big(\frac{s^{2}}{N\mu^{3}}\big)+\frac{C_{3}}{s}$ |
| $t_{4}$ | $\frac{1}{s}\ln\big(\frac{s^{2}}{\mu r\ln(Nr)}\big)+\frac{C_{4}}{s}$ | $\frac{1}{s}\ln\big(\frac{s^{2}}{N\mu^{3}}\big)+\frac{C_{4}}{s}$ |

The constants $C_{0,r},C_{0,m},C_{1},C_{2},C_{3}$ , and $C_{4}$ are defined in (64), (62), (61), (145), (167), and (200). The reader does not need to know what these constants are exactly at this point, but should notice that $C_{i}/s$ is the lower order term in the definition of the $t_{i}$ . From now on, all statements are assumed to be true in both the recombination dominating case and the mutation dominating case, unless specified otherwise.
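To get a feel for the ordering of these times, the following sketch evaluates only their leading-order parts, dropping every $C_{i}/s$ correction. The function name `leading_order_times` is hypothetical, and deciding the case by comparing $r\ln_{+}(Nr)$ with $N\mu^{2}$ at a single finite $N$ is a simplifying assumption made here; the paper defines the two cases asymptotically.

```python
import math


def leading_order_times(N, mu, s, r):
    """Leading-order parts of t_0, t_1 (= t_2) and t_3 (= t_4), ignoring C_i/s."""
    rec = r * math.log(N * r) if N * r > 1 else 0.0  # r * ln_+(N r)
    # Assumption: pick the case by a direct finite-N comparison.
    recombination_dominating = rec > N * mu**2
    if recombination_dominating:
        t0 = math.log(s / (mu * math.sqrt(N * r))) / s
        t3 = math.log(s**2 / (mu * rec)) / s
    else:
        t0 = math.log(s / (N * mu**2)) / s
        t3 = math.log(s**2 / (N * mu**3)) / s
    t1 = math.log(s / mu) / s
    return {
        "case": "recombination" if recombination_dominating else "mutation",
        "t0": t0,
        "t1_t2": t1,
        "t3_t4": t3,
    }


# Hypothetical parameters: N = 10^12, mu = N^-0.8, s = N^-0.3, r = N^-0.5 or r = 0.
N = 1e12
with_rec = leading_order_times(N, N**-0.8, N**-0.3, N**-0.5)
no_rec = leading_order_times(N, N**-0.8, N**-0.3, 0.0)
```

In both examples the leading terms are ordered as $t_{0}<t_{1}$ and $t_{2}<t_{3}$, matching the description that type 3 individuals appear before types 1 and 2 reach order $N$, and take over only afterwards.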

Overall, the behavior of the numbers of type 1, 2 and 3 individuals is similar in the sense that they first grow exponentially and then grow logistically. Types 1 and 2 grow simultaneously, but type 3 starts to grow later, due to the late appearance of type 3 individuals. The behavior of the process is split into five time intervals, which will be discussed below. During the time interval $[0,t_{1}]$, which we will call phase 1, most individuals are type 0. The type 1 and type 2 individuals appear from mutations from type 0 individuals. Since type 1 and type 2 individuals die at rate $1-s$, while the majority of the population, which is type 0, dies at rate 1, the numbers of descendants of these type 1 and 2 ancestors grow exponentially at rate approximately $s$. Since the total rate of mutation from type 0 to type 1 is approximately $N\mu$, we have

$$X_{i}(t)\approx\int_{0}^{t}N\mu\cdot e^{s(t-u)}\,du\approx\frac{N\mu}{s}e^{st}.$$

The type 3 individuals appear around time $t_{0}$ . From this time, the number of type 3 individuals will grow exponentially at rate about $2s$ , due to the fact that each type 3 individual dies at rate $1-2s$ , while most individuals in the population die at rate 1. The following proposition describes the process at time $t_{1}$ .

Proposition 2.

For $\epsilon>0$ and $\delta\in(0,1)$ , there is an event $A_{(1)}$ , such that for sufficiently large $N$ , we have that $P(A_{(1)})\geq 1-17\epsilon$ , and the following statements hold:

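1. On the event $A_{(1)}$, when $N$ is sufficiently large, for $i=1,2$,

$$(1-\delta^{2})e^{-C_{1}}N\leq X_{i}(t_{1})\leq(1+\delta^{2})e^{-C_{1}}N.$$

2. In the recombination dominating case, on the event $A_{(1)}$, there are positive constants $K^{+}_{1r}$ and $K^{-}_{1r}$ such that for sufficiently large $N$,

$$\frac{K^{-}_{1r}Nr\ln(Nr)}{s}\leq X_{3}(t_{1})\leq\frac{K^{+}_{1r}Nr\ln(Nr)}{s}.$$

3. In the mutation dominating case, on the event $A_{(1)}$, there are positive constants $K^{+}_{1m}$ and $K^{-}_{1m}$ such that for sufficiently large $N$,

$$\frac{K^{-}_{1m}N^{2}\mu^{2}}{s}\leq X_{3}(t_{1})\leq\frac{K^{+}_{1m}N^{2}\mu^{2}}{s}.$$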

This proposition says that when $N$ is sufficiently large, at time $t_{1}$, both type 1 and type 2 have established themselves in the population, with their numbers reaching order $N$. However, $\tilde{X}_{3}(t_{1})$ is only of order $r\ln(Nr)/s$ in the recombination dominating case, and only of order $N\mu^{2}/s$ in the mutation dominating case, which by (3) and (4) means that the number of type 3 individuals at time $t_{1}$ is not yet comparable to those of types 1 and 2.
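The order-$N$ size of $X_{i}(t_{1})$ can be checked against the phase 1 growth heuristic by a short computation. This is a sketch of the arithmetic only, using the exponential-growth approximation for types 1 and 2 and the leading form of $t_{1}$; it is not the rigorous argument of the paper.

```latex
% Phase 1 heuristic: mutants arrive at rate N*mu and each lineage grows at rate s.
\[
  \int_{0}^{t} N\mu\, e^{s(t-u)}\,du
  = \frac{N\mu}{s}\bigl(e^{st}-1\bigr)
  \approx \frac{N\mu}{s}\,e^{st}
  \qquad (st \gg 1).
\]
% Evaluate at t_1 = (1/s) ln(s/mu) - C_1/s, so that e^{s t_1} = (s/mu) e^{-C_1}:
\[
  \frac{N\mu}{s}\,e^{st_{1}}
  = \frac{N\mu}{s}\cdot\frac{s}{\mu}\,e^{-C_{1}}
  = e^{-C_{1}}N .
\]
```

The value $e^{-C_{1}}N$ is the center of the interval for $X_{i}(t_{1})$ in the first statement of Proposition 2.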

During the time interval $[t_{1},t_{2}]$, which we will call phase 2, the numbers of type 1 and 2 individuals now grow logistically, or more precisely,

$$\tilde{X}_{i}(t)\approx\frac{1}{2}\bigg(\frac{1}{1+Be^{-s(t-t_{1})}}\bigg),$$

for $i=1,2$ , where $B$ is some positive constant. The following proposition describes the process at time $t_{2}$ .

Proposition 3.

For $\epsilon>0$ and $\delta\in(0,1)$ , there is an event $A_{(2)}$ , such that for sufficiently large $N$ , we have that $P(A_{(2)})\geq 1-21\epsilon$ , and the following statements hold:

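1. On the event $A_{(2)}$, for sufficiently large $N$, for $i=1,2$,

$$\Big(\frac{1}{2}-\frac{3\delta^{2}}{2}\Big)N\leq X_{i}(t_{2})\leq\Big(\frac{1}{2}-\frac{\delta^{4}}{4}\Big)N.$$

2. In the recombination dominating case, on the event $A_{(2)}$, there are positive constants $K^{+}_{2r}$ and $K^{-}_{2r}$ such that for sufficiently large $N$,

$$\frac{K^{-}_{2r}Nr\ln(Nr)}{s}\leq X_{3}(t_{2})\leq\frac{K^{+}_{2r}Nr\ln(Nr)}{s}.$$

3. In the mutation dominating case, on the event $A_{(2)}$, there are positive constants $K^{+}_{2m}$ and $K^{-}_{2m}$ such that for sufficiently large $N$,

$$\frac{K^{-}_{2m}N^{2}\mu^{2}}{s}\leq X_{3}(t_{2})\leq\frac{K^{+}_{2m}N^{2}\mu^{2}}{s}.$$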

This proposition says that at time $t_{2}$ , almost half of the population becomes type 1, and almost the other half becomes type 2, while the number of type 3 individuals doesn’t change much from time $t_{1}$ .
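The logistic shape in phase 2 can be motivated by a deterministic mean-field calculation. The sketch below is a heuristic under simplifying assumptions made here (types 1 and 2 have the same frequency $x$, type 0 has frequency $1-2x$, and mutation, recombination and type 3 are neglected); it is not the proof used in the paper.

```latex
% Each death is replaced by a copy of a uniformly chosen individual, so a type
% with death rate d has frequency growing at per-capita rate (mean death rate) - d.
\[
  \bar d = (1-2x)\cdot 1 + 2x\,(1-s) = 1-2sx,
  \qquad
  \frac{dx}{dt} = x\,\bigl(\bar d-(1-s)\bigr) = s\,x\,(1-2x).
\]
% Solving this logistic equation from time t_1:
\[
  x(t) = \frac{1}{2}\cdot\frac{1}{1+Be^{-s(t-t_{1})}},
  \qquad
  B = \frac{1}{2x(t_{1})}-1 .
\]
```

As $t-t_{1}$ grows, $x(t)$ approaches $1/2$ from below, which is consistent with almost half of the population being type 1 and almost half being type 2 at time $t_{2}$.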

During the time interval $[t_{2},t_{3}]$ , which we will call phase 3, the majority of the population has become type 1 or type 2. The number of type 3 individuals continues to grow exponentially from time $t_{2}$ . However, since the majority of the population dies at rate $1-s$ , and a type 3 individual dies at rate $1-2s$ , the type 3 population grows exponentially at approximately rate $s$ . The following proposition describes the behavior of the process at time $t_{3}$ .

Proposition 4.

For $\epsilon>0$ and $\delta\in(0,1)$ , there is an event $A_{(3)}$ , such that for sufficiently large $N$ , we have that $P(A_{(3)})\geq 1-25\epsilon-7\delta-\delta^{2}$ , and the following statements hold:

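1. For sufficiently large $N$, on the event $A_{(3)}$, we have

$$X_{0}(t_{3})<\begin{cases}\displaystyle{\delta e^{-(1-3\delta)(C_{3}-C_{2})}N\cdot\bigg(\frac{r\ln(Nr)}{s}\bigg)^{1-3\delta}}&\text{in the recombination dominating case}\\ \displaystyle{\delta e^{-(1-3\delta)(C_{3}-C_{2})}N\cdot\bigg(\frac{N\mu^{2}}{s}\bigg)^{1-3\delta}}&\text{in the mutation dominating case}.\end{cases}$$

2. In both cases, there is a positive constant $K_{3}$ such that for sufficiently large $N$, on the event $A_{(3)}$, we have

$$K_{3}N\leq X_{3}(t_{3})\leq\delta^{2}N.$$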

This proposition says that by the time $t_{3}$ , the number of type 3 individuals has reached order $N$ . Moreover, from (3) and (4), there are almost no type 0 individuals left by time $t_{3}$ .
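The leading term of $t_{3}$ can be recovered from the phase 3 growth rate by a short heuristic computation, sketched below. It uses only the orders of magnitude of $X_{3}(t_{2})$ from Proposition 3 and growth at rate $s$, and it ignores all constants.

```latex
% Recombination dominating case: X_3(t_2) is of order N r ln(Nr) / s and grows
% at rate s, so it reaches order N after an additional time of about
\[
  t-t_{2}\approx\frac{1}{s}\ln\Bigl(\frac{s}{r\ln(Nr)}\Bigr),
  \qquad\text{hence}\qquad
  t\approx\frac{1}{s}\ln\Bigl(\frac{s}{\mu}\Bigr)
        +\frac{1}{s}\ln\Bigl(\frac{s}{r\ln(Nr)}\Bigr)
   =\frac{1}{s}\ln\Bigl(\frac{s^{2}}{\mu\,r\ln(Nr)}\Bigr).
\]
% Mutation dominating case: X_3(t_2) is of order N^2 mu^2 / s, which gives
\[
  t\approx\frac{1}{s}\ln\Bigl(\frac{s}{\mu}\Bigr)
        +\frac{1}{s}\ln\Bigl(\frac{s}{N\mu^{2}}\Bigr)
   =\frac{1}{s}\ln\Bigl(\frac{s^{2}}{N\mu^{3}}\Bigr).
\]
```

Both expressions agree with the leading terms of $t_{3}$ in the two cases.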

During the time interval $[t_{3},t_{4}]$ , which we will call phase 4, the number of type 3 individuals grows logistically. The following proposition describes the behavior of the process at time $t_{4}$ .

Proposition 5.

For $\epsilon>0$ and $\delta\in(0,1)$ , there is an event $A_{(4)}$ , such that for sufficiently large $N$ , we have that $P(A_{(4)})\geq 1-26\epsilon-7\delta-\delta^{2}$ , and on the event $A_{(4)}$ ,

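$$\bigg(1-\frac{5\delta^{2}}{4}\bigg)N\leq X_{3}(t_{4})\leq\bigg(1-\frac{3K_{3}}{4}\bigg)N,$$

and

$$X_{1}(t_{4})+X_{2}(t_{4})\geq\frac{K_{3}N}{2}.$$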

This proposition implies that by time $t_{4}$ , almost all individuals have become type 3, and only small fractions of type 1 and 2 individuals remain in the population.

After time $t_{4}$ , which we will call phase 5, the number of individuals that are not type 3 can be approximated by a subcritical branching process. The non-type 3 population is heading toward extinction, and type 3 becomes fixated in the population. The fixation of type 3 will occur around time $t^{*}_{N}(r_{N})$ .
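One way to relate the phase times to $t^{*}_{N}(r)$ is the following algebraic decomposition. The identity itself is exact; reading the last term as the approximate duration of phase 5 is a heuristic interpretation offered here, not a statement taken from the paper.

```latex
% Split N s^3 = s^2 * (N s) inside the logarithm defining t*_N(r):
\[
  t^{*}_{N}(r)
  =\frac{1}{s}\ln\biggl(\frac{s^{2}}{\mu\cdot\max\{N\mu^{2},\,r\ln_{+}(Nr)\}}\biggr)
   +\frac{1}{s}\ln(Ns).
\]
% The first term is the leading part of t_3 and t_4 in the corresponding case.
% Heuristic for the second term: if each of order N non-type-3 lineages survives
% to time t with probability of order s e^{-st}, then the expected number of
% surviving lineages, N s e^{-st}, drops to order 1 when t is about (1/s) ln(Ns).
```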

In section 3, we discuss the transition rates of the process. In section 4, we construct martingales and submartingales, and give expectation and variance formulas. They will be used in the proofs for phases 1, 2, and 3 in sections 5, 6, and 7. In section 5, we prove several lemmas on the process during phase 1, and at the end of the section, we give the proof of Proposition 2. Propositions 3, 4, and 5 will be proved in sections 6, 7, and 8, respectively. Finally, the proof of Theorem 1 will be given at the end of section 9.

3 On parameters and transition rates of the process

3.1 More inequalities on the parameters

Lemma 6.

The following statements hold.

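1. In the recombination dominating case,

$$1\ll Nr.$$

2. In the mutation dominating case,

$$r\ll N\mu^{2}.$$

3. In both cases,

$$r\ll s, \tag{12}$$

$$\frac{r}{s}\ln(Ns)\ll 1, \tag{13}$$

and

$$\frac{r}{s}\ln\bigg(\frac{s}{\mu}\bigg)\ll 1. \tag{14}$$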

Proof.

We will first prove statement 1. In the recombination dominating case, from conditions (2) and (6),

[TABLE]

which implies that $1\ll Nr$ .

Now, we will prove statement 2 by contradiction. Suppose there is a $c>0$ and an increasing sequence $\{N_{k}\}_{k=1}^{\infty}$ of natural numbers such that for all $k=1,2,3,...$ , we have

[TABLE]

From (7), we have that for all $k=1,2,3,\dots$,

[TABLE]

This leads to a contradiction, since $1\ll N\mu$ implies that

[TABLE]

as $k\rightarrow\infty$ .

Lastly, we will prove statement 3. First, we will consider the recombination dominating case. By (4) and (11),

[TABLE]

From (6) and (12), it follows that

[TABLE]

and because of (2), for sufficiently large $N$ ,

[TABLE]

which implies (14). For the mutation dominating case, we define $r^{*}_{N}$ such that $Nr^{*}_{N}$ is the solution of

[TABLE]

It follows that $N\mu^{2}\ll r^{*}_{N}\ln(Nr^{*}_{N})\ll s$ . Therefore, by the same argument above,

[TABLE]

and

[TABLE]

Also, from (7) and the fact that $N\mu^{2}\ll r^{*}_{N}\ln(Nr^{*}_{N})$ , for sufficiently large $N$ , we have $r_{N}\leq r^{*}_{N}$ . This fact along with (15), (16) and (17) imply (12), (13) and (14). ∎

3.2 Transition rates of the process

For the proof, we need to separate type 1 individuals into two groups: one that comes from mutation from type 0 individuals and another that comes from recombination between type 0 and type 3 individuals. We need to do the same for the other three types. The precise definitions are given below.

1. A type 1 (or 2) individual is called a type 1m (or 2m) ancestor, if it appears by mutation from a type 0 individual.

2. A type 1 (or 2) individual is called a type 1r (or 2r) ancestor, if it appears by recombination between a $b$ (or an $a$) allele from a type 0 individual and an $A$ (or a $B$) allele from a type 3 individual.

3. A type 1 individual $x$ is called an offspring of another type 1 individual $y$ if

   • $x$ receives the $A$ allele from $y$, or

   • $x$ receives the $b$ allele from $y$ and receives the $A$ allele from a type 3.

4. A type 2 individual $x$ is called an offspring of another type 2 individual $y$ if

   • $x$ receives the $B$ allele from $y$, or

   • $x$ receives the $a$ allele from $y$ and receives the $B$ allele from a type 3.

5. A type 1 (or 2) individual is called type 1m (or 2m), if it descends from a type 1m (or 2m) ancestor. A type 1 (or 2) individual is called type 1r (or 2r), if it descends from a type 1r (or 2r) ancestor.

6. A type 3 individual is called a type 3m ancestor, if it appears from mutation from a type 1 individual or a type 2 individual.

7. A type 3 individual is called a type 3r ancestor, if it appears by recombination between an $A$ allele from a type 1 individual and a $B$ allele from a type 2 individual.

8. A type 3 individual $x$ is called an offspring of another type 3 individual $y$ if

   • $x$ receives the $A$ allele from $y$, or

   • $x$ receives the $B$ allele from $y$ and receives the $A$ allele from a type 1 individual.

9. A type 3 individual is called type 3m, if it descends from a type 3m ancestor. A type 3 individual is called type 3r, if it descends from a type 3r ancestor.

10. A type 0 individual is called a type 0r ancestor, if it appears from recombination between an $a$ allele from a type 1 individual and a $b$ allele from a type 2 individual.

11. A type 0 individual $x$ is called an offspring of another type 0 individual $y$ if

   • $x$ receives the $a$ allele from $y$, or

   • $x$ receives the $b$ allele from $y$ and receives the $a$ allele from a type 2.

12. A type 0 individual is called a type 0r if it descends from a type 0r ancestor.

For $i=1,2,3$, we define $X_{im}(t)$ as the number of type $im$ individuals at time $t$, and for $i=0,1,2,3$, we define $X_{ir}(t)$ as the number of type $ir$ individuals at time $t$. Note that for $i=1,2,3$ and $t\geq 0$, we have $X_{i}(t)=X_{im}(t)+X_{ir}(t)$. Next, we define $X^{(a,b]}_{im}(t)$ and $X^{(a,b]}_{ir}(t)$ to be the numbers of type $im$ and $ir$ individuals at time $t$ whose ancestor appears in the time interval $(a,b]$. It follows that if $0\leq t\leq b$, for $i=1,2,3$, we have that $X_{im}^{(0,b]}(t)=X_{im}(t)$, and for $i=0,1,2,3$, we have that $X_{ir}^{(0,b]}(t)=X_{ir}(t)$. We will call an individual type im(a,b] (or ir(a,b]), if it is of type im (or type ir) and its ancestor appears in the time interval $(a,b]$. Lastly, we define $\tilde{X}_{im}(t)$, $\tilde{X}_{ir}(t)$, $\tilde{X}^{(a,b]}_{im}(t)$, and $\tilde{X}^{(a,b]}_{ir}(t)$ to be the fractions of type im, ir, im(a,b] and ir(a,b] individuals in the population at time $t$, respectively.

Now, consider the process $(X^{(a,b]}_{1m}(t),t\geq 0)$ . First, we consider the rate at which $X^{(a,b]}_{1m}(t)$ increases by 1. There are two ways to increase $X^{(a,b]}_{1m}(t)$ . First, a type 0 individual can mutate to a type 1 individual during the time interval $(a,b]$ , creating a type 1m(a,b] ancestor, which occurs at total rate

[TABLE]

Second, an individual that is not of type 1m(a,b] can die, which occurs at total rate

[TABLE]

and the new individual must be a type 1m(a,b]. The probability that recombination does not occur and the new individual has type 1m(a,b] is $(1-r)\tilde{X}^{(a,b]}_{1m}(t)$ . If recombination occurs, the new individual can come from combining an $A$ allele from a type 1m(a,b] individual with a $b$ allele from a type 0 or 1 individual, or combining an $A$ allele from a type 3 individual with a $b$ allele from a type 1m(a,b] individual. (Note that recombination between an $A$ allele from a type 3 individual and a $b$ allele from a type 0 individual creates an ancestor of type 1r.) So, the probability that recombination occurs and the new individual has type 1m(a,b] is

[TABLE]

Hence, the total rate that the number of descendants of type 1m(a,b] increases by 1 is

[TABLE]

Let us define

[TABLE]

and note that $X^{(a,b]}_{1m}(t)$ increases by 1 at rate $M_{1}^{(a,b]}(t)+B_{1m}^{(a,b]}(t)X_{1m}^{(a,b]}(t)$ .

Similarly, the rate that the number of type 1m(a,b] individuals decreases by 1 is given by

[TABLE]

where $(1-s)X_{1m}^{(a,b]}(t)$ is the total rate that type 1m(a,b] individuals die at time $t$ ,

[TABLE]

is the probability that we don’t create a type 1m(a,b] individual, and $\mu X_{1m}^{(a,b]}(t)$ corresponds to the total rate that type 1m(a,b] mutates to type 3. We define

[TABLE]

and note that the number of type 1m(a,b] individuals decreases by 1 at rate $D_{1m}^{(a,b]}(t)X_{1m}^{(a,b]}(t)$ .
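The jump rates derived above come from the underlying population dynamics. A minimal simulation sketch of the total type counts $(X_{0},X_{1},X_{2},X_{3})$ may help fix ideas. It assumes the model as summarized in the abstract, and it further assumes that, under recombination, the two alleles come from two parents drawn independently and uniformly from the population just before the death. The function name `simulate_counts`, the bit encoding of types, and the parameter values are illustrative choices, not part of the paper.

```python
import random

def simulate_counts(N, mu, s, r, t_max, seed=0):
    """Gillespie-style simulation of the type counts (X0, X1, X2, X3).

    Types are encoded by bits (an assumption of this sketch):
    bit 0 set = allele A, bit 1 set = allele B,
    so type 0 = ab, type 1 = Ab, type 2 = aB, type 3 = AB.
    """
    rng = random.Random(seed)
    x = [N, 0, 0, 0]                      # everyone starts as type 0
    death = (1.0, 1.0 - s, 1.0 - s, 1.0 - 2.0 * s)
    t = 0.0
    while True:
        # Events 0..3: a type-i individual dies.
        # Event 4: a type 0 individual mutates at one of its two loci.
        # Events 5, 6: a type 1 or type 2 individual mutates to type 3.
        rates = [death[i] * x[i] for i in range(4)]
        rates += [2.0 * mu * x[0], mu * x[1], mu * x[2]]
        t += rng.expovariate(sum(rates))
        if t > t_max:
            return x
        k = rng.choices(range(7), weights=rates)[0]
        if k < 4:
            # Parents are drawn from the population just before the death.
            def parent():
                return rng.choices(range(4), weights=x)[0]
            if rng.random() < r:
                # Recombination: first locus from one parent, second from another.
                child = (parent() & 1) | (parent() & 2)
            else:
                child = parent()
            x[k] -= 1
            x[child] += 1
        elif k == 4:
            x[0] -= 1
            x[rng.choice((1, 2))] += 1
        else:
            x[k - 4] -= 1                 # k = 5 is type 1, k = 6 is type 2
            x[3] += 1

final = simulate_counts(N=200, mu=0.01, s=0.05, r=0.02, t_max=20.0)
print(final, sum(final))
```

The population size is conserved by construction, since every death is paired with a birth and mutations only relabel individuals. Tracking the finer classes 1m(a,b] and 1r(a,b] would require storing an ancestry label per individual, which this sketch omits.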

We will now consider the process $(X^{(a,b]}_{1r}(t),t\geq 0)$ . We will first consider the rate that $X^{(a,b]}_{1r}(t)$ increases by 1. There are two ways to increase $X^{(a,b]}_{1r}(t)$ by 1. First, an individual that is not of type 1r(a,b] dies, and the recombination between an $A$ allele from a type 3 individual and a $b$ allele from a type 0 individual occurs during the time interval $(a,b]$ , which creates a type 1r(a,b] ancestor. This occurs at total rate of

[TABLE]

Second, an individual that is not of type 1r(a,b] dies, and a new type 1r(a,b] individual is born from the type 1r(a,b] individuals at that time. Similar to the way we obtained (19) and (20), by defining

[TABLE]

one can see that the rate that $X^{(a,b]}_{1r}(t)$ increases by 1 is $R_{1}^{(a,b]}(t)+B_{1r}^{(a,b]}(t)X^{(a,b]}_{1r}(t)$ .

We will now consider the rate at which $X^{(a,b]}_{1r}(t)$ decreases by 1. One way that $X^{(a,b]}_{1r}(t)$ decreases by 1 is when a type 1r(a,b] individual dies and the new individual is not of type 1r(a,b] (i.e., the new individual is not born from a type 1r(a,b] individual, and it is not a type 1r(a,b] ancestor). Another way is when a type $1r(a,b]$ individual mutates to a type 3 individual. By the same reasoning we used to obtain (22), the rate at which $X^{(a,b]}_{1r}(t)$ decreases by 1 is

[TABLE]

and note that the term $r\tilde{X}_{0}(t)\tilde{X}_{3}(t)1_{(a,b]}(t)$ is precisely the probability that a type 1r(a,b] ancestor is created. By defining

[TABLE]

one can see that the rate that $X^{(a,b]}_{1r}(t)$ decreases by 1 is $D_{1r}^{(a,b]}(t)X_{1r}^{(a,b]}(t)$ .

Now, we define

[TABLE]

By analogy, one can check that for $i=2,3$ , we have that $X_{im}^{(a,b]}(t)$ increases by 1 at rate $M_{i}^{(a,b]}(t)+B_{im}^{(a,b]}(t)X_{im}^{(a,b]}(t)$ and decreases by 1 at rate $D_{im}^{(a,b]}(t)X_{im}^{(a,b]}(t)$ . Also, for $i=0,2$ and 3, $X_{ir}^{(a,b]}(t)$ increases by 1 at rate $R_{i}^{(a,b]}(t)+B_{ir}^{(a,b]}(t)X_{ir}^{(a,b]}(t)$ and decreases by 1 at rate $D_{ir}^{(a,b]}(t)X_{ir}^{(a,b]}(t)$ .

For $i=1,2,3$ , and $0\leq a<b\wedge t$ , we define $G_{i}(t)=B_{im}^{(a,b]}(t)-D_{im}^{(a,b]}(t)$ , which is the growth rate of the type im(a,b] population at time $t$ . For $i=0,1,2,3$ , and $0\leq a<b\wedge t$ , we define $G_{ir}^{(a,b]}(t)=B_{ir}^{(a,b]}(t)-D_{ir}^{(a,b]}(t)$ . This is the growth rate of the type ir(a,b] population at time $t$ . Note that $G_{i}(t)$ does not depend on the interval $(a,b]$ , because from (21), (23), and the fact that $\tilde{X}_{0}(t)+\tilde{X}_{1}(t)+\tilde{X}_{2}(t)+\tilde{X}_{3}(t)=1$ ,

[TABLE]

Similarly, we have

[TABLE]

Also, by similar calculation, we have

[TABLE]

From the fact that $\tilde{X}_{0}(t)+\tilde{X}_{1}(t)+\tilde{X}_{2}(t)+\tilde{X}_{3}(t)=1$ , and $s\ll 1$ , it follows that for sufficiently large $N$ ,

[TABLE]

Lastly, for $i=0,1,2,3$ and $0\leq a\leq t$ , we define $X_{i}^{[a]}(t)$ to be the number of type $i$ individuals at time $t$ that descend from one of the type $i$ individuals at time $a$ . It follows that for $0\leq a\leq t\leq b$ and $i=1,2,3$ ,

[TABLE]

and

[TABLE]

Following the argument we used to obtain $B_{im}^{(a,b]}(t)$ and $D_{im}^{(a,b]}(t)$ , for $0\leq a\leq t$ , we define

[TABLE]

and note that for $i=0,1,2,3$ , the process $(X_{i}^{[a]}(t),t\geq a)$ increases by 1 at rate $B_{i}^{[a]}(t)X_{i}^{[a]}(t)$ , and decreases by 1 at rate $D_{i}^{[a]}(t)X_{i}^{[a]}(t)$ . Also, for all $t\geq a$ and $i=1,2,3$ , we can check that

[TABLE]

Lastly, we define $G_{0}(t)=B_{0}^{[a]}(t)-D_{0}^{[a]}(t)$ for all $t\geq a$ . It follows that

[TABLE]

and note that from (37),

[TABLE]

4 Important Martingales and Submartingales

In this section, we will define several martingales and submartingales that will be used frequently in the proof. First, for $i=1,2,3$ and for $0\leq a<b$ , when $0\leq t<a$ , we define $Z_{im}^{(a,b]}(t)=0$ , and when $0\leq a<t$ , we define

[TABLE]

Also, for $i=0,1,2,3$ and for $0\leq a<b$ , when $0\leq t<a$ , we define $Z_{ir}^{(a,b]}(t)=0$ , and when $0\leq a<t$ , we define

[TABLE]

It follows that for $t\geq a$ ,

[TABLE]

Let $(\mathcal{F}_{t})_{t\geq 0}$ be the natural filtration of the process $((X_{0}(t),X_{1}(t),X_{2}(t),X_{3}(t)),t\geq 0)$ .

Proposition 7.

For $i=1,2,3$ , the process $(Z_{im}^{(a,b]}(t),t\geq a)$ is a mean-zero martingale, and for $a\leq t$ ,

[TABLE]

Also, for $i=0,1,2,3$ , the process $(Z_{ir}^{(a,b]}(t),t\geq a)$ is a mean-zero martingale, and for $a\leq t$ ,

[TABLE]

Moreover, if $T$ is a stopping time and $T\geq a$ , then for $i=1,2,3$ , the process $(Z_{im}^{(a,b]}(t\wedge T),t\geq a)$ is a mean-zero martingale, and for $a\leq t$ ,

[TABLE]

Also, for $i=0,1,2,3$ , the process $(Z_{ir}^{(a,b]}(t\wedge T),t\geq a)$ is a mean-zero martingale, and for $a\leq t$ ,

[TABLE]

Proof.

The technique used in this proof was previously used in section 5.1 of [13]. We will prove the result for the process $(Z^{(a,b]}_{1m}(t),t\geq 0)$ . The results for the other processes can be proved in the same manner.

For $t\geq a$ , let $U(t)$ be the number of times in $[a,t]$ that the number of type $1m(a,b]$ individuals increases, and let $V(t)$ be the number of times in $[a,t]$ that the number of type $1m(a,b]$ individuals decreases. Then, $X^{(a,b]}_{1m}(t)=U(t)-V(t)$ . Next, we define

[TABLE]

and $W(t)=W_{+}(t)-W_{-}(t)$ , for all $t\geq a$ . Because $M_{1}^{(a,b]}(u)+B^{(a,b]}_{1m}(u)X^{(a,b]}_{1m}(u)$ and $D^{(a,b]}_{1m}(u)X^{(a,b]}_{1m}(u)$ are exactly the rates that the process $(X_{1m}^{(a,b]}(t),t\geq a)$ increases and decreases by 1 at time $u$ , and both $U(a)$ and $V(a)$ are 0, both the process $(W_{+}(t),t\geq a)$ and the process $(W_{-}(t),t\geq a)$ are mean-zero martingales. It follows that the processes $(W(t),t\geq a)$ and $(W_{+}(t)+W_{-}(t),t\geq a)$ are also mean-zero martingales. Since $W$ is locally of bounded variation, its quadratic variation is

[TABLE]

Now, consider the process $(\langle W\rangle(t),t\geq a)$ . The process $([W](t)-\langle W\rangle(t),t\geq a)$ is a mean-zero martingale, by the definition of the sharp bracket. From equations (49), (50), and the fact that $\big{(}W_{+}(t)+W_{-}(t),t\geq a\big{)}$ is a mean-zero martingale, we have that

[TABLE]

Now, for $t\geq a$ , we define

[TABLE]

Because both $(X_{1m}^{(a,b]}(t),t\geq a)$ and $(I(t),t\geq a)$ are semimartingales, such that $(I(t),t\geq 0)$ has continuous paths and the process $(X_{1m}^{(a,b]}(t),t\geq a)$ is locally of bounded variation,

[TABLE]

Also, because

[TABLE]

for all $t\geq a$ , we have

[TABLE]

Using the Integration by Parts formula, we have

[TABLE]

Therefore, from (45) and (51),

[TABLE]

From (21) and (23), we have $B_{1}(t)\in[0,1]$ and $D_{1}(t)\in[0,1+\mu]$ for all $t\geq a$ . So, $G_{1}(t)\in[-1-\mu,1]$ for all $t\geq a$ . Thus,

[TABLE]

for all $t\geq a$ . Hence, for each $t\geq a$ , we have $E[\int_{0}^{t}I^{2}(u)d\langle W\rangle(u)]<\infty$ . Therefore, from (52), the process $\big{(}Z^{(a,b]}_{1m}(t),t\geq 0\big{)}$ is a square integrable martingale with

[TABLE]

This process has mean zero, because $Z^{(a,b]}_{1m}(a)=0$ . By Corollary 8.25 of [9], $\textup{Var}\big{(}Z^{(a,b]}_{1m}(t)\big{)}=E\big{[}\big{(}Z^{(a,b]}_{1m}(t)\big{)}^{2}\big{]}=E\big{[}\big{\langle}Z^{(a,b]}_{1m}\big{\rangle}(t)\big{]}$ , and this proves the variance formula by using (53) and (54). Lastly, because a stopped martingale is a martingale, the process $(Z_{1m}^{(a,b]}(t\wedge T),t\geq a)$ is a mean-zero martingale, and by the same argument above, we can get the variance formula for the process $(Z_{1m}^{(a,b]}(t\wedge T),t\geq a)$ . ∎
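The mean-zero property can be sanity-checked in a constant-rate toy case. The displayed definition of $Z^{(a,b]}_{1m}$ is not reproduced here, so the sketch below assumes the standard form $Z(t)=e^{-gt}X(t)-\int_{0}^{t}me^{-gu}\,du$ for a birth-death process with constant immigration rate $m$ and constant per-capita growth rate $g\neq 0$ (birth rate minus death rate), started from $X(0)=0$. The name `mean_of_Z` and all parameter values are hypothetical.

```python
import math

def mean_of_Z(m, birth, death, t, n_steps=200_000):
    """E[Z(t)] for Z(t) = exp(-g t) X(t) - int_0^t m exp(-g u) du, g = birth - death.

    E[X(t)] is obtained by Euler-integrating d/dt E[X] = m + g E[X] with
    E[X(0)] = 0, which is the mean equation of a linear birth-death
    process with immigration. Requires g != 0.
    """
    g = birth - death
    dt = t / n_steps
    mean_x = 0.0
    for _ in range(n_steps):
        mean_x += (m + g * mean_x) * dt
    # Closed form of the compensating integral int_0^t m exp(-g u) du.
    compensator = m * (1.0 - math.exp(-g * t)) / g
    return math.exp(-g * t) * mean_x - compensator

# Approximately 0, up to the Euler discretization error.
print(mean_of_Z(m=2.0, birth=1.0, death=0.9, t=5.0))
```

In the paper the rates $M_{1}^{(a,b]}$, $B_{1m}^{(a,b]}$ and $D_{1m}^{(a,b]}$ are random and time dependent, so this check only illustrates why the exponential discount and the subtracted integral cancel the drift; it does not replace the stochastic-calculus argument in the proof.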

Since the process $((X_{0}(t),X_{1}(t)-X^{(a,b]}_{1m}(t),X^{(a,b]}_{1m}(t),X_{2}(t),X_{3}(t)),t\geq 0)$ is a continuous-time Markov chain, combining Proposition 7 and the Markov property yields the following result.

Corollary 8.

If $T$ is a stopping time and $T\geq a$ , then for $i=1,2,3$ and $a\leq t$ ,

[TABLE]

and for $i=0,1,2,3$ and for $a\leq t$ ,

[TABLE]

Now, for $i=0,1,2,3$ and $0\leq a\leq t$ , we define

[TABLE]

By a similar argument to the one used in proving Proposition 7 and Corollary 8, we get the following result.

Proposition 9.

If $T$ is a stopping time with $T\geq a$ , then for $i=0,1,2,3$ , the process $(Z_{i}^{[a]}(t),t\geq a)$ is a martingale, and for all $a\leq t$ ,

[TABLE]

Lastly, for $i=1,2,3$ , for $0\leq a<b$ and $a\leq t$ , let us define

[TABLE]

and for $i=0,1,2,3$ , for $0\leq a<b$ and $a\leq t$ , we define

[TABLE]

Proposition 10.

If $T$ is a stopping time and $T\geq a$ , for $i=1,2,3$ , the process $(W_{im}^{(a,b]}(t\wedge T),t\geq a)$ is a submartingale, and for $a\leq t$ ,

[TABLE]

For $i=0,1,2,3$ the process $(W_{ir}^{(a,b]}(t\wedge T),t\geq a)$ is a submartingale, and for $a\leq t$ ,

[TABLE]

Proof.

Consider the process $\big{(}W_{im}^{(a,b]}(t\wedge T),t\geq a\big{)}$ . Because $B_{im}^{(a,b]}(t)\in[0,1]$ and $D_{im}^{(a,b]}(t)\in[0,1+\mu]$ , we have that $G_{im}^{(a,b]}(t)\in[-1-\mu,1]$ . Thus,

[TABLE]

for all $t\geq a$ . So, $E\big{[}W^{(a,b]}_{im}(t\wedge T)\big{]}<\infty$ for all $t\geq a$ .

From (45) and (56), for all $t\geq a$ ,

[TABLE]

For $a\leq t^{\prime}<t$ , by Proposition 7, we have

[TABLE]

Thus, the process $(W_{im}^{(a,b]}(t\wedge T),t\geq a)$ is a submartingale. From (57) and from the fact that the process $(Z_{im}^{(a,b]}(t\wedge T),t\geq a)$ is a mean-zero martingale by Proposition 7,

[TABLE]

The proof for the process $W^{(a,b]}_{ir}$ can be done by a similar argument. ∎

5 Phase 1 and the proof of Proposition 2

5.1 Notations

First, note that to prove Propositions 2, 3, 4 and 5, it is enough to prove that they hold for all small values of $\epsilon$ and $\delta$ . We choose $\epsilon$ and $\delta$ as follows:

[TABLE]

and

[TABLE]

We will now define several constants, fixed times, and stopping times. In both the recombination dominating case and the mutation dominating case, we pick the following constants:

[TABLE]

Next, we define several fixed times as follows:

[TABLE]

and in both cases, we define

[TABLE]

It follows from these definitions, the fact that $1\ll N\mu$ , and the fact that $1\ll Nr$ in the recombination dominating case that for sufficiently large $N$ , we have $0<t_{0,m}<t_{0,m}^{+}<t_{1}$ and $0<t_{0,r}<t_{1}$ . Now, in both cases, we define the following stopping times:

[TABLE]

Lastly, we define the following events:

[TABLE]

Also, we define

[TABLE]

We will show that these events occur with high probability. Here, we will prove some inequalities involving $G_{1}(t),G_{2}(t)$ and $G_{3}(t)$ , which will be used quite often in this section.

Lemma 11.

For sufficiently large $N$ , and $t\in[0,t_{1}\wedge T_{(1)})$ , the following statements hold:

1. $X_{i}(t)\leq\eta N$ , for $i=1,2,3$ .

2. $G_{1}(t)\leq s$ , $G_{2}(t)\leq s$ , and $G_{3}(t)\leq 2s$ .

3. $G_{1}(t)\geq s-4\eta s-r-\mu$ , $G_{2}(t)\geq s-4\eta s-r-\mu$ , and $G_{3}(t)\geq 2s-4\eta s-r$ .

4. For $0<a<b$ , we have $G_{1r}^{(a,b]}(t)\leq s+r1_{(a,b]}(t)$ , $G_{2r}^{(a,b]}(t)\leq s+r1_{(a,b]}(t)$ , and $G_{3r}^{(a,b]}(t)\leq 2s+r1_{(a,b]}(t)$ .

5. For $0<a<b$ , we have $G_{1r}^{(a,b]}(t)\geq s-4\eta s-r-\mu$ , $G_{2r}^{(a,b]}(t)\geq s-4\eta s-r-\mu$ , and $G_{3r}^{(a,b]}(t)\geq 2s-4\eta s-r$ .

Proof.

By the definition of $\eta$ , $t_{1}$ and $T_{(1)}$ in (65), (69) and (73), for every $t\in[0,t_{1}\wedge T_{(1)})$ , and for $i=1,2,3$ ,

[TABLE]

For statement 2, since $0\leq\tilde{X}_{1}(t)+\tilde{X}_{2}(t)+\tilde{X}_{3}(t)\leq 1$ for all $t\geq 0$ , and $s\ll 1$ , it follows that for sufficiently large $N$ , we have $0<1-2s\leq 1-s\tilde{X}_{1}(t)-s\tilde{X}_{2}(t)-2s\tilde{X}_{3}(t)\leq 1$ for all $t\geq 0$ . Thus, by the definition of $G_{1}(t)$ in (32), for sufficiently large $N$ , we have $G_{1}(t)\leq s$ for all $t\in[0,t_{1}\wedge T_{(1)})$ . Also, by part 1, if $t\in[0,t_{1}\wedge T_{(1)})$ , then $1-\tilde{X}_{1}(t)-\tilde{X}_{2}(t)-2\tilde{X}_{3}(t)\geq 1-4\eta$ . Again, by using the definition of $G_{1}(t)$ in (32), we get the lower bound of $G_{1}(t)$ in statement 3. Both the upper and lower bounds for $G_{3}(t)$ can be shown by similar arguments. Lastly, we can prove statements 4 and 5 by using (34), (35) and (36) along with statements 1, 2 and 3 of this lemma. ∎
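The bounds on $G_{1}(t)$ can be illustrated numerically in the special case $r=0$. Under that assumption, a direct computation from the birth and death mechanism described in Section 3 gives $G_{1}(t)=s\big(1-\tilde{X}_{1}(t)-\tilde{X}_{2}(t)-2\tilde{X}_{3}(t)\big)-\mu$ ; this expression is derived here for illustration and is not a transcription of (32). The sketch below checks the two bounds of the lemma on a grid of type fractions that each lie in $[0,\eta]$. The function names and parameter values are hypothetical.

```python
from itertools import product

def growth_rate_type1_no_recombination(x1, x2, x3, s, mu):
    """G_1 in the special case r = 0 (an assumption of this sketch).

    x1, x2, x3 are the fractions of type 1, 2 and 3 individuals.
    """
    return s * (1.0 - x1 - x2 - 2.0 * x3) - mu

def bounds_hold(s, mu, eta, grid=5):
    """Check s - 4*eta*s - mu <= G_1 <= s whenever every fraction is at most eta."""
    levels = [eta * k / (grid - 1) for k in range(grid)]
    upper = s
    lower = s - 4.0 * eta * s - mu        # the lower bound with r = 0
    tol = 1e-12                           # guards against rounding error only
    return all(
        lower - tol
        <= growth_rate_type1_no_recombination(x1, x2, x3, s, mu)
        <= upper + tol
        for x1, x2, x3 in product(levels, repeat=3)
    )

print(bounds_hold(s=0.01, mu=1e-4, eta=0.05))
```

The lower bound is attained when all three fractions equal $\eta$ , which is why the factor $4\eta$ (one each from types 1 and 2, two from type 3) appears in statement 3.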

5.2 Upper bounds for expectations

In this section, we are going to prove some results on the upper bounds for the expectations of $X_{im}^{(a,b]}(t\wedge T_{(1)})$ and $X_{ir}^{(a,b]}(t\wedge T_{(1)})$ .

Lemma 12.

For sufficiently large $N$ , for $i=1,2$ and $t\in[0,t_{1}]$ , we have

[TABLE]

and

[TABLE]

Proof.

The proof is similar to Lemma 5.1 in [13]. We will show the proof for $i=1$ , since the argument is similar for $i=2$ . We will first show that for sufficiently large $N$ , for $0\leq a<b\leq t_{1}$ , and for $t\in[0,t_{1}]$ , we have

[TABLE]

If $t\in[0,a)$ , this inequality is trivial, since by the definition of $X_{1m}^{(a,b]}(t)$ , we have $X_{1m}^{(a,b]}(t)=0$ . Assume that $t\in[a,t_{1}]$ . By Proposition 7 and (45), we have $E\big{[}Z_{1m}^{(a,b]}(t\wedge T_{(1)})\big{]}=0$ , and

[TABLE]

Note that in the event that $T_{(1)}<a$ , we interpret the integral from $a$ to $t\wedge T_{(1)}$ as 0. Also, from the definition of $X_{1m}^{(a,b]}(t)$ , in the event $T_{(1)}<a$ , we have $X_{1m}^{(a,b]}(t\wedge T_{(1)})=0$ . Now, using the upper bound for $G_{1}(t)$ in Lemma 11, we know that for sufficiently large $N$ , for $0\leq a<b\leq t_{1}$ , and $t\in[a,t_{1}]$ ,

[TABLE]

Next, we use the lower bound for $G_{1}(t)$ in Lemma 11. From (18), for sufficiently large $N$ , for $0\leq a<b\leq t_{1}$ , and $t\in[a,t_{1}]$ ,

[TABLE]

From (86), (87) and (88), we have the inequality (85).

In the second part of the proof, for each $n\in\mathbb{N}$ , let $t^{\prime}_{j}=(b-a)j/n+a$ , for $j=0,1,...,n$ . It follows from

[TABLE]

By letting $n\rightarrow\infty$ , we have that for sufficiently large $N$ , $0\leq a<b\leq t_{1}$ , and $t\in[a,t_{1}]$ ,

[TABLE]

The inequality (84) follows from the fact that $X_{1m}(t\wedge T_{(1)})=X^{(0,t]}_{1m}(t\wedge T_{(1)})$ . From (89), it follows that

[TABLE]

and the proof is completed. ∎

Lemma 13.

For sufficiently large $N$ , for $i=1,2$ , and $t\in[0,t_{1}]$ , we have

[TABLE]

and

[TABLE]

Proof.

The proof is similar to the proof of Lemma 12. We will show the proof for $i=1$ , and the same argument can be used when $i=2$ . In the first part of this proof, we will show that for sufficiently large $N$ , for $0\leq a<b\leq t_{1}$ , and for $t\in[0,t_{1}]$ , we have

[TABLE]

If $t\in[0,a)$ , this inequality is trivial, since by the definition of $X_{1r}^{(a,b]}(t)$ , we have $X_{1r}^{(a,b]}(t)=0$ . Assume that $t\in[a,t_{1}]$ . By Proposition 7 and (46), we have $E\big{[}Z_{1r}^{(a,b]}(t\wedge T_{(1)})\big{]}=0$ , and

[TABLE]

Using the upper bound for $G^{(a,b]}_{1r}(t)$ in Lemma 11, we know that for sufficiently large $N$ , for $0\leq a<b\leq t_{1}$ , and $t\in[a,t_{1}]$ ,

[TABLE]

Then, using the lower bound for $G_{1r}^{(a,b]}(t)$ in Lemma 11, along with the upper bound for $R_{1}^{(a,b]}(t)$ in (38) and the definition of $T_{3}$ in (72), we have that for sufficiently large $N$ , for $0\leq a<b\leq t_{1}$ , and $t\in[a,t_{1}]$ ,

[TABLE]

From (93), (94) and (95), we have the inequality (92). Lastly, by using (92) and following the argument in the second part of the proof of Lemma 12, we can prove (90) and (91). ∎

Lemma 14.

For sufficiently large $N$ and for $t\in[0,t_{1}]$ , we have

[TABLE]

and

[TABLE]

Proof.

The argument in this proof is similar to that of Lemma 12. We will first show that for sufficiently large $N$ , for $0\leq a<b\leq t_{1}$ , and for $t\in[0,t_{1}]$ , we have

[TABLE]

If $t\in[0,a)$ , this inequality is trivial, since by the definition of $X_{3m}^{(a,b]}(t)$ , we have $X_{3m}^{(a,b]}(t)=0$ . Assume that $t\in[a,t_{1}]$ . By Proposition 7 and (45), we have $E\big{[}Z_{3m}^{(a,b]}(t\wedge T_{(1)})\big{]}=0$ , and

[TABLE]

Using the upper bound for $G_{3}(t)$ in Lemma 11, we know that for sufficiently large $N$ , for $0\leq a<b\leq t_{1}$ , and $t\in[a,t_{1}]$ ,

[TABLE]

Now, we use the formula for $M_{3}^{(a,b]}(t)$ in (28), the lower bound for $G_{3}(t)$ in Lemma 11, and the definition of $T_{1}$ and $T_{2}$ in (70) and (71). It follows that for sufficiently large $N$ , $0\leq a<b\leq t_{1}$ , and $t\in[a,t_{1}]$ ,

[TABLE]

From (100), (101) and (102), we have the inequality (99). By following the argument in the second part of the proof of Lemma 12, it follows that for sufficiently large $N$ , when $0\leq a<t_{1}$ and $t\in[a,t_{1}]$ ,

[TABLE]

and

[TABLE]

The inequality (96) follows from (103) and the fact that $X_{3m}(t\wedge T_{(1)})=X^{(0,t]}_{3m}(t\wedge T_{(1)})$ , and the inequalities (97) and (98) follow from (104) and the definitions of $t_{0,m}$ and $t_{0,m}^{+}$ in (67) and (68). ∎

Lemma 15.

For sufficiently large $N$ and $0\leq a<t_{1}$ , if $t\in[0,t_{1}]$ , we have

[TABLE]

and if $t\in[a,t_{1}]$ ,

[TABLE]

Proof.

The proof is similar to the proof of Lemma 12. We will first show that for sufficiently large $N$ , for $0\leq a<b\leq t_{1}$ and $t\in[0,t_{1}]$ , we have

[TABLE]

If $t\in[0,a)$ , this inequality is trivial, since by the definition of $X_{3r}^{(a,b]}(t)$ , we have $X_{3r}^{(a,b]}(t)=0$ . Assume that $t\in[a,t_{1}]$ . By Proposition 7 and (46), we have $E\big{[}Z_{3r}^{(a,b]}(t\wedge T_{(1)})\big{]}=0$ , and

[TABLE]

Using the upper bound for $G^{(a,b]}_{3r}(t)$ in Lemma 11, we know that for sufficiently large $N$ , for $0\leq a<b\leq t_{1}$ , and $t\in[a,t_{1}]$ ,

[TABLE]

Then, using the lower bound for $G_{3r}^{(a,b]}(t)$ in Lemma 11, along with the upper bound for $R_{3}^{(a,b]}(t)$ in (38) and the definitions of $T_{1}$ and $T_{2}$ in (70) and (71), we have that for sufficiently large $N$ , for $0\leq a<b\leq t_{1}$ , and $t\in[a,t_{1}]$ ,

[TABLE]

From (108), (109) and (110), we have the inequality (107). By similar argument to the second part of the proof of Lemma 12, we can show that for sufficiently large $N$ , for $0\leq a<b\leq t_{1}$ , and $t\in[a,t_{1}]$ ,

[TABLE]

and

[TABLE]

The inequality (105) follows from the inequality (111) and the fact that $X_{3r}^{(0,t]}(t\wedge T_{(1)})=X_{3r}(t\wedge T_{(1)})$ . The inequality (106) is a special case of the inequality (112) when $b=t_{1}$ . ∎

Using these upper bounds on expectations, we can prove that when $N$ is sufficiently large, the event $T_{(1)}>t_{1}$ occurs with probability close to 1, and the proof is shown below.

Lemma 16.

For sufficiently large $N$ , we have $P(A_{1}^{c})\leq 2\epsilon$ .

Proof.

Recall the definition of $A_{1}$ in (74). First, note that

[TABLE]

Now, consider the term $P(t_{1}\wedge T_{(1)}=T_{i})$ , for $i=1,2$ . Using Markov’s inequality, Lemmas 12 and 13, the definition of $t_{1}$ in (69), and (14), for sufficiently large $N$ ,

[TABLE]

Next, consider the term $P(t_{1}\wedge T_{(1)}=T_{3})$ . By Markov’s inequality, Lemma 14, Lemma 15, and using (14), for sufficiently large $N$ , we have

[TABLE]

Thus, from (113), (114), (115) and the way we choose $K$ and $C_{1}$ in (60) and (61), for sufficiently large $N$ , we have $P(T_{(1)}\leq t_{1})\leq 6K^{-1}+5Ke^{-C_{1}}\leq 2\epsilon$ . ∎

5.3 The variance bounds

By using the upper bounds for expectations, the variance formulas in Proposition 7, and the $L^{2}$ -maximal inequality, we can show that the probability that each of the events $A_{2},A_{3},A_{4},A_{5},A_{6},A_{7}$ occurs is at least $1-\epsilon$ .

Lemma 17.

The following statements hold:

1. For sufficiently large $N$ , and for $i=2,3,4,5,6$ , we have $P(A_{i}^{c})\leq\epsilon$ .

2. In the recombination dominating case, for sufficiently large $N$ , we have $P(A_{7}^{c})\leq\epsilon$ .

Proof.

Recall the definitions of the events $A_{2},A_{3},A_{4},A_{5},A_{6},A_{7}$ in (75) - (78). We will first prove that $P(A_{2}^{c})\leq\epsilon$ , when $N$ is sufficiently large. From (21), (23) and the facts that $\mu\ll s,r\ll s$ , and $s\ll 1$ , for sufficiently large $N$ and for $t\geq 0$ ,

[TABLE]

and

[TABLE]

From Proposition 7, Lemma 11, and Lemma 12, for sufficiently large $N$ ,

[TABLE]

From the definition of $t_{1}$ in (69) along with (14), and the facts that $\mu\ll s$ and $r\ll s$ , we have that

[TABLE]

By the way we choose $\epsilon,K$ and $\eta$ in (58), (60) and (65),

[TABLE]

By (116), (117) and the fact that $s\ll 1$ , for sufficiently large $N$ ,

[TABLE]

By the $L^{2}$ -maximal inequality, for sufficiently large $N$ ,

[TABLE]

Hence, we have shown that $P(A_{2}^{c})\leq\epsilon$ . The proof for $P(A_{3}^{c})\leq\epsilon$ is in fact the same as that for $P(A_{2}^{c})\leq\epsilon$ .

Now, we will prove that $P(A_{4}^{c})\leq\epsilon$ . From (24), (25) and the facts that $\mu\ll s,r\ll s$ , and $s\ll 1$ , for sufficiently large $N$ , for all $t\geq 0$ , we have $B_{1r}^{(0,t_{1}]}(t)\leq 1$ and $D_{1r}^{(0,t_{1}]}(t)\leq 1$ . From Proposition 7, Lemma 11, and inequality (38), for sufficiently large $N$ ,

[TABLE]

From Lemma 13 and the definition of $t_{1}$ in (69), for sufficiently large $N$ ,

[TABLE]

Therefore, from (117), (118), and the fact that $\mu\ll s\ll 1$ , for sufficiently large $N$ ,

[TABLE]

By the $L^{2}$ -maximal inequality, for sufficiently large $N$ ,

[TABLE]

We have proved that $P(A_{4}^{c})\leq\epsilon$ . The proof for $P(A_{5}^{c})\leq\epsilon$ is the same as the proof for $P(A_{4}^{c})\leq\epsilon$ .

Next, we will prove that $P(A_{6}^{c})\leq\epsilon$ . From (26), (27) and the facts that $\mu\ll s,r\ll s$ , and $s\ll 1$ , for sufficiently large $N$ , for all $t\geq t_{0,m}^{+}$ , we have $B_{3m}^{(t_{0,m}^{+},t_{1}]}(t)\leq 1$ and $D_{3m}^{(t_{0,m}^{+},t_{1}]}(t)\leq 1$ . From Proposition 7, Lemma 11, and the definitions of $T_{1}$ and $T_{2}$ in (70) and (71), for sufficiently large $N$ ,

[TABLE]

By Lemma 14, and the definition of $t_{0,m}^{+}$ in (68), for sufficiently large $N$ ,

[TABLE]

From (117) and (118) along with the fact that $s\ll 1$ , for sufficiently large $N$ ,

[TABLE]

By the $L^{2}$ -maximal inequality, we have that for sufficiently large $N$ ,

[TABLE]

Lastly, we will prove part 2. From (29), (30) and the fact that $\mu\ll s,r\ll s$ , and $s\ll 1$ , for sufficiently large $N$ , for all $t\geq 0$ , we have $B_{1r}^{(0,t_{1}]}(t)\leq 1$ and $D_{1r}^{(0,t_{1}]}(t)\leq 1$ . From Proposition 7, Lemma 11, inequality (39), and the definition of $T_{1}$ and $T_{2}$ in (70) and (71), for sufficiently large $N$ ,

[TABLE]

By Lemma 15 and the definitions of $t_{0,r}$ and $t_{1}$ in (66) and (69), for sufficiently large $N$ ,

[TABLE]

Because in the recombination dominating case, $1\ll Nr$ , by using the fact that $s\ll 1$ along with (117) and (118), we have that for sufficiently large $N$ ,

[TABLE]

By the $L^{2}$ -maximal inequality, for sufficiently large $N$ ,

[TABLE]

which proves part 2. ∎
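Each part of the proof above ends by applying the $L^{2}$ -maximal inequality to a martingale whose second moment has been bounded. As a small self-contained illustration of that final step, the sketch below verifies two standard forms of Doob's inequality exactly for a simple symmetric random walk $S_{k}$ , by enumerating all paths: $P(\max_{k\leq n}|S_{k}|\geq x)\leq E[S_{n}^{2}]/x^{2}$ and $E[\max_{k\leq n}S_{k}^{2}]\leq 4E[S_{n}^{2}]$ . The random walk, the function name `running_max_abs`, and the values of $n$ and $x$ are illustrative assumptions; the paper applies the inequality to the continuous-time martingales $Z$ of Proposition 7.

```python
from itertools import product

def running_max_abs(n):
    """Return max_{k<=n} |S_k| for every one of the 2^n equally likely paths."""
    peaks = []
    for steps in product((-1, 1), repeat=n):
        pos = peak = 0
        for d in steps:
            pos += d
            peak = max(peak, abs(pos))
        peaks.append(peak)
    return peaks

n, x = 12, 6
peaks = running_max_abs(n)
tail = sum(p >= x for p in peaks) / len(peaks)                 # P(max |S_k| >= x)
second_moment_of_max = sum(p * p for p in peaks) / len(peaks)  # E[max S_k^2]
variance_end = float(n)                                        # E[S_n^2] = n

print(tail, variance_end / x ** 2)
print(second_moment_of_max, 4.0 * variance_end)
```

In both printed lines the first number is bounded by the second, which is the pattern used in the lemma: a bound on the terminal second moment controls the whole path up to time $t_{1}\wedge T_{(1)}$ .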

5.4 Results on type 3 individuals

In this section, we will show that the events $A_{8}$ and $A_{9}$ as defined in (79) and (80) occur with high probability. That is, with probability close to 1, there are no type 3m (or 3r) individuals at time $t_{1}$ that are descended from type 3m (or 3r) ancestors that appear before time $t_{0,m}$ (or $t_{0,r}$ ). The proof consists of two main ideas.

1. With probability close to 1, the number of type 3m (or 3r) ancestors that appear before time $t_{0,m}$ (or $t_{0,r}$ ) is small.

2. With probability close to 1, none of these early ancestors has a living descendant at time $t_{1}$ .

At the end of this subsection, we will show that the events $A_{10}$ and $A_{11}$ as defined in (81) and (82) also occur with high probability.

Lemma 18.

Define $m(t)$ and $\rho(t)$ to be the number of type 3m ancestors and 3r ancestors respectively that appear in the time interval $(0,t]$ . For sufficiently large $N$ , the following statements hold:

1. $P\bigg(m(t_{0,m}\wedge T_{(1)})\geq\frac{e^{-C_{0,m}/2}}{s}\bigg)\leq\epsilon.$

2. $P\bigg(\rho(t_{0,r}\wedge T_{(1)})\geq\frac{e^{-C_{0,r}+1}}{s}\bigg)\leq\epsilon.$

Proof.

The process $(m(t),t\geq 0)$ is a pure birth process with total birth rate $M_{3}^{(0,t]}(t)$ as defined in (28). Then, there is a mean-zero martingale $(W^{\prime}(t),t\geq 0)$ such that for all $t\geq 0$ ,

[TABLE]

By Doob’s stopping theorem, $(W^{\prime}(t\wedge T_{(1)}),t\geq 0)$ is a mean-zero martingale. Thus,

[TABLE]

So, by Markov’s inequality and by the way we choose $C_{0,m}$ in (62),

[TABLE]

Now, consider the process $(\rho(t),t\geq 0)$ . This is a pure birth process whose birth rate at time $t$ is $R_{3}^{(0,t]}(t)$ , as defined in (31). Then, there is a mean-zero martingale $(W^{\prime\prime}(t),t\geq 0)$ such that for all $t\geq 0$ ,

[TABLE]

By Doob’s stopping theorem, $(W^{\prime\prime}(t\wedge T_{(1)}),t\geq 0)$ is a mean-zero martingale. Thus,

[TABLE]

From the definition of $t_{0,r}$ in (66), if we are in the recombination dominating case or in the mutation dominating case with $Nr\geq e$ ,

[TABLE]

and in the mutation dominating case when $Nr<e$ , we have

[TABLE]

Hence, from (119),

[TABLE]

Lastly, by Markov’s inequality and the definition of $C_{0,r}$ in (64),

[TABLE]

∎

Lemma 19.

For $i\in\mathbb{N}$ , define $\tau_{i,m}$ to be the time that the $i$th type 3m ancestor appears, where we set $\tau_{i,m}=\infty$ if the $i$th type 3m ancestor never appears. Let $Y_{i,m}(t)$ be the number of descendants of the $i$th type 3m ancestor alive at time $t$ . Then, for sufficiently large $N$ , for all $i\in\mathbb{N}$ ,

[TABLE]

Proof.

First, define $\tilde{Y}_{i,m}(t)=Y_{i,m}(t)/N$ for all $t\geq 0$ and $i\in\mathbb{N}$ . By following the same reasoning that led us to get the rates in (26) and (27), we have that on the event $\tau_{i,m}\leq t_{0,m}\wedge T_{(1)}$ , the process $(Y_{i,m}(t+\tau_{i,m}),t\geq 0)$ is a birth-death process with $Y_{i,m}(\tau_{i,m})=1$ , where each individual gives birth at rate

[TABLE]

and dies at rate

[TABLE]

Note that for $t\geq 0$ ,

[TABLE]

and

[TABLE]

For $t\geq 0$ , define $\lambda(t)=\int_{\tau_{i,m}}^{t+\tau_{i,m}}1-\tilde{Y}_{i,m}(v)dv$ . Define $Y_{i,m}^{*}(t)=Y_{i,m}(\lambda^{-1}(t)+\tau_{i,m})$ for $t\in[0,\lambda((t_{1}\wedge T_{(1)})-\tau_{i,m})]$ . The process $(Y^{*}_{i,m}(t),0\leq t<\lambda((t_{1}\wedge T_{(1)})-\tau_{i,m}))$ is a birth-death process with $Y^{*}_{i,m}(0)=1$ , where each individual gives birth at rate

[TABLE]

and dies at rate

[TABLE]

Let $(Y^{\#}(t),t\geq 0)$ be a birth-death process with $Y^{\#}(0)=1$ , in which each individual gives birth at rate 1 and dies at rate $1-2s$ . From the generating function of the birth and death process (see Section 5 of Chapter III of [1]), for $t\geq 0$ ,

[TABLE]
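For orientation, the survival probability of a linear birth-death process started from one individual has a classical closed form: with per-capita birth rate $\beta$ and death rate $\delta\neq\beta$ , the probability of being alive at time $t$ is $(\beta-\delta)/(\beta-\delta e^{-(\beta-\delta)t})$ . The sketch below evaluates this textbook formula for $\beta=1$ and $\delta=1-2s$ ; it is not a transcription of the displayed equation, and the function name `survival_prob` and the value $s=0.01$ are illustrative assumptions.

```python
import math

def survival_prob(birth, death, t):
    """P(Y(t) > 0) for a linear birth-death process with Y(0) = 1."""
    g = birth - death
    if g == 0.0:
        # Critical case, obtained as the limit g -> 0.
        return 1.0 / (1.0 + birth * t)
    return g / (birth - death * math.exp(-g * t))

s = 0.01
for t in (0.0, 10.0, 100.0, 1000.0):
    print(t, survival_prob(1.0, 1.0 - 2.0 * s, t))
```

The probability starts at 1, decreases in $t$ , and converges to $2s$ as $t\to\infty$ , so a lineage founded by a single individual is lost with probability close to $1-2s$ once $t$ is large compared with $1/s$ .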

Since $1\ll N\mu$ , we have that for sufficiently large $N$ ,

[TABLE]

By Lemma 11 and (118), on the event $t_{1}<T_{(1)}$ , we have $Y_{i,m}(t)\leq X_{3}(t)\leq\eta N\leq\frac{N}{2}$ for all $t\in[0,t_{1}]$ , which implies that

[TABLE]

It is possible to couple the process $(Y^{\#}(t),t\geq 0)$ with the population process, such that 1) on the event $t_{1}<T_{(1)}$ , for any time $t$ , if $Y^{*}_{i,m}(t)>0$ , then $Y^{\#}(t)>0$ , and 2) the process $(Y^{\#}(t),t\geq 0)$ is independent of $\mathcal{F}_{\tau_{i,m}}$ . It follows that

[TABLE]

Lastly, using (122) and (121), we have

[TABLE]

∎

Lemma 20.

For $i\in\mathbb{N}$ , define $\tau_{i,r}$ to be the time that the $i$th type 3r ancestor appears, where we set $\tau_{i,r}=\infty$ if the $i$th type 3r ancestor never appears. Let $Y_{i,r}(t)$ be the number of descendants of the $i$th type 3r ancestor alive at time $t$ . Then, for sufficiently large $N$ , for all $i\in\mathbb{N}$ ,

[TABLE]

Proof.

The proof is similar to that of Lemma 19. First, define $\tilde{Y}_{i,r}(t)=Y_{i,r}(t)/N$ for all $t\geq 0$ and $i\in\mathbb{N}$ . By following the same reasoning that led us to get the rates in (29) and (30), we have that on the event $\tau_{i,r}\leq t_{0,r}\wedge T_{(1)}$ , the process $(Y_{i,r}(t+\tau_{i,r}),t\geq 0)$ is a birth-death process with $Y_{i,r}(\tau_{i,r})=1$ , where each individual gives birth at rate

[TABLE]

and dies at rate

[TABLE]

Note that when $t\geq 0$ , we have $b(t)\leq 1-\tilde{Y}_{i,r}(t+\tau_{i,r})$ .

For $t\geq 0$ , let $\lambda(t)=\int_{\tau_{i,r}}^{t+\tau_{i,r}}1-\tilde{Y}_{i,r}(v)dv$ . Define $Y_{i,r}^{*}(t)=Y_{i,r}(\lambda^{-1}(t)+\tau_{i,r})$ for $t\in[0,\lambda((t_{1}\wedge T_{(1)})-\tau_{i,r})]$ . The process $(Y^{*}_{i,r}(t),0\leq t<\lambda((t_{1}\wedge T_{(1)})-\tau_{i,r}))$ is a birth-death process with $Y^{*}_{i,r}(0)=1$ , where each individual gives birth at rate

[TABLE]

and dies at rate

[TABLE]

Since the function $\lambda$ is strictly increasing on the interval $[0,(t_{1}\wedge T_{(1)})-\tau_{i,r})$ , we have that if $t\in[0,\lambda((t_{1}\wedge T_{(1)})-\tau_{i,r}))$ , then $\lambda^{-1}(t)+\tau_{i,r}\leq t_{1}\wedge T_{(1)}.$ Hence, from Lemma 11, for every $t\in[0,\lambda((t_{1}\wedge T_{(1)})-\tau_{i,r}))$ and $j=1,2$ and 3, we have $\tilde{X}_{j}(\lambda^{-1}(t)+\tau_{i,r})\leq\eta$ , and $\tilde{Y}_{i,r}(\lambda^{-1}(t)+\tau_{i,r})\leq\tilde{X}_{3}(\lambda^{-1}(t)+\tau_{i,r})\leq\eta$ . Now, because $r\ll s$ , by (124), for sufficiently large $N$ , for $t\in[0,\lambda((t_{1}\wedge T_{(1)})-\tau_{i,r}))$ ,

[TABLE]

Let $(Y^{\#}(t),t\geq 0)$ be a birth-death process where $Y^{\#}(0)=1$ , where each individual gives birth at rate 1 and dies at rate $1-3s$ . By the same argument we used to get (120), for $t\geq 0$ ,

[TABLE]

We claim that for sufficiently large $N$ ,

[TABLE]

From (125) and the definition of $C_{0,r}$ in (64), in the recombination dominating case and the mutation dominating case with $Nr\geq e$ , we have that for sufficiently large $N$ ,

[TABLE]

and in the mutation dominating case with $Nr<e$ , we also have

[TABLE]

On the event $t_{1}<T_{(1)}$ , using (118), we have $Y_{i,r}(t)\leq X_{3}(t)\leq\eta N\leq\frac{N}{3}$ for all $t\in[0,t_{1}]$ . By following the same reasoning in (122),

[TABLE]

It is possible to couple the process $(Y^{\#}(t),t\geq 0)$ with the population process, such that 1) on the event $t_{1}<T_{(1)}$ , for any time $t$ , if $Y^{*}_{i,r}(t)>0$ , then $Y^{\#}(t)>0$ , and 2) the process $(Y^{\#}(t),t\geq 0)$ is independent of $\mathcal{F}_{\tau_{i,r}}$ . By the same reasoning we used to get (123), it follows that for sufficiently large $N$ ,

[TABLE]

∎

Now, we are ready to show that the events $A_{8}$ and $A_{9}$ occur with probability close to 1.

Lemma 21.

For sufficiently large $N$ , we have $P(A_{8}^{c})\leq 4\epsilon$ , and $P(A_{9}^{c})\leq 4\epsilon$ .

Proof.

Recall the definitions of $A_{8}$ and $A_{9}$ in (79) and (80). We will only show that $P(A_{8}^{c})\leq 4\epsilon$ . The same reasoning can be used to prove that $P(A_{9}^{c})\leq 4\epsilon$ .

Let $J=\lfloor e^{-C_{0,m}/2}/s\rfloor$ . By Lemma 19, we have that for sufficiently large $N$ ,

[TABLE]

Hence, by Lemma 16 and Lemma 18 along with the way we choose $\epsilon,K$ and $C_{0,m}$ in (58), (60) and (62), for sufficiently large $N$ ,

[TABLE]

This proves that $P(A_{8}^{c})\leq 4\epsilon$. ∎

Lemma 22.

For sufficiently large $N$ , we have $P(A_{10}^{c})\leq\epsilon$ , and $P(A_{11}^{c})\leq\epsilon$ .

Proof.

Recall the definition of $A_{10}$ in (81). From Lemma 14 and the definition of $t_{1}$ in (69), for sufficiently large $N$ ,

[TABLE]

and from Markov’s inequality, we get that $P(A_{10}^{c})\leq\epsilon$.

Now, recall the definition of $A_{11}$ in (82). We will first consider the recombination dominating case and the mutation dominating case with $Nr\geq e$ . Recall that $Nr\gg 1$ in the recombination dominating case. From Lemma 15 and the definition of $t_{0,r}$ in (66), for sufficiently large $N$ ,

[TABLE]

In the mutation dominating case with $Nr<e$ , from Lemma 15 and the definition of $t_{0,r}$ in (66), for sufficiently large $N$ ,

[TABLE]

Thus, in both cases, for sufficiently large $N$ ,

[TABLE]

and $P(A_{11}^{c})\leq\epsilon$ follows from Markov’s inequality. ∎

Before we prove Proposition 2, we will give both upper and lower bounds of the numbers of type 1 and 2 individuals on the event $A_{(1)}$ .

Lemma 23.

The following statements hold:

1. On the event $A_{(1)}$, for $i=1,2$, for sufficiently large $N$ and for $t\in[0,t_{1}]$,

[TABLE]

2. In the recombination dominating case, on the event $A_{(1)}$, for $i=1,2$, for sufficiently large $N$, and for every $t\in[t_{0,r},t_{1}]$, we have

[TABLE]

3. In the mutation dominating case, on the event $A_{(1)}$, for $i=1,2$, for sufficiently large $N$, and for $t\in[t_{0,m},t_{1}]$, we have

[TABLE]

Proof.

In this proof, we assume that we are on the event $A_{(1)}$ . From (47), we have that for all $t\in(0,t_{1}]$ ,

[TABLE]

From Lemma 11, definitions of $A_{1}$ and $A_{2}$ in (74) and (75), and the fact that $1\ll N\mu$ , for sufficiently large $N$ and for $t\in(0,t_{1}]$ ,

[TABLE]

Next, from (48), we have that for all $t\in(0,t_{1}]$ ,

[TABLE]

From (38), Lemma 11, and definitions of $A_{1}$ and $A_{4}$ in (74) and (76), for sufficiently large $N$ and for $t\in(0,t_{1}]$ ,

[TABLE]

By the definition of $T_{3}$ in (72), inequalities (14), (117) and the fact that $1\ll N\mu$ , it follows that for sufficiently large $N$ and for $t\in(0,t_{1}]$ ,

[TABLE]

Therefore, for sufficiently large $N$ , for all $t\in[0,t_{1}]$ , we have

[TABLE]

Note that by a similar argument, we can also prove the upper bound for $X_{2}(t)$.

To prove the lower bound for $X_{1}(t)$ in the recombination dominating case, we first need to consider the term $\int_{u}^{t}G_{1}(v)dv$ . By using (32), part 1 of this lemma and the definition of $T_{3}$ in (72), we have that when $N$ is sufficiently large, for $0\leq u<t\leq t_{1}$ ,

[TABLE]

Now, using the fact that $\delta<1$ , the definition of $t_{1}$ in (69) along with (117), we have that when $N$ is sufficiently large, for $0\leq u<t\leq t_{1}$ ,

[TABLE]

Also, using part 1 of this lemma, the definition of $T_{3}$ in (72) and the fact that $\delta<1$ , for sufficiently large $N$ , and for $u\in[0,t_{1}]$ ,

[TABLE]

Thus, from (126), (127), (128), along with the definition of $A_{4}$ in (76) for sufficiently large $N$ , for all $t\in[t_{0,r},t_{1}]$ ,

[TABLE]

In the recombination dominating case, we have $N\mu^{2}\ll s$ and $r\ll s$ . So, by using the definition of $t_{0,r}$ in (66), we have that

[TABLE]

Thus, from (129), (117), and the way we choose $C_{1}$ as in (61), for sufficiently large $N$ , and for all $t\in[t_{0,r},t_{1}]$ ,

[TABLE]

The proof for the mutation dominating case is almost exactly the same as that of the recombination dominating case by replacing $t_{0,r}$ by $t_{0,m}$ , and using that because $N\mu^{2}\ll s$ , we have

[TABLE]

which completes the proof. ∎

5.5 The proof of Proposition 2

Proof.

By the definition of $A_{(1)}$ in (83) and Lemmas 16, 17, 21, and 22, for sufficiently large $N$, we have that $P(A_{(1)})\geq 1-17\epsilon$. From now on, we will assume that we are working on the event $A_{(1)}$. Statement 1 follows from Lemma 23 by setting $t=t_{1}$.

Now consider $X_{3}(t_{1})$ . From the definitions of $A_{8},A_{9},A_{10}$ and $A_{11}$ , in (79), (80), (81) and (82), it follows that

[TABLE]

and

[TABLE]

In the recombination dominating case, $Nr\gg 1$ and $r$ satisfies (6). It follows from (130) and (131) that if $N$ is sufficiently large, then

[TABLE]

So, we choose the positive constant

[TABLE]

Next, consider the mutation dominating case. In this case, $r$ satisfies (7), and we also have that $1\ll N\mu$ . It follows from (130) and (131) that if $N$ is sufficiently large, then

[TABLE]

and

[TABLE]

Thus, we choose the positive constant

[TABLE]

Now, we will show the lower bound of $X_{3}(t_{1})$ . First, consider the recombination dominating case. To prove the lower bound, we will need to consider the term $\int_{u}^{t_{1}}G^{(t_{0,r},t_{1}]}_{3r}(v)dv$ . Similar to the way we get (127) by using (36) instead of (34), for $t_{0,r}\leq u\leq t_{1}$ ,

[TABLE]

By (117), when $N$ is sufficiently large, for $t_{0,r}\leq u\leq t_{1}$

[TABLE]

By (31) and Lemma 23, for sufficiently large $N$ , and for $t\in[t_{0,r},t_{1}]$ ,

[TABLE]

Using (48), (132), (133), Lemma 11 and the definition of $A_{7}$ in (78), for sufficiently large $N$,

[TABLE]

It follows from the definitions of $t_{1}$ and $t_{0,r}$ in (69) and (66) that for sufficiently large $N$ ,

[TABLE]

By (117) and the fact that $1\ll Nr$ , we have that for sufficiently large $N$

[TABLE]

and we choose the positive constant

[TABLE]

Lastly, consider the mutation dominating case. By the same argument we used to obtain (132), we have that for sufficiently large $N$ and for $t_{0,m}\leq u\leq t_{1}$ ,

[TABLE]

From (47), Lemma 11, Lemma 23, and the definition of $A_{6}$ in (77), for sufficiently large $N$ ,

[TABLE]

Using the definitions of $t_{1}$ and $t_{0,m}$ in (69) and (67), and the fact that $1\ll N\mu$ , for sufficiently large $N$ ,

[TABLE]

Note that the way we define $C_{0,m}^{+}$ in (63) is precisely to make

[TABLE]

Hence, we choose the positive constant

[TABLE]

This completes the proof. ∎

6 Phase 2 and the proof of Proposition 3

6.1 Comparing the Markov chain with a differential equation

Theorem 24 below is a special case of Theorem 4.1 of [6]. Let $(\mathbf{X}(t),t\geq 0)$ be a continuous time Markov chain with finite state space $S\subset\mathbb{R}^{3}$ . Let $q(\xi,\xi^{\prime})$ be the jump rate from the state $\xi$ to the state $\xi^{\prime}$ . For each state $\xi\in S$ , define the function $\alpha:S\rightarrow\mathbb{R}$ by

[TABLE]

where $|\cdot|$ is the Euclidean norm, and define the function $\beta:S\rightarrow\mathbb{R}^{3}$ by

[TABLE]

It follows that

[TABLE]

for some martingale $(M(t),t\geq 0)$ .

Let $b:[0,1]^{3}\rightarrow\mathbb{R}^{3}$ be a Lipschitz function with Lipschitz constant $K$ . Let $x:[0,\infty)\rightarrow\mathbb{R}^{3}$ be the function that satisfies

[TABLE]

The goal is to compare $\mathbf{X}(t)$ with $x(t)$ .

Fix $T>0$ , $\epsilon_{0}>0$ , $L>0$ , and let $\Delta=\epsilon_{0}e^{-KT}/3$ . Define the events

[TABLE]

Theorem 24.

Under all the assumptions above,

[TABLE]

Now, we will apply this theorem to our process $((X_{0}(t),X_{1}(t),X_{2}(t),X_{3}(t)),t\geq 0)$ . First, for $t\geq 0$ , we define

[TABLE]

and $S=\{(\xi_{1},\xi_{2},\xi_{3})\in\{0,\frac{1}{N},\ldots,\frac{N-1}{N},1\}^{3}:\xi_{1}+\xi_{2}+\xi_{3}\leq 1\}$. We think of $\xi_{1},\xi_{2}$ and $\xi_{3}$ as the fractions of type 1, 2 and 3 individuals in the population. To make the following formulas easier to read, we define $\xi_{0}=1-\xi_{1}-\xi_{2}-\xi_{3}$, which represents the fraction of type 0 individuals in the population. Now, for each $\xi=(\xi_{1},\xi_{2},\xi_{3})\in S$, we define

[TABLE]

Note that for each $i=0,1,2,3$ , the quantity $f_{i}(\xi)$ represents the probability that a new individual born is of type $i$ . Next, for $\xi=(\xi_{1},\xi_{2},\xi_{3})\in S$ and $\xi^{\prime}=(\xi^{\prime}_{1},\xi^{\prime}_{2},\xi^{\prime}_{3})\in S$ , the transition rate $q(\xi,\xi^{\prime})$ is given by

[TABLE]

The reasoning behind the formulas for these rates is similar to the reasoning we used to obtain the birth and death rates in Section 3.2. Let us consider the first rate. It is the rate at which the number of type 0 individuals decreases by 1 and the number of type 1 individuals increases by 1. There are two ways for this to occur: 1) a type 0 individual mutates to type 1, which occurs at total rate $\mu N\xi_{0}$, and 2) a type 0 individual dies and is replaced by a type 1 individual. The total rate at which type 0 individuals die is $N\xi_{0}$, and the probability that the replacement is of type 1 is $f_{1}(\xi)$.
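The first rate can be sketched numerically. The display defining $f_{i}(\xi)$ is not reproduced in this text, so the sketch below assumes a specific recombination rule: with probability $1-r$ the newborn copies one uniformly chosen parent, and with probability $r$ its two alleles come from two parents chosen independently and uniformly. It also assumes that type 1 carries the beneficial allele at the first locus only and type 2 at the second locus only. The function names are hypothetical.

```python
def offspring_type_probs(xi0, xi1, xi2, xi3, r):
    """Return (f0, f1, f2, f3), the type distribution of a newborn.

    ASSUMPTION: under recombination the allele at each locus is copied from
    an independently and uniformly chosen parent.
    """
    a = xi1 + xi3  # fraction carrying the beneficial allele at locus 1 (assumed labelling)
    b = xi2 + xi3  # fraction carrying the beneficial allele at locus 2 (assumed labelling)
    recombinant = ((1 - a) * (1 - b), a * (1 - b), (1 - a) * b, a * b)
    clonal = (xi0, xi1, xi2, xi3)
    return tuple((1 - r) * c + r * p for c, p in zip(clonal, recombinant))


def rate_type0_to_type1(N, mu, r, xi1, xi2, xi3):
    """Rate at which a type 0 individual is replaced by a type 1 individual.

    Sum of the mutation route (mu * N * xi0) and the death-and-replacement
    route (N * xi0 * f1), as described in the paragraph above.
    """
    xi0 = 1 - xi1 - xi2 - xi3
    f1 = offspring_type_probs(xi0, xi1, xi2, xi3, r)[1]
    return mu * N * xi0 + N * xi0 * f1
```

For example, with $N=100$, $\mu=0.01$, $r=0$ and $(\xi_{1},\xi_{2},\xi_{3})=(0.2,0.1,0.1)$, the mutation route contributes $0.6$ and the replacement route contributes $12$.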

We define the functions $\alpha$ and $\beta$ as in (134) and (135). For $\xi,\xi^{\prime}\in S$ such that $q(\xi,\xi^{\prime})\neq 0$ , we have $|\xi-\xi^{\prime}|^{2}\leq 2/N^{2}$ , since it is equal to $1/N^{2}$ or $2/N^{2}$ . Because for each $i=0,1,2,3$ and $\xi\in S$ , we have $0\leq f_{i}(\xi)\leq 1$ , and because $\mu\ll s\ll 1$ , it follows that for sufficiently large $N$ , for all $\xi,\xi^{\prime}\in S$ , we have $q(\xi,\xi^{\prime})\leq 2N$ . By the definition of $\alpha$ in (134), for sufficiently large $N$ ,

[TABLE]
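The two bounds just stated combine into a simple count. The sketch below assumes that (134) has the usual form $\alpha(\xi)=\sum_{\xi^{\prime}\neq\xi}|\xi^{\prime}-\xi|^{2}q(\xi,\xi^{\prime})$ and that every jump replaces one individual of type $i$ by one of type $j\neq i$, which gives 12 possible jumps from each state. The function name is hypothetical.

```python
from itertools import permutations


def alpha_upper_bound(N):
    """Crude upper bound on alpha(xi), valid once q(xi, xi') <= 2N holds."""
    n_jump_types = len(list(permutations(range(4), 2)))  # ordered pairs (i, j), i != j
    max_squared_jump = 2 / N**2  # |xi - xi'|^2 is 1/N^2 or 2/N^2
    max_rate = 2 * N             # q(xi, xi') <= 2N for large N
    return n_jump_types * max_squared_jump * max_rate
```

The result is $12\cdot(2/N^{2})\cdot 2N=48/N$, which is consistent with the choice $L=48/N$ made later when Theorem 24 is applied.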

For each $\xi\in S$ , we define

[TABLE]

A tedious calculation gives

[TABLE]

Note that for $i=1,2,3$ , the $i^{th}$ row of $N\beta(\xi)$ is exactly the rate at which the number of type $i$ individuals increases by 1 minus the rate at which the number of type $i$ individuals decreases by 1.
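The inflow-minus-outflow description of $N\beta(\xi)$ can be checked on a small example. The sketch below rebuilds the twelve jump rates under labelled assumptions: death rates $1$, $1-s$, $1-s$, $1-2s$ for types 0 to 3 and mutation at rate $\mu$ per allele, as in the model description, plus an assumed recombination rule in which each allele of a recombinant newborn is copied from an independently and uniformly chosen parent. All function names are hypothetical, and the rates are a reconstruction, not the displays (136) to (139).

```python
MUTATIONS = {(0, 1), (0, 2), (1, 3), (2, 3)}  # single-allele mutations, rate mu each


def offspring_probs(xi, r):
    """Type distribution of a newborn; xi = (xi0, xi1, xi2, xi3). ASSUMED rule."""
    a = xi[1] + xi[3]  # beneficial allele at locus 1 (assumed labelling)
    b = xi[2] + xi[3]  # beneficial allele at locus 2 (assumed labelling)
    rec = ((1 - a) * (1 - b), a * (1 - b), (1 - a) * b, a * b)
    return tuple((1 - r) * x + r * p for x, p in zip(xi, rec))


def jump_rates(xi, N, s, mu, r):
    """Rate of each jump 'one type i individual is replaced by type j'."""
    f = offspring_probs(xi, r)
    death = (1.0, 1 - s, 1 - s, 1 - 2 * s)
    rates = {}
    for i in range(4):
        for j in range(4):
            if i != j:
                mutation = mu if (i, j) in MUTATIONS else 0.0
                rates[(i, j)] = N * xi[i] * (death[i] * f[j] + mutation)
    return rates


def drift(xi, N, s, mu, r):
    """beta(xi): for types 1, 2, 3, (inflow rate - outflow rate) / N."""
    q = jump_rates(xi, N, s, mu, r)
    return tuple(
        sum(q[(i, k)] - q[(k, i)] for i in range(4) if i != k) / N
        for k in (1, 2, 3)
    )
```

Two sanity checks follow from this construction: with $s=\mu=r=0$ the drift vanishes, and with only types 0 and 3 present and $\mu=r=0$ the type 3 drift is $2s\,\xi_{0}\xi_{3}$.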

Here, we define the functions $b:[0,1]^{3}\rightarrow\mathbb{R}^{3}$ and $\tilde{b}:[0,1]^{3}\rightarrow\mathbb{R}^{3}$ by

[TABLE]

and

[TABLE]

Since all first partial derivatives of $\tilde{b}$ are bounded, the function $\tilde{b}$ is Lipschitz. Hence, $b$ is also Lipschitz with Lipschitz constant $ks$ , where $k>0$ and $k$ does not depend on $N$ .

Now, we define a random variable $B$ such that on the event that $\tilde{X}_{1}(t_{1})+\tilde{X}_{2}(t_{1})>0$ , we have

[TABLE]

The value of $B$ on the event that $\tilde{X}_{1}(t_{1})+\tilde{X}_{2}(t_{1})=0$ is not of interest, as we will work only on the event $A_{(1)}$ when $N$ is sufficiently large. By Proposition 2, we know that $\tilde{X}_{1}(t_{1})+\tilde{X}_{2}(t_{1})>0$ on the event $A_{(1)}$ . Next, for $t\geq t_{1}$ , we define

[TABLE]

and for $t\geq t_{1}$ , we let

[TABLE]

Note that for $i=1,2$ , we have $x_{i}(t_{1})=\tilde{X}_{i}(t_{1})$ , and for all $t\geq t_{1}$ , we have $x_{1}(t)+x_{2}(t)=f(t)$ . From (143), for $t\geq t_{1}$ ,

[TABLE]

and it follows that

[TABLE]

From (140), (144), (143), and the fact that $x_{1}(t)+x_{2}(t)=f(t)$ for all $t\geq t_{1}$ , we have that for $t\geq t_{1}$ ,

[TABLE]

Therefore, for $t\geq t_{1}$ , we have $\frac{d}{dt}x(t)=b(x(t))$ , and

[TABLE]

We pick the constant

[TABLE]

and we define

[TABLE]

We will use Theorem 24 to show that with probability close to 1, $X_{1}(t)$ and $X_{2}(t)$ are close to $x_{1}(t)N$ and $x_{2}(t)N$, respectively, for $t\in[t_{1},t_{2}]$. We define the event

[TABLE]

Lemma 25.

For sufficiently large $N$, we have $P(A_{12}^{c}\mid\mathcal{F}_{t_{1}})\leq\epsilon$ on the event $A_{(1)}$.

Proof.

Let $\Delta=\delta^{4}e^{-k(C_{2}+C_{1})}/12$ . We will first prove that for sufficiently large $N$ , on the event $A_{(1)}$ ,

[TABLE]

By (139) and (140), we have

[TABLE]

Because $\tilde{X}_{i}(t)\in[0,1]$ for all $i=1,2,3$ and $t\geq 0$, we have

[TABLE]

for some positive constants $D$ and $D^{\prime}$ . Thus,

[TABLE]

In the recombination dominating case, since $r\ll s$ , $\mu\ll s$ , $1\ll N\mu$ , and $r\ln_{+}(Nr)\ll s$ , if $N$ is sufficiently large, then

[TABLE]

and

[TABLE]

In the mutation dominating case, since $r\ll s$, $\mu\ll s$, and $N\mu^{2}\ll s$, if $N$ is sufficiently large, then (150) holds and

[TABLE]

In this proof, we assume that $N$ is large enough so that in the recombination dominating case, (149), (150) and (151) hold, and in the mutation dominating case, (149), (150) and (152) hold.

Now, let us consider the process $(\mathbf{X}(t),t\geq 0)$. By the Markov property of the process, if we condition on $\mathcal{F}_{t_{1}}$, the process after time $t_{1}$ behaves as if we start the whole process again with $\mathbf{X}(t_{1})$ as the initial condition. Now, let us fix the value of $\mathbf{X}(t_{1})=(\xi_{1},\xi_{2},\xi_{3})$, and consider the process starting at time $t_{1}$ with this initial condition. Note that once the process is started from this fixed point, the functions $f$ and $x$ defined in (143) and (144) are no longer random, which allows us to use Theorem 24.

We define $T=t_{2}-t_{1}$ , and note that $\Delta=\delta^{4}e^{-k(C_{2}+C_{1})}/12=(\delta^{4}/4)\cdot e^{-(ks)T}/3$ , which is in the form required in order to use Theorem 24. We let $L=48/N$ and define the events

[TABLE]
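The form of $\Delta$ can be matched against Theorem 24 term by term. The worked line below assumes the identification $\epsilon_{0}=\delta^{4}/4$ and $K=ks$ (the Lipschitz constant of $b$), and it reads off $T=t_{2}-t_{1}=(C_{1}+C_{2})/s$ from the identity stated in the text, since the definitions (69) and (146) are not reproduced here.

```latex
\Delta=\frac{\epsilon_{0}e^{-KT}}{3}
      =\frac{\delta^{4}}{4}\cdot\frac{e^{-(ks)(C_{1}+C_{2})/s}}{3}
      =\frac{\delta^{4}e^{-k(C_{1}+C_{2})}}{12}.
```

In particular, $\Delta$ does not depend on $N$, because the factor $s$ in the Lipschitz constant cancels against the length of the time window.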

First, we consider $\Omega_{0}$ . In the recombination dominating case, if $\mathbf{X}(t_{1})=(\xi_{1},\xi_{2},\xi_{3})$ satisfies (8) and (9), then by (151), we have

[TABLE]

Similarly, in the mutation dominating case, if $\mathbf{X}(t_{1})=(\xi_{1},\xi_{2},\xi_{3})$ satisfies (8) and (10), then by (152), we have

[TABLE]

Next, because of (148) and (150), we have that $\Omega_{1}^{c}=\emptyset$ . Lastly, by (138), it follows that

[TABLE]

So, $\Omega_{2}^{c}=\emptyset$ .

Therefore, if $\mathbf{X}(t_{1})=(\xi_{1},\xi_{2},\xi_{3})$ satisfies (8) and (9) in the recombination dominating case, or satisfies (8) and (10) in the mutation dominating case, by Theorem 24 and (149), we have that

[TABLE]

Note that the upper bound does not depend on the value of $(\xi_{1},\xi_{2},\xi_{3})$ . By Proposition 2, on the event $A_{(1)}$ , we know that $\mathbf{X}(t_{1})=(\xi_{1},\xi_{2},\xi_{3})$ satisfies (8) and (9) in the recombination dominating case, and satisfies (8) and (10) in the mutation dominating case for sufficiently large $N$ . Using the Markov property of the process, we have that on the event $A_{(1)}$ ,

[TABLE]

Thus, from the definition of the event $A_{12}$ in (147), on the event $A_{(1)}$ ,

[TABLE]

which completes the proof. ∎

6.2 Results on type 3 individuals

We will now show that for sufficiently large $N$, with probability close to 1, $X_{3}(t_{2})$ has the same order as $(Nr\ln(Nr))/s$ in the recombination dominating case, and has the same order as $(N^{2}\mu^{2})/s$ in the mutation dominating case. The proof has two main parts. In the first part, we will show that $X_{3}^{[t_{1}]}(t_{2})$, which was defined to be the number of type 3 individuals at time $t_{2}$ that descend from the type 3 individuals at time $t_{1}$, has order $(Nr\ln(Nr))/s$ in the recombination dominating case, and $(N^{2}\mu^{2})/s$ in the mutation dominating case. In the second part, we show that $X_{3m}^{(t_{1},t_{2}]}(t_{2})$ and $X_{3r}^{(t_{1},t_{2}]}(t_{2})$ are much smaller than $X_{3}^{[t_{1}]}(t_{2})$.

Lemma 26.

For sufficiently large $N$ , for all $t\geq t_{1}$

[TABLE]

Proof.

From (55) and Proposition 9, we have that for $t\geq t_{1}$ ,

[TABLE]

Because of Lemma 11, for sufficiently large $N$ ,

[TABLE]

Thus, for sufficiently large $N$ ,

[TABLE]

which proves this lemma. ∎

Lemma 27.

The following statements hold:

1. In the recombination dominating case, there is a positive constant $K_{0r}$ such that for sufficiently large $N$, on the event $A_{(1)}$, we have

[TABLE]

2. In the mutation dominating case, there is a positive constant $K_{0m}$ such that for sufficiently large $N$, on the event $A_{(1)}$, we have

[TABLE]

Proof.

First, consider the recombination dominating case. From (41) and (42), for all $t\geq 0$ , we have that $B_{3}^{[t_{1}]}(t)\leq 1$ and $D_{3}^{[t_{1}]}(t)\leq 1$ . Also, from (33) and the fact that $s\ll 1$ , for sufficiently large $N$ , for all $t\geq 0$ ,

[TABLE]

By Proposition 9, (153), and Lemma 26, for sufficiently large $N$ ,

[TABLE]

By Proposition 2 and the definitions of $t_{1}$ and $t_{2}$ in (69) and (146), for sufficiently large $N$ , on the event $A_{(1)}$ ,

[TABLE]

We define

[TABLE]

Since the process $\big(Z_{3}^{[t_{1}]}(t),t\geq 0\big)$ is a martingale, we have that $E[Z_{3}^{[t_{1}]}(t_{2})\mid\mathcal{F}_{t_{1}}]=Z_{3}^{[t_{1}]}(t_{1})=X_{3}(t_{1})$. Hence, by Chebyshev’s inequality, we have that for sufficiently large $N$, on the event $A_{(1)}$,

[TABLE]

For the mutation dominating case, the proof is almost exactly the same. The only difference is the inequality (154), for which Proposition 2 gives that

[TABLE]

In this case, we pick

[TABLE]

This completes the proof. ∎

Lemma 28.

There exist positive constants $K^{\prime}_{1}$ and $K^{\prime}_{2}$ such that for sufficiently large $N$ , we have

1. $\displaystyle P\bigg(X_{3m}^{(t_{1},t_{2}]}(t_{2})\geq\frac{K^{\prime}_{1}}{\epsilon}\cdot\frac{N\mu}{s}\,\Big|\,\mathcal{F}_{t_{1}}\bigg)\leq\epsilon.$

2. $\displaystyle P\bigg(X_{3r}^{(t_{1},t_{2}]}(t_{2})\geq\frac{K^{\prime}_{2}}{\epsilon}\cdot\frac{Nr}{s}\,\Big|\,\mathcal{F}_{t_{1}}\bigg)\leq\epsilon.$

Proof.

We will first prove part 1. Let $U(t)$ and $V(t)$ be the numbers of times that the number of type $3m(t_{1},t_{2}]$ individuals increases and decreases respectively during the time interval $[t_{1},t]$ . Then, for $t\geq t_{1}$ , we define

[TABLE]

Then, both processes $(W_{+}(t),t\geq t_{1})$ and $(W_{-}(t),t\geq t_{1})$ are mean-zero martingales, and so is the process $(W_{m}(t),t\geq t_{1})$ . We also have that

[TABLE]

Thus, from Lemma 11, for sufficiently large $N$ , if $t\in[t_{1},t_{2}]$ , then

[TABLE]

Here, we define

[TABLE]

From Gronwall’s inequality, we have

[TABLE]

and by Markov’s inequality, we have that

[TABLE]

Now, we will prove part 2. The proof is similar to the proof of part 1. First, we have that there is a mean-zero martingale $(W_{r}(t),t\geq t_{1})$ such that

[TABLE]

for all $t\geq t_{1}$ . From (31), Lemma 11 and $r\ll s$ , for sufficiently large $N$ , and for $t\in[t_{1},t_{2}]$ ,

[TABLE]

We define

[TABLE]

From Gronwall’s inequality, we have

[TABLE]

and the result follows from Markov’s inequality. ∎
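Both parts of Lemma 28 end with the same step: a bound on a conditional mean is turned into a tail bound by Markov’s inequality, which is where the factor $1/\epsilon$ in the thresholds comes from. The sketch below illustrates that step on an exponential random variable, chosen only because its tail is explicit; the function names and the numerical values are illustrative assumptions, not quantities from the paper.

```python
import math


def markov_threshold(mean_bound, eps):
    """Level x with P(X >= x) <= eps for any nonnegative X with E[X] <= mean_bound."""
    return mean_bound / eps


def exponential_tail(mean, x):
    """Exact tail P(X >= x) of an exponential random variable with the given mean."""
    return math.exp(-x / mean)


eps = 0.05
mean_bound = 200.0  # stands in for a bound of the form K' * N * mu / s
level = markov_threshold(mean_bound, eps)
tail = exponential_tail(mean_bound, level)
```

Here the exact tail at the Markov level is $e^{-1/\epsilon}$, far below $\epsilon$, which shows that the inequality is typically not sharp but is enough for the lemma.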

Recall the constants $K_{0r},K_{0m},K_{1}^{\prime}$ and $K_{2}^{\prime}$ defined in (155), (156), (157) and (158). Now, we define the following events in both cases:

[TABLE]

In the recombination dominating case, we define

[TABLE]

while in the mutation dominating case, we define

[TABLE]

Lastly, in both cases, we define

[TABLE]

Lemma 29.

On the event $A_{(1)}$ , for sufficiently large $N$ , and for $i=1,2$ , we have

[TABLE]

and

[TABLE]

Proof.

First note that if $c>0$ , then the function $g(x)=x/(x+c)$ is increasing on the interval $(0,\infty)$ . Then, from Proposition 2, on the event $A_{(1)}$ , for sufficiently large $N$ ,

[TABLE]

and

[TABLE]

By the same argument, we get the same bounds for $\tilde{X}_{2}(t_{1})/(\tilde{X}_{1}(t_{1})+\tilde{X}_{2}(t_{1}))$ .
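The monotonicity used at the start of this proof is a one-line computation, written out here for completeness. The symbols $a$ and $b$ below are generic placeholders for a lower bound on $x$ and an upper bound on $c$; they are not constants from the paper.

```latex
g'(x)=\frac{(x+c)-x}{(x+c)^{2}}=\frac{c}{(x+c)^{2}}>0\quad\text{for }x>0,
\qquad\text{so}\qquad
x\geq a>0,\ 0<c\leq b
\;\Longrightarrow\;
\frac{x}{x+c}\geq\frac{a}{a+c}\geq\frac{a}{a+b}.
```

The same computation with the roles of the bounds exchanged gives the matching upper bound $x/(x+c)\leq a^{\prime}/(a^{\prime}+b^{\prime})$ when $x\leq a^{\prime}$ and $c\geq b^{\prime}>0$.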

Now, recall the definitions of $B$ and $f$ in (142) and (143). By Proposition 2, (164) and the definitions of $t_{1},t_{2}$ and $C_{2}$ in (69), (145) and (146), for sufficiently large $N$ , on the event $A_{(1)}$ ,

[TABLE]

and

[TABLE]

From the way we define $\delta$ and $C_{1}$ in (59) and (61), we have $e^{C_{1}}>8/\delta^{2}>32>15/2$ , and

[TABLE]

Thus, we have

[TABLE]

This completes the proof of this lemma. ∎

6.3 The proof of Proposition 3

Proof.

Recall the definition of $A_{(2)}$ in (163). From Lemma 25, Lemma 27 and Lemma 28, for sufficiently large $N$, on the event $A_{(1)}$,

[TABLE]

Thus, from Proposition 2, we have

[TABLE]

From now on, we will work on the event $A_{(2)}$ . By the definition of the event $A_{12}$ in (147), the definition of the function $x$ in (144), and Lemma 29, for sufficiently large $N$ , on the event $A_{(2)}$ ,

[TABLE]

and

[TABLE]

Both the upper and lower bounds for $X_{2}(t_{2})$ follow from the same argument.

Now, we prove statement 2. Assume that we are in the recombination dominating case. By the definition of $Z_{3}^{[t_{1}]}(t)$ in (55), the definition of $A_{15}$ in (161), the inequality (153) and Proposition 2, on the event $A_{(2)}$ ,

[TABLE]

We define

[TABLE]

Because $1\ll Nr$ and $r\ll s$ , for sufficiently large $N$ ,

[TABLE]

By the definitions of $A_{13},A_{14}$ and $A_{15}$ in (159), (160) and (161), and by Proposition 2, we have that for sufficiently large $N$ , on the event $A_{(2)}$ ,

[TABLE]

We define the constant

[TABLE]

Because $1\ll N\mu$ and $\mu\ll r$ , for sufficiently large $N$ ,

[TABLE]

Lastly, consider the mutation dominating case, where we will prove statement 3. The proof is similar to the proof of statement 2. First, by using (162) instead of (161), for sufficiently large $N$, on the event $A_{(2)}$,

[TABLE]

We define

[TABLE]

Since $1\ll N\mu$ , for sufficiently large $N$ , on the event $A_{(2)}$ ,

[TABLE]

By the definitions of $A_{13},A_{14}$ and $A_{15}$ in (159), (160) and (162), and by Proposition 2, we have that for sufficiently large $N$ , on the event $A_{(2)}$ ,

[TABLE]

We define the constant

[TABLE]

Because $1\ll N\mu$ and $r\ll N\mu^{2}$ , for sufficiently large $N$ ,

[TABLE]

This completes the proof. ∎

7 Phase 3 and the proof of Proposition 4

In this phase, we will use martingales and submartingales to approximate the number of type 0 and type 3 individuals. The ideas of the proof are similar to those used in phase 1. At the end of this section, we will give a proof for Proposition 4.

We define the constant

[TABLE]

where the constants $K_{2r}^{+}$ and $K_{2m}^{+}$ are defined in (165) and (166). We define the time

[TABLE]

Next, we define the following stopping times:

[TABLE]

In both cases we define

[TABLE]

In the recombination dominating case, we define the following events:

[TABLE]

In contrast, in the mutation dominating case, we define

[TABLE]

Lastly, in both cases, we define

[TABLE]

We will first give bounds on the growth rates of type 0 and type 3 populations.

Lemma 30.

The following statements are true.

1. If $t\in[t_{2},T_{4})$, then $G_{0}(t)\leq-s(1-3\delta)$.

2. If $t\in[t_{2},T_{4})$, then $-s(1+\tilde{X}_{3}(t))-r-2\mu\leq G_{0r}^{(t_{2},t_{3}]}(t)\leq-s(1-3\delta)+r$.

3. If $t\in[t_{2},T_{6})$, then $s(1-\tilde{X}_{3}(t))-r\leq G_{3}(t)\leq s\big(1+\delta e^{-s(1-3\delta)(t-t_{2})}\big)$.

4. If $t\in[t_{2},T_{6})$, then $s(1-\tilde{X}_{3}(t))-r\leq G^{(t_{2},t_{3}]}_{3r}(t)\leq s\big(1+\delta e^{-s(1-3\delta)(t-t_{2})}\big)+r$.

Proof.

By the definition of $T_{4}$ in (169), if $t\in[t_{2},T_{4})$ , then $\tilde{X}_{1}(t)+\tilde{X}_{2}(t)>1-3\delta$ , and from (43), we have that $G_{0}(t)\leq-s(\tilde{X}_{1}(t)+\tilde{X}_{2}(t))<-s(1-3\delta)$ . From (44), if $t\in[t_{2},T_{4})$ , then $G_{0r}^{(t_{2},t_{3}]}(t)\leq-s(1-3\delta)+r$ , and by using the fact that $\tilde{X}_{1}(u)+\tilde{X}_{2}(u)+\tilde{X}_{3}(u)\leq 1$ for all $u\geq 0$ , we also have that $G_{0r}^{(t_{2},t_{3}]}(t)\geq-s(1+\tilde{X}_{3}(t))-r-2\mu$ .

Now, from the definition of $T_{6}$ in (171), if $t\in[t_{2},T_{6})$ , then the equation (33) implies that $G_{3}(t)\leq s(1+\tilde{X}_{0}(t))<s\big{(}1+\delta e^{-s(1-3\delta)(t-t_{2})}\big{)}$ , and $G_{3}(t)\geq s(1-\tilde{X}_{3}(t))-r$ . Part 4 follows directly from part 3 and (36). ∎

7.1 Results on type 0 individuals

Lemma 31.

For sufficiently large $N$ , on the event $A_{(2)}$ , we have $P(A_{17}^{c}|\mathcal{F}_{t_{2}})\leq 6\delta$ .

Proof.

First, from part 2 of Proposition 3, on the event $A_{(2)}$ , we have that $X_{0}(t_{2})\leq N-X_{1}(t_{2})-X_{2}(t_{2})\leq 3\delta^{2}N$ . From Proposition 9, the process $(Z_{0}^{[t_{2}]}(t\wedge T_{(3)}),t\geq t_{2})$ is a martingale. Hence, by Lemma 30 and Doob’s maximal inequality, for sufficiently large $N$ , on the event $A_{(2)}$ ,

[TABLE]

which proves the lemma. ∎

Lemma 32.

For sufficiently large $N$ , we have $P(A_{19}^{c}|\mathcal{F}_{t_{2}})\leq\epsilon$ .

Proof.

We will first prove this result in the recombination dominating case. By Proposition 10, the process $\big{(}W_{0r}^{(t_{2},t_{3}]}(t\wedge T_{(3)}),t\geq t_{2}\big{)}$ is a submartingale. Also, note that from the definitions of $t_{2}$ and $t_{3}$ in (146) and (168), we have that

[TABLE]

From Proposition 10, Lemma 30 part 2, (40), and the definition of $T_{5}$ in (170), we have

[TABLE]

Because $1\ll Nr$ and $r\ll s$ , for sufficiently large $N$ ,

[TABLE]

and it follows that

[TABLE]

Also, since $\mu\ll r$ , we have

[TABLE]

Hence, from (184), for sufficiently large $N$ , we have

[TABLE]

Thus, from (185), Lemma 30 part 2 and Doob’s maximal inequality, for sufficiently large $N$ ,

[TABLE]

Now, for the mutation dominating case, we observe that from the definitions of $t_{2}$ and $t_{3}$ in (146) and (168), we have

[TABLE]

From the fact that $1\ll N\mu$ and $\mu\ll s$ , we have

[TABLE]

Also, from $r\ll s$ and (14), we get

[TABLE]

which shows that (185) and (186) also hold in this case. By following the same argument as in the recombination dominating case, we obtain that for sufficiently large $N$,

[TABLE]

and $P(A_{19}^{c}|\mathcal{F}_{t_{2}})\leq\epsilon$ . ∎

7.2 Results on type 3 individuals

Lemma 33.

For sufficiently large $N$ , we have that for $t\in[t_{2},t_{3}]$ ,

[TABLE]

and $P(A_{20}|\mathcal{F}_{t_{2}})\geq 1-\epsilon$ .

Proof.

The proof is similar to that of Lemma 14. First, recall that the process $\big{(}Z_{3m}^{(t_{2},t_{3}]}(t\wedge T_{(3)}),t\geq t_{2}\big{)}$ is a mean-zero martingale by Proposition 7. By (45), for all $t\geq t_{2}$ , we have

[TABLE]

From (28), Lemma 30 part 3, and the definition of $T_{5}$ in (170), we have that for every $t\in[t_{2},t_{3}]$ ,

[TABLE]

From (185), for sufficiently large $N$ and for all $t\in[t_{2},t_{3}]$ ,

[TABLE]

Also, by Lemma 30 part 3, we have that for all $t\geq t_{2}$ ,

[TABLE]

Therefore, using that $\delta<\frac{1}{4}$ , for sufficiently large $N$ , we have that if $t\in[t_{2},t_{3}]$ , then

[TABLE]

It follows from this inequality, along with (183), (187) and the definition of $C_{3}$ in (167) that in the recombination dominating case, for sufficiently large $N$ ,

[TABLE]

while in the mutation dominating case, for sufficiently large $N$,

[TABLE]

Thus, by Markov’s inequality, in both cases, we have that $P(A_{20}^{c}|\mathcal{F}_{t_{2}})\leq\epsilon$ . ∎

Lemma 34.

For sufficiently large $N$ , we have that for $t\in[t_{2},t_{3}]$ ,

[TABLE]

and $P(A_{21}|\mathcal{F}_{t_{2}})\geq 1-\epsilon$ .

Proof.

The proof is almost exactly the same as that of Lemma 33. Recall from Proposition 7 that the process $\big{(}Z_{3r}^{(t_{2},t_{3,r}]}(t\wedge T_{(3)}),t\geq t_{2}\big{)}$ is a mean-zero martingale. By (46), for all $t\geq t_{2}$ , we have

[TABLE]

From (39), we have that $R_{3}^{(t_{2},t_{3}]}(u)\leq Nr$ for all $u\geq t_{2}$. By the same reasoning as in (188), for every $t\in[t_{2},t_{3,r}]$,

[TABLE]

Also, by Lemma 30 part 4, we have that for all $t\in[t_{2},t_{3}]$ ,

[TABLE]

Therefore, using that $\delta<\frac{1}{4}$ , from (185), we have that if $t\in[t_{2},t_{3}]$ , then

[TABLE]

It follows from this inequality, along with (183), (187) and the definition of $C_{3}$ in (167) that in the recombination dominating case, for sufficiently large $N$ ,

[TABLE]

while in the mutation dominating case, for sufficiently large $N$ ,

[TABLE]

Thus, by Markov’s inequality, in both cases, we have that $P(A_{21}^{c}|\mathcal{F}_{t_{2}})\leq\epsilon$ . ∎

Next, we will bound the probabilities of the events $A_{16},A_{18}$ and $A_{22}$ , but we will need an upper bound for the term $E\big{[}X_{3}^{[t_{2}]}(t\wedge T_{(3)})|\mathcal{F}_{t_{2}}\big{]}$ first.

Lemma 35.

For sufficiently large $N$ , for all $t\geq t_{2}$ , on the event $A_{(2)}$ , we have

[TABLE]

Proof.

From Proposition 9, we know that $\big{(}Z_{3}^{[t_{2}]}(t\wedge T_{(3)}),t\geq t_{2}\big{)}$ is a martingale. So, from (55), Lemma 30 part 3, and the fact that $\delta<\frac{1}{4}$ , for all $t\geq t_{2}$ ,

[TABLE]

Therefore, for all $t\geq t_{2}$ ,

[TABLE]

and from the upper bound of $X_{3}(t_{2})$ on the event $A_{(2)}$ in Proposition 3, the result follows. ∎

Lemma 36.

For sufficiently large $N$ , on the event $A_{(2)}$ , we have $P(A_{18}|\mathcal{F}_{t_{2}})\geq 1-\delta^{2}$ .

Proof.

In the recombination dominating case, from Lemmas 33, 34 and 35, we have

[TABLE]

Because $\mu\ll r$ and $1\ll Nr$ , along with the definition of $C_{3}$ in (167), for sufficiently large $N$ , on the event $A_{(2)}$ , we have

[TABLE]

Thus, by Markov’s inequality, we have $P(A_{18}^{c})\leq\delta^{2}$ .

For the mutation dominating case, we can follow the same argument. Note that in this case, instead of getting (189), Lemma 35 gives that

[TABLE]

Because $1\ll N\mu$ and $r\ll N\mu^{2}$ , for sufficiently large $N$ , on the event $A_{(2)}$ , we have

[TABLE]

and by following the previous argument, we prove the result. ∎

Lemma 37.

For sufficiently large $N$ , on the event $A_{(2)}$ , we have $P(A_{22}|\mathcal{F}_{t_{2}})\geq 1-\epsilon$ .

Proof.

We first consider the recombination dominating case. From Proposition 9, part 3 of Lemma 30, Lemma 35, and (185), for sufficiently large $N$ , on the event $A_{(2)}$ , we have that

[TABLE]

It follows from this inequality and the $L^{2}$ maximal inequality that

[TABLE]

For the mutation dominating case, the argument is exactly the same except at (190), the upper bound from Lemma 35 gives

[TABLE]

and the result follows by applying the $L^{2}$ maximal inequality. ∎

Lemma 38.

For sufficiently large $N$ , on the event $A_{(2)}$ , we have $P(A_{16}|\mathcal{F}_{t_{2}})\geq 1-\delta$ .

Proof.

First, by the definition of $T_{6}$ in (171), we have that

[TABLE]

It follows from this inequality, Markov’s inequality, Lemma 33, and Lemma 34 that for sufficiently large $N$ , on the event $A_{(2)}$ , we have

[TABLE]

At this point, the calculation splits between the two cases. In the recombination dominating case, by (191), (183), Lemma 35, and the definition of $C_{3}$ in (167), we have

[TABLE]

Because $1\ll Nr$ and $\mu\ll N\mu^{2}\ll r\ln(Nr)$ , when $N$ is sufficiently large, on the event $A_{(2)}$ , we have that $P(A_{16}^{c}|\mathcal{F}_{t_{2}})=P(T_{4}=t_{3}\wedge T_{(3)}|\mathcal{F}_{t_{2}})\leq\delta$ .

The proof for the mutation dominating case is almost the same, except at (191), where Lemma 35 gives

[TABLE]

The result follows from the facts that $1\ll N\mu$ and $r\ll N\mu^{2}$ . ∎

We have just finished showing that each of the events $A_{16}$ to $A_{22}$, conditioned on $\mathcal{F}_{t_{2}}$, occurs with probability close to 1 on the event $A_{(2)}$. Before we prove Proposition 4, we are going to show that on the event $A_{(3)}$, we have $T_{(3)}>t_{3}$.

Lemma 39.

For sufficiently large $N$ , on the event $A_{(3)}$ , we have that $T_{(3)}>t_{3}$ .

Proof.

In this proof, we are working on the event $A_{(3)}$. By the definition of the event $A_{16}$ in (173), we know that $T_{4}>t_{3}\wedge T_{(3)}$, and from the way we define $T_{5}$ and $A_{18}$ in (170) and (175), we have that $T_{5}>t_{3}\wedge T_{(3)}$. So, by the definition of $T_{(3)}$ in (172), it remains to show that $T_{6}>t_{3}\wedge T_{(3)}$.

In the recombination dominating case, by the definitions of the events $A_{17}$ and $A_{19}$ in (174) and (176), if $t\in[t_{2},t_{3}]$ , then

[TABLE]

Since $1\ll Nr$ , for sufficiently large $N$ , we have that $X_{0}\big{(}t\wedge T_{(3)}\big{)}<\delta Ne^{-s(1-3\delta)(t\wedge T_{(3)}-t_{2})}$ , for all $t\in(t_{2},t_{3}]$ . Therefore, by the way we define $T_{6}$ as in (171), we have that $T_{6}>t_{3}\wedge T_{(3)}$ .

For the mutation dominating case, by following the same argument, we have that for all $t\in[t_{2},t_{3}]$ ,

[TABLE]

and the result follows because $r\ll N\mu^{2}$ . ∎

7.3 The proof of Proposition 4

Proof.

Recall the definition of $A_{(3)}$ in (182). From Lemmas 31, 32, 33, 34, 36, 37, and 38, for sufficiently large $N$ , on the event $A_{(2)}$ , we have

[TABLE]

Thus, by Proposition 3, for sufficiently large $N$ ,

[TABLE]

Next, assume that we are on the event $A_{(3)}$ . It follows from Lemma 39 that $T_{(3)}>t_{3}$ when $N$ is sufficiently large. So, by the definition of $T_{6}$ as in (171), we have $X_{0}(t_{3})<\delta Ne^{-s(1-3\delta)(t_{3}-t_{2})}$ , and by using the definition of $t_{3}$ in (168), we prove the first part of the proposition.

For the proof of the second part of the proposition, we define

[TABLE]

We will first prove the recombination dominating case. From (55), the definition of the event $A_{22}$ in (179), and Proposition 3, we have

[TABLE]

Since $1\ll Nr$, for sufficiently large $N$,

[TABLE]

Hence, from Lemma 30, the definition of $T_{5}$ in (170), inequality (185), and the definition of $K_{3}$ in (192), for sufficiently large $N$ , we have that

[TABLE]

For the upper bound for $X_{3}(t_{3})$, from (55), the definition of the event $A_{22}$ in (179), Proposition 3, the fact that $\delta<\frac{1}{4}$, and the definition of $C_{3}$ in (167), we have

[TABLE]

Since $1\ll Nr$ , for sufficiently large $N$ , we have $X_{3}^{[t_{2}]}(t_{3})\leq\frac{\delta^{2}N}{3}$ . It follows from the definitions of the events $A_{20}$ and $A_{21}$ as defined in (177) and (178), along with the facts that $\mu\ll N\mu^{2}\ll r\ln(Nr)$ and $1\ll Nr$ , that for sufficiently large $N$ , we have $X_{3m}^{(t_{2},t_{3}]}(t_{3})\leq\frac{\delta^{2}N}{3}$ , and $X_{3r}^{(t_{2},t_{3}]}(t_{3})\leq\frac{\delta^{2}N}{3}$ . Therefore, for sufficiently large $N$ , we have $X_{3}(t_{3})\leq\delta^{2}N$ .

We will now consider the mutation dominating case. Following the same argument as in the previous case, due to the differences in the definition of $A_{22}$ and the lower bound of $X_{3}^{[t_{2}]}(t_{3})$ from Proposition 3, instead of having inequality (193), we will have

[TABLE]

Because $1\ll N\mu$ , for sufficiently large $N$ , we have

[TABLE]

and by using the same argument as in the previous case, we have that $X_{3}(t_{3})\geq K_{3}N$. For the upper bound for $X_{3}(t_{3})$, due to the differences in the definition of $A_{22}$ and the lower bound of $X_{3}^{[t_{2}]}(t_{3})$, instead of having inequality (194), we will have

[TABLE]

and because $1\ll N\mu$ , for sufficiently large $N$ , we have $X_{3}^{[t_{2}]}(t_{3})\leq\frac{\delta^{2}N}{3}$ . Lastly, it follows from the definitions of the events $A_{20}$ and $A_{21}$ as defined in (180) and (181), along with the facts that $1\ll N\mu$ and $r\ll N\mu^{2}$ , that for sufficiently large $N$ , we have $X_{3m}^{(t_{2},t_{3}]}(t_{3})\leq\frac{\delta^{2}N}{3}$ , and $X_{3r}^{(t_{2},t_{3}]}(t_{3})\leq\frac{\delta^{2}N}{3}$ . Thus, for sufficiently large $N$ , we have $X_{3}(t_{3})\leq\delta^{2}N$ . ∎

8 Phase 4 and the proof of Proposition 5

The main result in this phase can be proved using Theorem 24 as we did in phase 2. First, we define $\mathbf{X}(t),q,\alpha,\beta,b$ and $\tilde{b}$ as in (136), (137), (134), (139), (140) and (141), respectively. Next, we define a random variable $B^{*}$ such that on the event that $\tilde{X}_{3}(t_{3})>0$ , we have

[TABLE]

The definition of $B^{*}$ when $\tilde{X}_{3}(t_{3})=0$ is not of interest, as we will work only on the event $A_{(3)}$ , on which from Proposition 4, we know that $\tilde{X}_{3}(t_{3})>0$ . Next, for $t\geq t_{3}$ , we define

[TABLE]

and define

[TABLE]

One can check that

[TABLE]

and, for all $t\geq t_{3}$ , we have

[TABLE]

By computation, we obtain that

[TABLE]

and

[TABLE]

which along with (197) imply that

[TABLE]

From (140), (196), (197) and (199), for $t\geq t_{3}$ ,

[TABLE]

Therefore, for $t\geq t_{3}$ , we have $\frac{d}{dt}x^{*}(t)=b(x^{*}(t))$ , and

[TABLE]

Lastly, we define

[TABLE]

where $K_{3}$ is a positive constant that was defined in (192).

Lemma 40.

For sufficiently large $N$ , on the event $A_{(3)}$ , we have $P(A_{23}|\mathcal{F}_{t_{3}})\geq 1-\epsilon$ .

Proof.

The proof is almost exactly the same as the proof of Lemma 25. Recall from section 6 that $k$ is a constant not depending on $N$ such that $ks$ is a Lipschitz constant of the function $b$. We define

[TABLE]

and $L=48/N$ . We also define

[TABLE]

First, we consider the event $\Omega^{*}_{0}$ . From Proposition 4, for sufficiently large $N$ , on the event $A_{(3)}$ , we have $X_{3}(t_{3})>0$ , which means $x^{*}(t)$ is well-defined. So, by (198), for sufficiently large $N$ , on the event $A_{(3)}$ , we have

[TABLE]

From the upper bound of $X_{0}(t_{3})$ in Proposition 4, along with the facts that $r\ln(Nr)\ll s$ in the recombination dominating case and $N\mu^{2}\ll s$ in the mutation dominating case, for sufficiently large $N$ , on the event $A_{(3)}$ we have $|\mathbf{X}(t_{3})-x^{*}(t_{3})|\leq\Delta^{*}$ . So, for sufficiently large $N$ , we have $\Omega_{0}^{*c}\subseteq A_{(3)}^{c}$ .

Next, by similar arguments to those used to prove that $\Omega_{1}^{c}=\emptyset$ and $\Omega_{2}^{c}=\emptyset$ in Proposition 4, for sufficiently large $N$ , we have that $\Omega_{1}^{*c}=\emptyset$ and $\Omega_{2}^{*c}=\emptyset$ . Therefore, by Theorem 24, the definitions of $t_{3}$ and $t_{4}$ in (168) and (201), along with the fact that $1\ll Ns$ , for sufficiently large $N$ , on the event $A_{(3)}$ , we have

[TABLE]

which proves the result. ∎

Here, we will give a proof for Proposition 5.

Proof of Proposition 5.

First, from the definition of $A_{(4)}$ in (203), Proposition 4, and Lemma 40, for sufficiently large $N$, we have

[TABLE]

From this point, we will work on the event $A_{(4)}$. From the definition of $B^{*}$ in (195) and Proposition 4, we have

[TABLE]

By the definitions of $f^{*}(t),t_{3},t_{4},C_{3}$ and $C_{4}$ in (196), (168), (201), (167) and (200), respectively, along with the inequality (204), we obtain that

[TABLE]

and

[TABLE]

Note that from Proposition 4, it is clear that $K_{3}\leq\delta^{2}$. Using this fact, the definition of $A_{23}$ in (202), the definition of $x^{*}_{3}(t)$ in (197), along with (205) and (206), we have

[TABLE]

and

[TABLE]

Lastly, using that $K_{3}\leq\delta^{2}$, the definitions of $x^{*}_{3}(t)$ and $A_{23}$ in (197) and (202), along with (199) and (205), we obtain that

[TABLE]

This completes the proof of the proposition. ∎

9 Phase 5 and the proof of Theorem 1

The technique used in the proof involves coupling with a branching process, similar to the proof of Lemma 19. We begin by defining

[TABLE]

First, we will show that with probability close to 1, $T_{7}<T_{8}$ and $T_{7}\leq t_{5+}$ .

Lemma 41.

The following statements hold:

1. For sufficiently large $N$, on the event $A_{(4)}$, we have $P(T_{7}<T_{8}|\mathcal{F}_{t_{4}})\geq 1-\epsilon$.

2. For sufficiently large $N$, on the event $A_{(4)}$, we have $P(T_{7}\leq t_{5+}|\mathcal{F}_{t_{4}})\geq 1-\epsilon-\delta$.

Proof.

We are going to consider the process $(N-X_{3}(t),t\geq t_{4})$. For $t\geq 0$, let $B(t)$ and $D(t)$ be the rates at which this process increases and decreases by 1 at time $t$. This process increases by 1 when a type 3 individual dies and is replaced by an individual that is not type 3. Type 3 individuals die at total rate $(1-2s)X_{3}(t)$, and the probability that the replacement is a type 3 individual is

[TABLE]

Hence, this process increases by 1 at rate

[TABLE]

The process decreases by 1 when an individual that is not of type 3 dies and is replaced by a type 3, or a mutation occurs on a type 1 or 2 individual. This occurs at rate

[TABLE]

Then, for all $t\geq t_{4}$, we have

[TABLE]

and

[TABLE]

Hence, we can think of the process $(N-X_{3}(t),t\in[t_{4},T_{7}])$ as a birth-death process in which each individual gives birth at rate bounded above by $(1-2s+r)\tilde{X}_{3}(t)$ and dies at rate bounded below by $(1-s-r)\tilde{X}_{3}(t)$ .

Let $(Y(t),t\geq t_{4})$ be a birth-death process in which each individual gives birth at rate $b(t)=(1-2s+r)\tilde{X}_{3}(t)$ and dies at rate $d(t)=(1-s-r)\tilde{X}_{3}(t)$, started from $Y(t_{4})=N-X_{3}(t_{4})$. It is possible to couple the process $(Y(t),t\geq t_{4})$ with the process $(N-X_{3}(t),t\geq t_{4})$ such that for any time $t\geq t_{4}$, we have $Y(t)\geq N-X_{3}(t)$. This implies that if the process $Y$ reaches 0 before $\lfloor 2\delta^{2}N\rfloor$, then the process $N-X_{3}$ will also reach 0 before $\lfloor 2\delta^{2}N\rfloor$, which means that $T_{7}<T_{8}$.

Here, since we are only interested in the probability that the process $Y$ reaches 0 before $\lfloor 2\delta^{2}N\rfloor$ , we will consider the induced discrete-time jump process of $(Y(t),t\in[t_{4},T_{7}\wedge T_{8}))$ . It is an asymmetric random walk process that jumps up by 1 with probability

[TABLE]

and jumps down by 1 with probability

[TABLE]

On the event $A_{(4)}$ , we have from Proposition 5 that $N-X_{3}(t_{4})\leq 5\delta^{2}N/4$ . Let $q=(1-s-r)/(1-2s+r)$ , and note that because $r\ll s$ , for sufficiently large $N$ , we have

[TABLE]

For sufficiently large $N$ , on the event $A_{(4)}$ , conditioning on the event $N-X_{3}(t_{4})=k$ , the probability that this asymmetric random walk reaches 0 before $\lfloor 2\delta^{2}N\rfloor$ is

[TABLE]

and note that this bound no longer depends on $k$. Since $s\ll 1$, when $N\rightarrow\infty$, we have

[TABLE]

Also, because $Ns\gg 1$ , it follows that when $N\rightarrow\infty$ , we have

[TABLE]

Thus, on the event $A_{(4)}$ , for sufficiently large $N$ , the probability that the asymmetric random walk reaches 0 before $\lfloor 2\delta^{2}N\rfloor$ is bounded below by $1-\epsilon$ . Therefore, through the coupling, for sufficiently large $N$ , on the event $A_{(4)}$ , we have $P(T_{7}<T_{8}|\mathcal{F}_{t_{4}})\geq 1-\epsilon$ .
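As a numerical illustration of the random-walk step above, the following sketch evaluates the classical gambler's-ruin probability that a nearest-neighbour walk started at $k$ reaches 0 before $m$. It assumes that the embedded jump chain of $Y$ moves up with probability $b/(b+d)$ and down with probability $d/(b+d)$, where $b=1-2s+r$ and $d=1-s-r$ are the per-capita rates after the common factor $\tilde{X}_{3}(t)$ cancels. The function name `hit_zero_before` and all parameter values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def hit_zero_before(k: int, m: int, up: float, down: float) -> float:
    """Probability that a walk started at k hits 0 before m.

    Each step is +1 with probability up/(up+down) and -1 otherwise.
    With q = down/up this is (q^m - q^k) / (q^m - 1) when q != 1.
    """
    if m <= 0 or not 0 <= k <= m:
        raise ValueError("need m > 0 and 0 <= k <= m")
    if up <= 0 or down <= 0:
        raise ValueError("rates must be positive")
    log_q = math.log(down / up)
    if log_q == 0.0:
        return 1.0 - k / m  # symmetric walk
    # Divide numerator and denominator by q^m and use expm1, so that
    # q^m is never formed explicitly (it would overflow for q > 1).
    return math.expm1((k - m) * log_q) / math.expm1(-m * log_q)


if __name__ == "__main__":
    # Illustrative (assumed) values: s = 0.01, r = 1e-4, N = 10^5, delta = 0.2,
    # so that 5*delta^2*N/4 = 5000 and floor(2*delta^2*N) = 8000.
    s, r = 0.01, 1e-4
    p = hit_zero_before(5000, 8000, up=1 - 2 * s + r, down=1 - s - r)
    print(f"P(hit 0 before 8000 | start at 5000) = {p:.15f}")
```

Because $q>1$ when $r\ll s$, the walk drifts toward 0, and with these values the probability is numerically indistinguishable from 1, which matches the qualitative conclusion of part 1.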

We will now prove part 2 of this lemma. It follows from part 1 that, for sufficiently large $N$ , on the event $A_{(4)}$ ,

[TABLE]

So, we only need to show that for sufficiently large $N$ , on the event $A_{(4)}$ ,

[TABLE]

Now, for $t\in[0,(T_{7}\wedge T_{8})-t_{4}]$ , we define

[TABLE]

and for $t\in[0,\lambda((T_{7}\wedge T_{8})-t_{4}))$, we define $Y^{*}(t)=Y(\lambda^{-1}(t))$. The process $(Y^{*}(t),t\in[0,\lambda((T_{7}\wedge T_{8})-t_{4})))$ is a birth-death process satisfying $Y^{*}(0)=N-X_{3}(t_{4})$, where each individual gives birth at rate

[TABLE]

and each individual dies at rate

[TABLE]

For sufficiently large $N$ , on the event that $t_{5+}<T_{7}\wedge T_{8}$ , we have

[TABLE]

It follows that,

[TABLE]

Formula (120) gives the probability that a birth-death process started from one individual survives until time $t$. By the same reasoning, we can generalize it to a process that starts with any finite number of individuals. If $k\leq 5\delta^{2}N/4$, then

[TABLE]

and note that this upper bound does not depend on $k$ . Now, by using the facts that $r\ll s\ll 1$ and $1\ll Ns$ along with (13), when $N$ is sufficiently large, on the event $A_{(4)}$ , on which we know from Proposition 5 that $Y^{*}(0)=N-X_{3}(t_{4})\leq 5\delta^{2}N/4$ , we have

[TABLE]

Note that when $N\rightarrow\infty$ , by using that $\delta\in(0,\frac{1}{4})$ , we have

[TABLE]

This fact along with (212) proves the inequality (210). ∎
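The generalization from one initial individual to $k$ individuals used in the proof above can be sketched numerically. The code below uses the standard closed form for the extinction probability of a linear birth-death process with constant per-capita rates, and raises it to the power $k$ because the $k$ family lines evolve independently. It is an assumption that this closed form is the one behind (120); the function names and the parameter values are hypothetical.

```python
import math


def extinction_by(t: float, birth: float, death: float, k: int = 1) -> float:
    """P(extinct by time t) for a linear birth-death process.

    Per-capita birth and death rates are constant; the process starts
    with k individuals whose family lines are independent.
    """
    if t < 0 or birth < 0 or death < 0 or k < 0:
        raise ValueError("arguments must be non-negative")
    rate = birth - death
    if rate == 0.0:
        p1 = birth * t / (1.0 + birth * t)  # critical case
    else:
        x = math.exp(-abs(rate) * t)  # lies in (0, 1], cannot overflow
        if rate > 0:
            p1 = death * (1.0 - x) / (birth - death * x)
        else:
            p1 = death * (1.0 - x) / (death - birth * x)
    return p1 ** k


def survival_to(t: float, birth: float, death: float, k: int = 1) -> float:
    """P(at least one individual is alive at time t)."""
    return 1.0 - extinction_by(t, birth, death, k)


if __name__ == "__main__":
    # Illustrative (assumed) values: s = 0.01, r = 1e-4, k = 5000.
    s, r = 0.01, 1e-4
    for t in (500.0, 1500.0, 3000.0):
        print(t, survival_to(t, birth=1 - 2 * s + r, death=1 - s - r, k=5000))
```

In the subcritical regime $r\ll s$, the survival probability from $k$ individuals decays once $t$ exceeds roughly $\ln(k)/(s-2r)$, which is the mechanism that bounds the time to reach 0.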

Next, we are going to show that $t_{5-}<T_{7}\wedge T_{8}$ with probability close to 1.

Lemma 42.

The following statements hold:

1. For sufficiently large $N$, on the event $A_{(4)}$, we have $P(t_{5-}<T_{7}\wedge T_{8}|\mathcal{F}_{t_{4}})\geq 1-2\epsilon$.

2. For sufficiently large $N$, we have $P(A_{(5)})\geq 1-29\epsilon-8\delta-\delta^{2}$.

Proof.

The proof is similar to the proof of Lemma 41. In this proof, we are going to consider the process $(X_{1}(t)+X_{2}(t),t\geq t_{4})$ . For $t\geq t_{4}$ , let $B(t)$ and $D(t)$ be the rates at which the process increases or decreases by 1. We will now give a lower bound for $B(t)$ and an upper bound for $D(t)$ . For the increasing rate, one way to increase $X_{1}(t)+X_{2}(t)$ is by having a type 0 or type 3 individual die, which occurs at the total rate $X_{0}(t)+(1-2s)X_{3}(t)$ , and the new individual is type 1 or 2 that is created without recombination, which occurs with probability $(1-r)(\tilde{X}_{1}(t)+\tilde{X}_{2}(t))$ . Then,

[TABLE]

To decrease $X_{1}(t)+X_{2}(t)$, one way is by having a type 1 or type 2 individual die, which occurs at total rate $(1-s)(X_{1}(t)+X_{2}(t))$, with the new individual not being type 1 or 2, which occurs with probability bounded above by $1-(1-r)(\tilde{X}_{1}(t)+\tilde{X}_{2}(t))$. Another way to decrease $X_{1}(t)+X_{2}(t)$ is by having a type 1 or 2 individual mutate to type 3, which occurs at rate $\mu(X_{1}(t)+X_{2}(t))$. So,

[TABLE]

When $t\in[t_{4},T_{7}\wedge T_{8}]$ , we have

[TABLE]

and

[TABLE]

Hence, when $t\in[t_{4},T_{7}\wedge T_{8}]$ ,

[TABLE]

Let $(Y(t),t\geq t_{4})$ be a birth-death process such that $Y(t_{4})=X_{1}(t_{4})+X_{2}(t_{4})$, in which each individual gives birth at rate $b(t)=(1-2s-r)(\tilde{X}_{0}(t)+\tilde{X}_{3}(t))$ and each individual dies at rate $d(t)=(1-s+r+2\mu)(\tilde{X}_{0}(t)+\tilde{X}_{3}(t))$. We can couple this process with $(X_{1}(t)+X_{2}(t),t\geq t_{4})$ such that for any $t\in[t_{4},T_{7}\wedge T_{8}]$, we have $Y(t)\leq X_{1}(t)+X_{2}(t)$, which means that if $Y(t)>0$, then $X_{1}(t)+X_{2}(t)>0$. Now, we consider the induced discrete-time jump process of $(Y(t),t\in[t_{4},T_{7}\wedge T_{8}])$. It is an asymmetric random walk that jumps up with probability

[TABLE]

and jumps down with probability

[TABLE]

Next, for $t\in[0,(T_{7}\wedge T_{8})-t_{4}]$ , we define

[TABLE]

Since $\tilde{X}_{0}(t_{4}+v)+\tilde{X}_{3}(t_{4}+v)\leq 1$ for all $v\geq 0$, it follows that for $t\in[0,(T_{7}\wedge T_{8})-t_{4}]$, we have $\lambda(t)\leq t$. Now, we define $Y^{*}(t)=Y(\lambda^{-1}(t))$. It follows that the process $(Y^{*}(t),t\in[0,\lambda((T_{7}\wedge T_{8})-t_{4})])$ is a birth-death process such that $Y^{*}(0)=X_{1}(t_{4})+X_{2}(t_{4})$, in which each individual gives birth at rate

[TABLE]

and each individual dies at rate

[TABLE]

With these birth and death rates, we can extend the process $Y^{*}$ to be the birth-death process that is defined for all times $t\in[0,\infty)$ , where the rates at which each individual gives birth and dies are given in (213) and (214), respectively.
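The time change used above can be illustrated with a short sketch. It assumes that $\lambda(t)=\int_{0}^{t}g(v)\,dv$, where $g(v)$ stands for the common factor $\tilde{X}_{0}(t_{4}+v)+\tilde{X}_{3}(t_{4}+v)$ in the rates $b$ and $d$; this reading is inferred from the statement that $g\leq 1$ implies $\lambda(t)\leq t$, since the displayed definition is not reproduced here. The factor is taken to be piecewise constant and strictly positive, as it would be along a path of a jump process away from 0. The helper name `make_time_change` and the sample path are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


def make_time_change(breaks, values):
    """Return (lam, lam_inv) for a piecewise-constant rate factor g.

    breaks: increasing times b_0 = 0 < b_1 < ... < b_n.
    values: values[i] is the value of g on [b_i, b_{i+1}), each in (0, 1].
    lam(t) is the integral of g over [0, t]; lam_inv is its inverse.
    """
    if len(breaks) != len(values) + 1:
        raise ValueError("need len(breaks) == len(values) + 1")
    if any(not 0.0 < g <= 1.0 for g in values):
        raise ValueError("each value of g must lie in (0, 1]")
    widths = [b1 - b0 for b0, b1 in zip(breaks, breaks[1:])]
    # cum[i] = lam(breaks[i])
    cum = [0.0] + list(accumulate(w * g for w, g in zip(widths, values)))
    last = len(values) - 1

    def lam(t):
        i = min(bisect_right(breaks, t) - 1, last)
        return cum[i] + (t - breaks[i]) * values[i]

    def lam_inv(u):
        i = min(bisect_right(cum, u) - 1, last)
        return breaks[i] + (u - cum[i]) / values[i]

    return lam, lam_inv


if __name__ == "__main__":
    # Hypothetical path: g = 0.5 on [0, 1) and g = 1.0 on [1, 3].
    lam, lam_inv = make_time_change([0.0, 1.0, 3.0], [0.5, 1.0])
    for t in (0.5, 1.0, 2.0, 3.0):
        print(t, lam(t), lam_inv(lam(t)))
```

Running the clock at speed $g$ removes the factor $g$ from both per-capita rates, so the time-changed process $Y^{*}(t)=Y(\lambda^{-1}(t))$ has constant rates, and $\lambda(t)\leq t$ holds because $g\leq 1$.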

We will first show that for sufficiently large $N$ , on the event $A_{(4)}$ ,

[TABLE]

Similar to the way we get (211), if $k\geq\frac{K_{3}N}{2}$ , then

[TABLE]

and note that this lower bound does not depend on $k$ . Note that from Proposition 5, we know that on the event $A_{(4)}$ , we have $Y^{*}(0)=Y(t_{4})=X_{1}(t_{4})+X_{2}(t_{4})\geq K_{3}N/2$ . Using the facts that $\mu\ll s$ , $r\ll s$ , $s\ll 1$ and using (13), when $N$ is sufficiently large, on the event $A_{(4)}$ ,

[TABLE]

Note that because $1\ll Ns$ , when $N\rightarrow\infty$ ,

[TABLE]

This fact along with (216) proves (215).

Lastly, by using the couplings and from part 1, the fact that $\lambda(t)\leq t$ , part 1 of Lemma 41, and the definition of $T_{7}$ in (209), for sufficiently large $N$ , on the event $A_{(4)}$ ,

[TABLE]

Therefore, for sufficiently large $N$ , on the event $A_{(4)}$ , we have $P(t_{5-}<T_{7}\wedge T_{8}|\mathcal{F}_{t_{4}})\geq 1-2\epsilon$ .

Lastly, to prove part 2, by using part 2 of Lemma 41 and part 1 of this lemma, for sufficiently large $N$ , on the event $A_{(4)}$ , we have that $P(t_{5-}<T_{7}<t_{5+}|\mathcal{F}_{t_{4}})\geq 1-3\epsilon-\delta$ . With this fact and Proposition 5, for sufficiently large $N$ , we have $P(A_{(5)})=P(A_{(4)}\cap\{t_{5-}<T_{7}<t_{5+}\})\geq 1-29\epsilon-8\delta-\delta^{2}$ . ∎

Proof of Theorem 1.

First, for every subsequence $(N_{k})_{k=1}^{\infty}$ , there is a further subsequence that satisfies (6), or there is a further subsequence that satisfies (7). By a subsequence argument, it is enough to prove Theorem 1 in the recombination dominating case and the mutation dominating case. Now, recall that the stopping time $T$ defined in Theorem 1 is the first time that type 3 individuals have fixated in the population. We will show that if $\theta\in(0,1)$ , then for sufficiently large $N$ , we have

[TABLE]

We choose $\delta$ to be small enough so that 1) $\delta<\epsilon$ , 2) $(1-\delta^{2})^{-1}<1+\theta$ and 3) $1-2\delta>1-\theta$ . From part 2 of Lemma 42, for sufficiently large $N$ , we have $P(A_{(5)})\geq 1-29\epsilon-8\delta-\delta^{2}\geq 1-38\epsilon$ . Note that from the definition of $T_{7}$ in (209), we have $T_{7}=T\vee t_{4}$ . Also, by the definition of $t_{5-}$ and the fact that $1\ll N\mu\ll Ns$ , for sufficiently large $N$ , we have $t_{5-}>t_{4}$ . Thus, for sufficiently large $N$ , we have

[TABLE]

It is enough to show that $(1-\theta)t^{*}_{N}(r_{N})\leq t_{5-}$ and $t_{5+}<(1+\theta)t^{*}_{N}(r_{N})$ .
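The step from $1-29\epsilon-8\delta-\delta^{2}$ to $1-38\epsilon$ in the choice of $\delta$ above is a short arithmetic check. The derivation below writes it out, using only the condition $\delta<\epsilon$ and the standing restriction $\delta\in(0,\frac{1}{4})$ recalled in the proof of Lemma 41.

```latex
\begin{align*}
0<\delta<\epsilon,\quad \delta<\tfrac{1}{4}
  &\implies \delta^{2}<\delta<\epsilon,\\
29\epsilon+8\delta+\delta^{2}
  &< 29\epsilon+8\epsilon+\epsilon = 38\epsilon,\\
P(A_{(5)}) &\geq 1-29\epsilon-8\delta-\delta^{2} \geq 1-38\epsilon .
\end{align*}
```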

Recall the definition of $t_{N}^{*}$ in (5). Because of (6), in the recombination dominating case, for sufficiently large $N$ ,

[TABLE]

Next, in the mutation dominating case,

[TABLE]

and because of (7), we have

[TABLE]

From the definitions of $t_{4}$ and $t_{5+}$ in (201) and (207), we have that

[TABLE]

From (217) and (219), we have

[TABLE]

Because $1\ll N\mu_{N}\ll Ns_{N}$ and $\mu_{N}\ll N\mu^{2}_{N}\ll s_{N}$ , along with $r_{N}\ln_{+}(Nr_{N})\ll s_{N}$ , we have

[TABLE]

From (220) and the way we choose $\delta$, for sufficiently large $N$,

[TABLE]

By a similar argument, from the definitions of $t_{4}$ and $t_{5-}$ in (201) and (208), we have that

[TABLE]

From (217), (218), and (221), for sufficiently large $N$ , we have

[TABLE]

which completes the proof. ∎

Acknowledgement

I am grateful to Professor Jason Schweinsberg for teaching me several techniques, especially the one from one of his previous papers, which was used intensively in my proof. I also want to thank him for spending time reviewing this manuscript several times, which improved this manuscript significantly.

