A scaling limit for the length of the longest cycle in a sparse random graph

Michael Anastos; Alan Frieze

arXiv:1907.03657·math.CO·January 10, 2020·J. Comb. Theory B

TL;DR

This paper investigates the asymptotic behavior of the longest cycle in sparse random graphs, establishing a limiting function for its normalized length as the graph size grows, especially for large average degrees.

Contribution

It introduces a new limiting function for the longest cycle length in sparse random graphs and provides explicit formulas for initial polynomial coefficients.

Findings

01

Longest cycle length converges to a function f(c) of the average degree c.

02

For large c, the normalized longest cycle length approaches f(c).

03

The same asymptotic applies to the longest path in the graph.

Abstract

We discuss the length of the longest cycle in a sparse random graph $G_{n,p}$, $p=c/n$, where $c$ is constant. We show that for large $c$ there is a function $f(c)$ such that $L_{c,n}/n\to f(c)$ a.s. The function $f(c)=1-\sum_{k=1}^{\infty}p_{k}(c)e^{-kc}$ where $p_{k}$ is a polynomial in $c$. We are only able to explicitly give the values $p_{1},p_{2}$, although we could in principle compute any $p_{k}$. We see immediately that the length of the longest path is also asymptotic to $f(c)n$ w.h.p.

Equations

$\upsilon_{0}(T)$ denote the set of vertices of $T$ that have no neighbors outside $S_{L}$.

$$L_{c,n}\approx|V(C_{2})|-\sum_{T\in\mathcal{T}}\phi(T).$$

$|C_{2}|$

$|E(C_{2})|$

$$x=\sum_{k=1}^{\infty}\frac{k^{k-1}}{k!}(ce^{-c})^{k}=ce^{-c}+c^{2}e^{-2c}+O(c^{3}e^{-3c}).$$

$$\sum_{T\in\mathcal{T}}\phi(T)=O(c^{6}e^{-3c})n.$$

$$L_{c,n}\approx\left(1-(c+1)e^{-c}-c^{2}e^{-2c}+O(c^{6}e^{-3c})\right)n.$$

$$\left|\frac{\mathbf{E}[L_{c,n}]}{n}-f(c)\right|\leq\epsilon.$$

$$\frac{L_{c,n}}{n}\to f(c)\quad\text{a.s.}$$

$$\sum_{s=4}^{n_{0}}\binom{n}{s}\binom{\binom{s}{2}}{3s/2}\left(\frac{c}{n}\right)^{3s/2}\leq\sum_{s=4}^{n_{0}}\left(\frac{ne}{s}\cdot\left(\frac{se}{3}\right)^{3/2}\cdot\left(\frac{c}{n}\right)^{3/2}\right)^{s}=\sum_{s=4}^{n_{0}}\left(\frac{e^{5/2}c^{3/2}s^{1/2}}{3^{3/2}n^{1/2}}\right)^{s}=o(1).$$

$$\frac{e(S_{k})}{|S_{k}|}\geq\frac{4Dn_{D}}{2}\cdot\frac{1}{(D+12)n_{D}}>\frac{3}{2}.$$

$$|V(\Gamma_{L})|\leq(D+4)n_{D}\leq ne^{-c/2}.$$

$$V_{1}=C_{2}\setminus S_{L}\quad\text{and}\quad V_{2}=\{v\in S_{L}:v\text{ has at least one neighbor in }V_{1}\}.$$

$$\upsilon_{0}(T)=V(T)\setminus V_{2}.$$

$$|\upsilon_{0}(K)|\geq\frac{|V(K)|}{3}.$$

$$|\upsilon_{0,i}(K)|\geq\frac{|V(K)|}{3}.$$

$$\upsilon_{0,\ell+1}(K^{\prime})\geq 1+\sum_{j\in[r]}\upsilon_{0,\ell}(K_{j})\geq 1+\frac{1}{3}\sum_{j\in[r]}|K_{j}|\geq 1+\frac{|K^{\prime}|-3}{3}=\frac{|K^{\prime}|}{3}.$$

$$\binom{n}{k}k^{k-2}\left(\frac{c}{n}\right)^{k-1}\binom{k}{k/3}(1-p)^{k(n-k)/3}\leq\frac{n}{ck^{2}}\left(2ce^{1-c/6}\right)^{k}=o(n^{-2}),$$

$$\sum_{k=3}^{\log n}\binom{n}{k}k^{k+1}\left(\frac{c}{n}\right)^{k}\binom{k}{k/3}(1-p)^{k(n-k)/3}\leq\sum_{k=3}^{\log n}k\left(2ce^{1-c/6}\right)^{k}=O(1).$$

$H^{*}$

$$\mathbf{E}\left(\sum_{T\in\mathcal{T}}\phi(T)\right)\leq\sum_{k\geq 7}\left(\frac{ne}{k}\right)^{k}k^{k-1}\left(\frac{c}{n}\right)^{k-1}\exp\left\{-c\max\left\{3,k/3\right\}\right\}=O(c^{6}e^{-3c})n,$$

$$N\geq n(1-2e^{-c/2})\quad\text{and}\quad M\in\frac{(1\pm\varepsilon_{1})cN}{2},$$

$$\Pr\left(\exists S:|S|=N,\ e(S)\notin(1\pm\varepsilon_{1})\binom{N}{2}p\right)\leq 2\binom{n}{N}\exp\left\{-\frac{\varepsilon_{1}^{2}N(N-1)p}{3}\right\}=o(1).$$

$$N_{b}(S)=\{w\in V_{1}\setminus S:\exists v\in S\text{ with }\{v,w\}\in E(\Gamma_{b})\}$$

$$\sum_{k=1}^{3n_{1}}\binom{n}{k}k^{k-2}\binom{\binom{k}{2}}{2}p^{k+1}\binom{k}{3k/10}\left(\sum_{\ell=1}^{99}\binom{n-k}{99}p^{\ell}(1-p)^{n-k-\ell}\right)^{3k/10}\leq\sum_{k=1}^{3n_{1}}\left(\frac{ne}{k}\right)^{k}k^{k+2}\left(\frac{c}{n}\right)^{k+1}2^{k}e^{-3kc/20}\leq\sum_{k=1}^{3n_{1}}\frac{ck^{2}}{n}\cdot\left(2ce^{1-3c/20}\right)^{k}=o(1).$$

$$\begin{aligned}\Pr(\mathcal{E}_{S}\mid s,t,w)&\leq\binom{n}{s}\binom{n}{t}\binom{n}{w}\binom{\binom{s+t}{2}}{s+t}s^{w}p^{s+t+w}\left(1-\frac{p}{2000}\right)^{s(n-s-t-w)}\\&\leq\left(\frac{en}{s}\right)^{s}\left(\frac{en}{t}\right)^{t}\left(\frac{en}{w}\right)^{w}\left(\frac{e(s+t)}{2}\right)^{s+t}s^{w}\left(\frac{c}{n}\right)^{s+t+w}\exp\left\{-\frac{p}{2000}\left(\frac{sn}{5}\right)\right\}\\&\leq(ec)^{2(s+t)}\left(\frac{s+t}{2s}\right)^{s}\left(\frac{s+t}{2t}\right)^{t}\left(\frac{ecs}{w}\right)^{w}\exp\left\{-\frac{cs}{10^{5}}\right\}\\&\leq(ec)^{6s}\exp\left\{s\cdot\frac{t-s}{2s}\right\}\exp\left\{t\cdot\frac{s-t}{2t}\right\}\left(\frac{ecs}{n_{0}}\right)^{n_{0}}\exp\left\{-\frac{cs}{10^{5}}\right\}\end{aligned}$$


Full text

A scaling limit for the length of the longest cycle in a sparse random graph

Michael Anastos (Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany) and Alan Frieze (Department of Mathematical Sciences, Carnegie Mellon University, Pittsburgh PA, U.S.A.; supported in part by NSF Grant DMS1363136)

Abstract

We discuss the length of the longest cycle in a sparse random graph $G_{n,p}$, $p=c/n$, where $c$ is constant. We show that for large $c$ there exists a function $f(c)$ such that $L_{c,n}/n\to f(c)$ a.s. The function $f(c)=1-\sum_{k=1}^{\infty}p_{k}(c)e^{-kc}$ where $p_{k}$ is a polynomial in $c$. We are only able to explicitly give the values $p_{1},p_{2}$, although we could in principle compute any $p_{k}$. We see immediately that the length of the longest path is also asymptotic to $f(c)n$ w.h.p.

1 Introduction

There are several basic questions that can be asked in the context of a class of graphs. E.g. what is the chromatic number? Is the graph Hamiltonian? Another such basic question is the following: how long is the longest cycle? In this paper we study this question in relation to the sparse random graph $G_{n,p},p=c/n$ for a constant $c>0$ . Thus, let $L_{c,n}$ denote the length of the longest cycle in the random graph $G_{n,c/n}$ . Erdős [11] conjectured that if $c>1$ then w.h.p. $L_{c,n}\geq\ell(c)n$ where $\ell(c)>0$ is independent of $n$ . This was proved by Ajtai, Komlós and Szemerédi [1] and in a slightly weaker form by de la Vega [25] who proved that if $c>4\log 2$ then $f(c)=1-O(c^{-1})$ . See also Suen [24]. Although this answered Erdős’s question it only gives us a lower bound for the length of the longest cycle. Bollobás [4] realised that for large $c$ one could find a large path/cycle w.h.p. by concentrating on a large subgraph with large minimum degree and demonstrating Hamiltonicity. In this way he showed that $\ell(c)\geq 1-c^{24}e^{-c/2}$ . This was then improved by Bollobás, Fenner and Frieze [8] to $\ell(c)\geq 1-c^{6}e^{-c}$ and then by Frieze [16] to $\ell(c)\geq 1-(1+\varepsilon_{c})(1+c)e^{-c}$ where $\varepsilon_{c}\to 0$ as $c\to\infty$ . This last result is optimal up to the value of $\varepsilon_{c}$ , as there are w.h.p. $\approx(1+c)e^{-c}n$ vertices of degree 0 or 1.

The basic open question to this point is whether there exists a function $f(c)$ such that w.h.p. $L_{c,n}=(1+\varepsilon_{n})f(c)n$ where $\varepsilon_{n}\to 0$ as $n\to\infty$ and, if so, what $f(c)$ is. In this paper we establish the existence of $f(c)$ for large $c$ and give a method of computing it to arbitrary accuracy. We note that this is one case of a fundamental extremal random variable for which the existence of a scaling limit has not previously been shown and which does not appear to be susceptible to the interpolation method as in Bayati, Gamarnik and Tetali [3].

Let $p=c/n$ and let $G=G_{n,p}$ . We will assume throughout that $c$ is sufficiently large. Let $C_{2}$ denote the 2-core of $G$ . By this we mean that part of the giant component consisting of vertices that are in at least one cycle. The longest cycle in $G$ is contained in $C_{2}$ and the length of the longest path in $G_{n,c/n}$ differs from this by $O(\log n)$ w.h.p. This will be for two reasons. The first reason is that we will establish a Hamiltonian subgraph of $C_{2}$ that contains the longest path in $C_{2}$ and the second reason for this is that w.h.p. the giant component of $G$ consists of $C_{2}$ plus a forest of trees with maximum diameter $O(\log n)$ .
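As a rough illustration of the object $C_{2}$, the following sketch computes a 2-core by the usual peeling procedure (repeatedly deleting vertices of degree less than 2). This is an assumption on our part: the text restricts attention to the giant component, whereas the hypothetical helper `two_core` below peels the whole graph, and the adjacency-dictionary representation is also only illustrative.

```python
def two_core(adj):
    """Return the vertex set left after repeatedly deleting vertices of degree < 2.

    adj: dict mapping each vertex to the set of its neighbours (illustrative format).
    """
    alive = set(adj)
    deg = {v: len(adj[v]) for v in adj}
    stack = [v for v in adj if deg[v] < 2]
    while stack:
        v = stack.pop()
        if v not in alive:
            continue  # already deleted
        alive.discard(v)
        for u in adj[v]:
            if u in alive:
                deg[u] -= 1
                if deg[u] < 2:
                    stack.append(u)
    return alive


# Triangle 0-1-2 with a pendant path 0-3-4 attached.
example = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0, 4}, 4: {3}}
core = two_core(example)
```

In the example the pendant path is peeled away and only the triangle survives.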

As in the papers [4], [8] and [16], we consider a process that builds a large Hamiltonian subgraph. We construct a sequence of sets $S_{0}=\emptyset,S_{1},S_{2},\ldots,S_{L}\subseteq C_{2}$ and their induced subgraphs $\Gamma_{0},\Gamma_{1},\Gamma_{2},\ldots,\Gamma_{L}$ . Suppose now that we have constructed $S_{\ell}$ , $\ell\geq 0$ . We construct $S_{\ell+1}$ from $S_{\ell}$ via one of two cases:

**Construction of $\Gamma_{L}$**

**Case a:** If there is $v\in S_{\ell}$ that has exactly one or two neighbors $W$ in $C_{2}\setminus S_{\ell}$ , then we add $W$ to $S_{\ell}$ to make $S_{\ell+1}$ .

**Case b:** If there is a vertex $v\in C_{2}\setminus S_{\ell}$ that has at most two neighbors in $C_{2}\setminus S_{\ell}$ then we define $S_{\ell+1}$ to be $S_{\ell}$ plus $v$ plus the neighbors of $v$ in $C_{2}\setminus S_{\ell}$ .

$S_{L}$ is the set we end up with when there are no more vertices to add. We note that $S_{L}$ is well-defined and does not depend on the order of adding vertices. Indeed, suppose we have two distinct outcomes $O_{1}=v_{1},v_{2},\ldots,v_{r}$ and $O_{2}=w_{1},w_{2},\ldots,w_{s}$ . Assume without loss of generality that there exists $i$ which is the smallest index such that $w_{i}\notin O_{1}$ . Then, $X=\left\{w_{1},w_{2},\ldots,w_{i-1}\right\}\subseteq Y=\left\{v_{1},v_{2},\ldots,v_{r}\right\}$ . If $w_{i}$ was added in Case a as the neighbor of $v\in S_{\ell}=X$ then $v\in Y$ and $v$ has at most two neighbors in $C_{2}\setminus Y$ , one of which is $w_{i}$ . So Case a still applies to $v$ after $O_{1}$ has terminated, which contradicts the fact that $w_{i}\notin Y$ . Suppose then that $w_{i}$ is added in Case b. If $w_{i}=v$ then it has at most two neighbors in $C_{2}\setminus X$ and hence it has at most two neighbors in $C_{2}\setminus Y$ . This contradicts the fact that $w_{i}\notin Y$ . If $w_{i}$ is the neighbor of $v\in X\subseteq Y$ then we get the same contradiction. It follows that $\left\{w_{1},w_{2},\ldots,w_{s}\right\}\subseteq\left\{v_{1},v_{2},\ldots,v_{r}\right\}$ and vice-versa, by the same reasoning.
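The two cases can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `build_S_L`, the adjacency-dictionary format and the small example graph are all assumptions made here. The loop simply applies Case a and Case b until neither fires, which by the order-independence argument yields the same final set whatever order is used.

```python
def build_S_L(adj, core):
    """Grow S by Case a / Case b until no vertex can be added.

    adj:  dict vertex -> set of neighbours (illustrative format)
    core: set of vertices playing the role of C_2
    """
    S = set()
    changed = True
    while changed:
        changed = False
        # Case a: v in S with exactly one or two neighbours in core \ S.
        for v in list(S):
            W = (adj[v] & core) - S
            if 1 <= len(W) <= 2:
                S |= W
                changed = True
        # Case b: v in core \ S with at most two neighbours in core \ S.
        for v in list(core - S):
            if v in S:
                continue  # absorbed earlier in this pass
            W = (adj[v] & core) - S
            if len(W) <= 2:
                S |= W | {v}
                changed = True
    return S


# K_6 on {0,...,5} plus a vertex 6 joined to 0 and 1.
adj = {v: {u for u in range(6) if u != v} for v in range(6)}
adj[6] = {0, 1}
adj[0].add(6)
adj[1].add(6)
S_L = build_S_L(adj, set(range(7)))
```

Here vertex 6 triggers Case b and pulls in its two neighbours, after which every remaining vertex still has three neighbours outside the set, so the process stops.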

We will argue below in Section 1.1 that w.h.p. the graph $\Gamma_{L}$ induced by $S_{L}$ is a forest plus a few small components. Each tree in $\Gamma_{L}$ will w.h.p. have at most $\log n$ vertices. For a tree component $T$ let

$\upsilon_{0}(T)$ denote the set of vertices of $T$ that have no neighbors outside $S_{L}$ .

Notation 1: Let $\mathcal{T}$ denote the set of trees in $\Gamma_{L}$ . For a tree $T\in\mathcal{T}$ let $\mathcal{P}_{T}$ be the set of vertex disjoint path packings of $T$ where we allow only paths whose start and end vertices have neighbors in $C_{2}\setminus V(T)$ . Here we allow paths of length 0, so that a single vertex with neighbors in $C_{2}\setminus V(T)$ counts as a path. For $P\in\mathcal{P}_{T}$ let $n(T,P)$ be the number of vertices in $T$ that are not covered by $P$ . Let $\phi(T)=\min_{P\in\mathcal{P}_{T}}n(T,P)$ and let $\mathcal{Q}(T)\in\mathcal{P}_{T}$ denote a set of paths that leaves $\phi(T)$ vertices of $T$ uncovered, i.e. satisfies $n(T,\mathcal{Q}(T))=\phi(T)$ .
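For small trees, $\phi(T)$ can be found by brute force. The sketch below is only an illustration under stated assumptions: the function `phi`, the edge-list representation and the set `marked` (standing for the vertices of $T$ that have neighbors in $C_{2}\setminus V(T)$) are names introduced here. It enumerates edge subsets of the tree; a subset with maximum degree at most 2 is a disjoint union of paths, and it is a valid packing when every path endpoint is marked.

```python
from itertools import combinations


def phi(vertices, edges, marked):
    """Minimum number of uncovered vertices over valid path packings of a tree."""
    edges = list(edges)
    best = len(vertices)
    for r in range(len(edges) + 1):
        for chosen in combinations(edges, r):
            deg = {v: 0 for v in vertices}
            for u, v in chosen:
                deg[u] += 1
                deg[v] += 1
            if any(d > 2 for d in deg.values()):
                continue  # not a union of paths
            if any(d == 1 and v not in marked for v, d in deg.items()):
                continue  # a path ends at a vertex with no outside neighbour
            # Isolated marked vertices count as paths of length 0.
            uncovered = sum(1 for v, d in deg.items() if d == 0 and v not in marked)
            best = min(best, uncovered)
    return best


# Star with three leaves: covered completely.
star = phi(range(4), [(0, 1), (0, 2), (0, 3)], marked={1, 2, 3})
# Three paths of length two sharing an endpoint (7 vertices); only leaves are marked.
spider = phi(range(7), [(0, 1), (0, 2), (0, 3), (1, 4), (2, 5), (3, 6)], marked={4, 5, 6})
```

The star gives $\phi=0$ and the seven-vertex tree gives $\phi=1$, since one of the three middle vertices cannot be made internal to a path.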

If $A=A(n),B=B(n)$ then we write $A\approx B$ if $A=(1+o(1))B$ as $n\to\infty$ .

We will prove

Theorem 1.1.

Let $p=c/n$ where $c>1$ is a sufficiently large constant. Then w.h.p.

$$L_{c,n}\approx|V(C_{2})|-\sum_{T\in\mathcal{T}}\phi(T).\tag{1}$$

The size of $C_{2}$ is well-known. Let $x$ be the unique solution of $xe^{-x}=ce^{-c}$ in $(0,1)$ . Then w.h.p. (see e.g. [19], Lemma 2.16),

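$$|C_{2}|\approx(1-x)\left(1-\frac{x}{c}\right)n,\tag{2}$$

$$|E(C_{2})|\approx\left(1-\frac{x}{c}\right)^{2}\frac{cn}{2}.\tag{3}$$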

Equation (4.5) of Erdős and Rényi [12] tells us that

$$x=\sum_{k=1}^{\infty}\frac{k^{k-1}}{k!}(ce^{-c})^{k}=ce^{-c}+c^{2}e^{-2c}+O(c^{3}e^{-3c}).\tag{4}$$

We will argue below that w.h.p., as $c$ grows,

$$\sum_{T\in\mathcal{T}}\phi(T)=O(c^{6}e^{-3c})n.\tag{5}$$

We therefore have the following improvement to the estimate in [16].

Corollary 1.2.

W.h.p., as $c$ grows,

$$L_{c,n}\approx\left(1-(c+1)e^{-c}-c^{2}e^{-2c}+O(c^{6}e^{-3c})\right)n.\tag{6}$$

Note the term $(c+1)e^{-c}$ which accounts for vertices of degree 0 or 1. In principle we can compute more terms than what is given in (6). We claim next that there exists some function $f(c)$ such that the sum in (1) is concentrated around $f(c)n$ . In other words, the sum in (1) has the form $\approx f(c)n$ w.h.p.
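A quick numerical check of how the series for $x$ feeds into the corollary may be helpful. The sketch below sums the Erdős–Rényi series for $x$, and then compares the leading terms $1-(c+1)e^{-c}-c^{2}e^{-2c}$ with $(1-x)(1-x/c)$. The latter expression is an assumption here: it is the standard estimate for $|C_{2}|/n$, and the function names are illustrative only.

```python
import math


def tree_series_x(c, terms=200):
    """Sum_{k>=1} k^(k-1)/k! * (c e^{-c})^k, evaluated in log space for stability."""
    y = c * math.exp(-c)
    return sum(
        math.exp((k - 1) * math.log(k) - math.lgamma(k + 1) + k * math.log(y))
        for k in range(1, terms + 1)
    )


def corollary_leading_terms(c):
    """1 - (c+1)e^{-c} - c^2 e^{-2c}, the explicit part of the corollary."""
    return 1 - (c + 1) * math.exp(-c) - c**2 * math.exp(-2 * c)


c = 5.0
x = tree_series_x(c)
core_fraction = (1 - x) * (1 - x / c)  # assumed 2-core estimate |C_2|/n
gap = abs(core_fraction - corollary_leading_terms(c))
```

For $c=5$ the computed $x$ satisfies $xe^{-x}=ce^{-c}$ to machine precision and the two expressions differ only at order $e^{-3c}$, up to a polynomial factor in $c$. The series converges slowly when $c$ is close to 1, so the default number of terms is only adequate for moderately large $c$.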

Theorem 1.3.

(a)

There exists a function $f(c)$ such that for any $\epsilon>0$ , there exists $n_{\epsilon}$ such that for $n\geq n_{\epsilon}$ ,

$$\left|\frac{\mathbf{E}[L_{c,n}]}{n}-f(c)\right|\leq\epsilon.$$

(b)

$$\frac{L_{c,n}}{n}\to f(c)\quad\text{a.s.}$$

We will prove Theorem 1.3 in Section 3.

1.1 Structure of $\Gamma_{L}$ :

We first bound the size of $S_{L}$ . We need the following lemma on the density of small sets.

Lemma 1.4.

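W.h.p., every set $S\subseteq[n]$ of size at most $n_{0}=n/(10c^{3})$ contains fewer than $3|S|/2$ edges in $G_{n,p}$ .

Proof.

The expected number of sets invalidating the claim can be bounded by

$$\sum_{s=4}^{n_{0}}\binom{n}{s}\binom{\binom{s}{2}}{3s/2}\left(\frac{c}{n}\right)^{3s/2}\leq\sum_{s=4}^{n_{0}}\left(\frac{ne}{s}\cdot\left(\frac{se}{3}\right)^{3/2}\cdot\left(\frac{c}{n}\right)^{3/2}\right)^{s}=\sum_{s=4}^{n_{0}}\left(\frac{e^{5/2}c^{3/2}s^{1/2}}{3^{3/2}n^{1/2}}\right)^{s}=o(1).$$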

∎
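To see the size of this first-moment bound numerically, one can evaluate the final sum $\sum_{s=4}^{n_{0}}\big(e^{5/2}c^{3/2}s^{1/2}/(3^{3/2}n^{1/2})\big)^{s}$ with $n_{0}=n/(10c^{3})$ directly. The sketch below is only an illustration; the function name and the sample values of $n$ and $c$ are choices made here.

```python
import math


def lemma_1_4_bound(n, c):
    """Evaluate sum_{s=4}^{n0} (e^{5/2} c^{3/2} s^{1/2} / (3^{3/2} n^{1/2}))^s."""
    n0 = int(n / (10 * c**3))
    total = 0.0
    for s in range(4, n0 + 1):
        base = math.exp(2.5) * c**1.5 * math.sqrt(s) / (3**1.5 * math.sqrt(n))
        total += base**s  # base < 1 for all s <= n0, so terms may underflow to 0.0
    return total


small_n = lemma_1_4_bound(10**4, 5)
large_n = lemma_1_4_bound(10**6, 5)
```

With $c=5$ the bound drops sharply between $n=10^{4}$ and $n=10^{6}$, with the $s=4$ term dominating for large $n$, in line with the $o(1)$ claim.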

Now consider the construction of $S_{L}$ . Let $A$ be the set of the vertices with degree less than $D=100$ and let $S_{0}^{\prime}=(A\cup N(A))\cap S_{L}\subseteq S_{L}$ . If we start with $S_{0}=S_{0}^{\prime}$ and run the process for constructing $\Gamma_{L}$ then we will produce the same $S_{L}$ as if we had started with $S_{0}=\emptyset$ . This is because, as we have shown, the order of adding vertices does not matter. Now w.h.p. there are at most $n_{D}=\frac{2c^{D}e^{-c}}{D!}n$ vertices of degree at most $D$ in $G_{n,p}$ (see for example Theorem 3.3 of [19]) and so $|S_{0}^{\prime}|\leq Dn_{D}$ .

Now suppose that the process runs for another $k$ rounds. Then $S_{k}$ has at least $kD/2$ edges and at most $Dn_{D}+3k$ vertices. This is because each round adds at most three new vertices to $S_{k}$ and the $k$ vertices that take the role of $v$ have degree at least $D$ and all of their neighbors will be in $S_{k}$ . If $k$ reaches $4n_{D}$ then

$$\frac{e(S_{k})}{|S_{k}|}\geq\frac{4Dn_{D}}{2}\cdot\frac{1}{(D+12)n_{D}}>\frac{3}{2}.$$

So, by Lemma 1.4, we can assert that w.h.p. the process runs for less than $4n_{D}$ rounds and,

$$|V(\Gamma_{L})|\leq(D+4)n_{D}\leq ne^{-c/2}.\tag{8}$$

We note the following properties of $S_{L}$ . Let

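$$V_{1}=C_{2}\setminus S_{L}\quad\text{and}\quad V_{2}=\{v\in S_{L}:v\text{ has at least one neighbor in }V_{1}\}.$$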

Then,

G1

Each vertex $v\in S_{L}\setminus V_{2}$ has no neighbors in $V_{1}$ .

G2

Each $v\in V_{1}\cup V_{2}$ has at least $3$ neighbors in $V_{1}$ .

Given the definition of $V_{2}$ , for $T\in\mathcal{T}$ we can express $\upsilon_{0}(T)$ as

$$\upsilon_{0}(T)=V(T)\setminus V_{2}.$$

We will now show that w.h.p. each component $K$ of $\Gamma_{L}$ satisfies

$$|\upsilon_{0}(K)|\geq\frac{|V(K)|}{3}.\tag{9}$$

We will prove that for $0\leq i\leq L$ and each component $K$ spanned by $S_{i}$ ,

$$|\upsilon_{0,i}(K)|\geq\frac{|V(K)|}{3}.\tag{10}$$

Here $\upsilon_{0,i}(K)$ is taken to be the number of vertices in $V(K)$ with no neighbors in $C_{2}\setminus K$ . Taking $i=L$ in (10) yields (9). We proceed by induction on $i$ .

$S_{0}=\emptyset$ and so for $i=0$ , (10) is satisfied by every component spanned by $S_{0}$ . Suppose that at step $i=\ell$ , (10) is satisfied by every component spanned by $S_{\ell}$ .

At step $\ell+1$ , assume that $v$ invokes either Case a or Case b. In both cases $S_{\ell+1}=S_{\ell}\cup\big(\{v\}\cup N(v)\big).$ The addition of the new vertices into $S_{\ell}$ could merge components $K_{1},K_{2},\ldots,K_{r}$ into one component $K^{\prime}$ while adding at most $3$ vertices. Hence $3+\sum_{j\in[r]}|K_{j}|\geq|K^{\prime}|$ . In addition every vertex that contributed to $\upsilon_{0,\ell}(K_{j})$ , $j=1,2,\ldots,r$ now contributes towards $\upsilon_{0,\ell+1}(K^{\prime})$ . Also $v$ has neighbors outside $S_{\ell}$ but no neighbors outside $S_{\ell+1}$ . The inductive hypothesis implies that $\upsilon_{0,\ell}(K_{j})\geq|K_{j}|/3$ for $j\in[r]$ . Thus,

$$\upsilon_{0,\ell+1}(K^{\prime})\geq 1+\sum_{j\in[r]}\upsilon_{0,\ell}(K_{j})\geq 1+\frac{1}{3}\sum_{j\in[r]}|K_{j}|\geq 1+\frac{|K^{\prime}|-3}{3}=\frac{|K^{\prime}|}{3}.$$

and so (10) continues to hold for all the components spanned by $S_{\ell+1}$ .

We next show that w.h.p., only a small component $K$ can satisfy (9). We consider $K$ in the context of $G_{n,p}$ in which case $K$ will have at least $|V(K)|/3$ vertices with no neighbors outside $K$ . So, the expected number of components of size $k\leq ne^{-c/2}$ that satisfy this condition is at most

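$$\binom{n}{k}k^{k-2}\left(\frac{c}{n}\right)^{k-1}\binom{k}{k/3}(1-p)^{k(n-k)/3}\leq\frac{n}{ck^{2}}\left(2ce^{1-c/6}\right)^{k}=o(n^{-2}),$$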

if $c$ is large and $k\geq\log n$ .

So, we can assume that all components are of size at most $\log n$ . Then the expected number of vertices on components that are not trees is bounded by

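$$\sum_{k=3}^{\log n}\binom{n}{k}k^{k+1}\left(\frac{c}{n}\right)^{k}\binom{k}{k/3}(1-p)^{k(n-k)/3}\leq\sum_{k=3}^{\log n}k\left(2ce^{1-c/6}\right)^{k}=O(1).$$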

Markov’s inequality implies that w.h.p. such components span at most $\log n=o(n)$ vertices.

Notation 2: For $T\in\mathcal{T}$ , let $M_{T}$ be the matching on $V_{2}$ obtained by replacing each path of $\mathcal{Q}(T)$ of length at least 1 by an edge and let $M^{*}=\bigcup_{T\in\mathcal{T}}M_{T}$ . Let $I(T)$ denote the internal vertices of the paths $\mathcal{Q}(T)$ and let $I^{*}=\bigcup_{T\in\mathcal{T}}I(T)$ and $V_{2}^{*}=V_{2}\setminus I^{*}$ . We let $\Gamma^{*}_{1}$ be the subgraph of $G$ induced by $V_{1}$ . We also let $\Gamma^{*}_{2}$ be the bipartite graph with vertex partition $V_{1},V_{2}^{*}$ and all edges $\{e\in E(G):e\in V_{1}\times V_{2}^{*}\}$ . Finally let $\Gamma^{*}=\Gamma_{1}^{*}\cup\Gamma^{*}_{2}\cup M^{*}$ and $V^{*}=V_{1}\cup V_{2}^{*}=V(\Gamma^{*})$ .

2 Proof of Theorem 1.1

The RHS of (1), modulo the $o(n)$ vertices that are spanned by non-tree components in $\Gamma_{L}$ , is clearly an upper bound on the length of the longest cycle in $C_{2}$ : any cycle must omit at least $\phi(T)$ vertices from each $T\in\mathcal{T}$ . On the other hand, as we show, w.h.p. there is a cycle $H$ that spans $V_{1}\cup\bigcup_{T\in\mathcal{T}}V(\mathcal{Q}(T))$ (see Notation 1). The length of $H$ is equal to the RHS of (1). Equivalently, we show that

[TABLE]

2.1 Proof of (5)

We are not able at this time to give a simple estimate of $\sum_{T\in\mathcal{T}}\phi(T)$ as a function of $c$ . We will have to make do with (5). On the other hand, $\sum_{T\in\mathcal{T}}\phi(T)$ can be approximated to within arbitrary accuracy, using the argument in Section 3.

We work in $G_{n,p}$ . Observe that $T$ must have a vertex of degree three in order that $\phi(T)>0$ . The smallest such tree has seven vertices and consists of three paths of length two with a common endpoint. (If $T$ is a star of degree 3 for example, it can be covered by a path of length 2 that covers the central vertex and a path of length 0. Here we are using that every vertex in $V(T)\setminus V_{2}\subset C_{2}$ must have degree at least 2, hence every vertex of $T$ of degree 1 belongs to $V_{2}$ and has neighbors in $C_{2}\setminus V(T)$ .) Therefore, in $G_{n,p}$ ,

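$$\mathbf{E}\left(\sum_{T\in\mathcal{T}}\phi(T)\right)\leq\sum_{k\geq 7}\left(\frac{ne}{k}\right)^{k}k^{k-1}\left(\frac{c}{n}\right)^{k-1}\exp\left\{-c\max\left\{3,k/3\right\}\right\}=O(c^{6}e^{-3c})n,\tag{13}$$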

At the first line we used that every tree that contributes to ${\bf E}\left(\sum_{T\in\mathcal{T}}\phi(T)\right)$ must satisfy $v_{0}(T)>2$ . In addition (9) states that $v_{0}(T)\geq|T|/3$ . We obtain (5) from (13).

2.2 Structure of $\Gamma_{1}^{*}$

Suppose now that $|V_{1}|=N$ and that $V_{1}$ contains $M$ edges. The construction of $\Gamma_{L}$ does not involve the edges inside $V_{1}$ , but we do know that $\Gamma_{1}^{*}$ has minimum degree at least $3$ . The distribution of $\Gamma_{1}^{*}$ will be that of $G_{V_{1},M}$ subject to this degree condition, viz. the random graph $G_{V_{1},M}^{\delta\geq 3}$ which is sampled uniformly from ${\mathcal{G}}_{V_{1},M}^{\delta\geq 3}$ , the set of graphs with vertex set $V_{1}$ , $M$ edges and minimum degree at least $3$ . This is because we can replace $\Gamma_{1}^{*}$ by any graph in ${\mathcal{G}}_{V_{1},M}^{\delta\geq 3}$ without changing $\Gamma_{L}$ . By the same token, we also know that each $v\in V_{2}^{*}$ has at least $3$ random neighbors in $V_{1}$ . We have that

$$N\geq n(1-2e^{-c/2})\quad\text{and}\quad M\in\frac{(1\pm\varepsilon_{1})cN}{2},$$

where $\varepsilon_{1}=c^{-1/3}$ . The bound on $N$ follows from (2) and (8) and the bound on $M$ follows from the fact that in $G_{n,p}$ ,

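$$\Pr\left(\exists S:|S|=N,\ e(S)\notin(1\pm\varepsilon_{1})\binom{N}{2}p\right)\leq 2\binom{n}{N}\exp\left\{-\frac{\varepsilon_{1}^{2}N(N-1)p}{3}\right\}=o(1).$$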

2.3 Partitioning/Coloring $G=G_{n,p}$

We will use the edge coloring argument of Fenner and Frieze [14] to verify (12). In this section we describe how to color edges.

We color most of the edges of $G$ light blue, dark blue or green. We denote the resultant blue and green subgraphs by $\Gamma_{b},\Gamma_{g}$ respectively (an edge is blue if it is either dark or light blue). We later show that the blue graph has expansion properties while the green graph has suitable randomness.

Every vertex $v\in V_{1}$ independently chooses $\min\left\{\deg_{V_{1}}(v),100\right\}$ neighbors in $V_{1}$ and we color the chosen edges light blue. Then we color every edge in $V_{2}^{*}:V_{1}$ light blue. Thereafter we independently color (re-color) every edge of $G$ dark blue with probability $1/2000$ . Finally we color green all the uncolored edges that are contained in $V_{1}$ . (Some of the edges of $G$ will remain uncolored and play no significant role in the proof.)

The above coloring satisfies the following properties:

(C1)

Every vertex in $V_{1}\cup V_{2}^{*}$ is joined to at least $3$ vertices in $V_{1}$ by a blue edge.

(C2)

Every dark blue edge appears independently with probability $\frac{p}{2000}$ .

(C3)

Given the degree sequence ${\bf d}_{g}$ of $\Gamma_{g}$ , every graph $H$ with vertex set $V_{1}$ and degree sequence ${\bf d}_{g}$ is equally likely to be $\Gamma_{g}$ .

We can justify C3 as follows: Amending $G$ by replacing $\Gamma_{g}$ by any other graph $G^{\prime}$ with vertex set $V_{1}$ and the same degree sequence and executing our construction of $S_{L}$ will result in the same set $S_{L}$ and sets $V_{1},V_{2}^{*}$ . So, each possible $G^{\prime}$ has the same set of extensions to $G_{n,p}$ and as such is equally likely.
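The coloring procedure of this section can be sketched as follows. This is an illustrative reading of the description, not the authors' code: the function `color_edges`, its parameters, the integer vertex labels and the small example are all assumptions introduced here, and the sets $V_{1}$ and $V_{2}^{*}$ are taken as given.

```python
import random


def color_edges(adj, V1, V2_star, rng, dark_prob=1 / 2000, cap=100):
    """Return a dict frozenset({u, v}) -> colour for the coloured edges of G.

    adj maps integer vertices to neighbour sets; uncoloured edges are absent.
    """
    color = {}
    # Each v in V1 picks min(deg_{V1}(v), cap) neighbours in V1: light blue.
    for v in sorted(V1):
        nbrs = sorted(adj[v] & V1)
        for u in rng.sample(nbrs, min(len(nbrs), cap)):
            color[frozenset((u, v))] = "light_blue"
    # Every edge between V2* and V1 is light blue.
    for v in sorted(V2_star):
        for u in adj[v] & V1:
            color[frozenset((u, v))] = "light_blue"
    # Every edge of G is independently (re)coloured dark blue.
    for v in sorted(adj):
        for u in sorted(adj[v]):
            if u < v and rng.random() < dark_prob:
                color[frozenset((u, v))] = "dark_blue"
    # Remaining uncoloured edges inside V1 become green.
    for v in sorted(V1):
        for u in adj[v] & V1:
            color.setdefault(frozenset((u, v)), "green")
    return color


# K_5 on V1 = {0,...,4}; one V2* vertex 5 joined to 0, 1, 2.
V1, V2_star = set(range(5)), {5}
adj = {v: {u for u in range(5) if u != v} for v in range(5)}
adj[5] = {0, 1, 2}
for u in (0, 1, 2):
    adj[u].add(5)
all_blue = color_edges(adj, V1, V2_star, random.Random(0), dark_prob=0.0)
capped = color_edges(adj, V1, V2_star, random.Random(0), dark_prob=0.0, cap=1)
```

With the default cap every vertex of this small example selects all of its neighbours, so no green edges remain; lowering the cap to 1 leaves at least half of the edges inside $V_{1}$ green.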

Now given $\Gamma_{b},\Gamma_{g}\subset G$ we color the edges in $\Gamma^{*}$ as follows. Every edge in $\Gamma^{*}$ that exists in $G$ inherits its color from the coloring in $G$ . Every edge in $M^{*}\subseteq E(\Gamma^{*})$ is colored blue. We let $\Gamma_{b}^{*},\Gamma_{g}^{*}$ be the blue and the green subgraphs of $\Gamma^{*}$ . Observe that $\Gamma_{g}^{*}=\Gamma_{g}$ , hence $\Gamma_{g}^{*}$ satisfies property $(C3)$ as well.

2.4 Expansion of $\Gamma_{b}^{*}$

We wish to estimate the probability that small sets have relatively few neighbors in the graph $\Gamma_{b}^{*}$ . For $S\subseteq V^{*}=V_{1}\cup V_{2}^{*}$ we let

$$N_{b}(S)=\{w\in V_{1}\setminus S:\exists v\in S\text{ with }\{v,w\}\in E(\Gamma_{b})\}.$$

We have slightly abused notation here since $N_{b}(S)$ is implicitly defined in both $G$ and $\Gamma^{*}$ .

It is shown in [6] and also in [7] that if $S$ is the set of endpoints created by Pósa rotations (see Section 2.6) then $S\cup N(S)$ is connected and contains at least two distinct cycles, and hence at least $|S|+|N(S)|+1$ edges. This motivates condition (iii) in the following lemma.

Lemma 2.1.

W.h.p. there does not exist $S\subset V^{*}$ of size $|S|\leq n/4$ such that (i) $|N_{b}(S)|\leq 2|S|$ , (ii) $S\cup N_{b}(S)$ is connected in $\Gamma_{b}\subseteq G_{n,p}$ and (iii) $S\cup N_{b}(S)$ spans at least $|S|+|N_{b}(S)|+1$ edges in $\Gamma_{b}\subseteq G_{n,p}$ .

Proof.

Assume that the above fails for some set $S$ .

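**Case 1: $|S|\leq n_{1}=n/(100c^{3})$ .**

Let $s=|S|$ and $t=|N_{b}(S)|$ . We will suppose first that $S$ contains at least $s/10$ vertices of degree at least 100. In this case $S\cup N_{b}(S)$ has cardinality at most $s+t\leq 3s$ and contains at least $5s>3(s+t)/2$ edges, contradicting Lemma 1.4.

On the other hand, if there are at least $9s/10$ vertices in $S$ of degree at most 99 then there are at least $3(s+t)/10$ vertices of degree at most 99 in a connected subgraph of size $s_{0}\leq s+t\leq 3n_{1}$ . In addition that subgraph spans at least $s+t+1$ edges. But the probability of this occurring in $G_{n,p}$ is at most

$$\sum_{k=1}^{3n_{1}}\binom{n}{k}k^{k-2}\binom{\binom{k}{2}}{2}p^{k+1}\binom{k}{3k/10}\left(\sum_{\ell=1}^{99}\binom{n-k}{99}p^{\ell}(1-p)^{n-k-\ell}\right)^{3k/10}\leq\sum_{k=1}^{3n_{1}}\left(\frac{ne}{k}\right)^{k}k^{k+2}\left(\frac{c}{n}\right)^{k+1}2^{k}e^{-3kc/20}\leq\sum_{k=1}^{3n_{1}}\frac{ck^{2}}{n}\cdot\left(2ce^{1-3c/20}\right)^{k}=o(1).$$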

This completes the proof for Case 1.

**Case 2: $n_{1}<|S|\leq n/4$ .**

The particular values for the sets $V_{1},V_{2}^{*}$ condition $G_{n,p}$ . To get round this, we describe a larger event $\mathcal{E}_{S}$ in $G=G_{n,p}$ that (a) occurs as a consequence of there being a set $S$ with small expansion and (b) only occurs with probability $o(1)$ . This event involves an arbitrary choice for $V_{1},V_{2}^{*}$ etc.

Let $T=N_{b}(S)$ and $W=N_{G}(S)\setminus N_{b}(S)$ , that is $T$ and $W$ are the neighborhood of $S$ inside and outside of $V_{1}$ respectively. Then the following event $\mathcal{E}_{S}$ must hold. There exist $S,T,W$ such that, where $s=|S|,t=|T|$ and $w=|W|$ ,

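(i)

$t\leq 2s$ .

(ii)

$w\leq n_{0}=ne^{-c/2}$ , where $n_{0}$ is from (8).

(iii)

No vertex in $S$ is connected to a vertex in $V\setminus(S\cup T\cup W)$ by a dark blue edge.

(iv)

$S\cup N_{b}(S)$ spans at least $s+t$ edges (at least $s+t+1$ in fact).

Thus,

$$\begin{aligned}\Pr(\mathcal{E}_{S}\mid s,t,w)&\leq\binom{n}{s}\binom{n}{t}\binom{n}{w}\binom{\binom{s+t}{2}}{s+t}s^{w}p^{s+t+w}\left(1-\frac{p}{2000}\right)^{s(n-s-t-w)}\\&\leq\left(\frac{en}{s}\right)^{s}\left(\frac{en}{t}\right)^{t}\left(\frac{en}{w}\right)^{w}\left(\frac{e(s+t)}{2}\right)^{s+t}s^{w}\left(\frac{c}{n}\right)^{s+t+w}\exp\left\{-\frac{p}{2000}\left(\frac{sn}{5}\right)\right\}\\&\leq(ec)^{2(s+t)}\left(\frac{s+t}{2s}\right)^{s}\left(\frac{s+t}{2t}\right)^{t}\left(\frac{ecs}{w}\right)^{w}\exp\left\{-\frac{cs}{10^{5}}\right\}\\&\leq(ec)^{6s}\exp\left\{s\cdot\frac{t-s}{2s}\right\}\exp\left\{t\cdot\frac{s-t}{2t}\right\}\left(\frac{ecs}{n_{0}}\right)^{n_{0}}\exp\left\{-\frac{cs}{10^{5}}\right\}\end{aligned}$$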

At the 5th line we used $\frac{s+t}{2s}=1+\frac{t-s}{2s}\leq\exp\left\{\frac{t-s}{2s}\right\}$ and $w\leq n_{0}\leq 100c^{3}e^{-c/2}s\leq e^{-c/3}s$ . Hence

[TABLE]

∎

2.5 The Degrees of the Green Subgraph

Lemma 2.2.

W.h.p. at least $99n/100$ vertices in $V_{1}$ have green degree at least $c/50$ . In addition every set $S\subset V_{1}$ of size at least $n/4$ has total green degree at least $cn/250$ .

Proof.

At most $100n$ edges are colored light blue and thereafter the Chernoff bounds imply that w.h.p. at most $(1+\epsilon)cn/4000$ edges are colored dark blue, for some arbitrarily small positive $\varepsilon$ . The probability that a vertex has degree less than $c/4$ in $G_{n,p}$ is bounded by $\frac{2e^{-c}\lambda^{c/4}}{c/4!}<1/1000$ . Azuma’s inequality or the Chebyshev inequality can be employed to show that w.h.p. there are at most $n/1000$ vertices of degree less than $c/4$ . Therefore every set of $n/100$ vertices is incident with at least $[(n/100-n/1000)c/4]/2$ edges. And hence with at least $[(n/100-n/1000)c/4]/2-(1+\epsilon)cn/4000-100n\geq c/50\cdot n/100$ green edges. Thus in every set of vertices of size at least $n/100$ there exists a vertex that is incident to $c/50$ green edges, proving the first part of our Lemma.

It follows that w.h.p. every set of size $n/4$ has total green degree at least

[TABLE]

∎
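The counting step in the proof of Lemma 2.2 only works once $c$ is quite large, and a short calculation makes the threshold visible. The sketch below evaluates, per unit of $n$, the lower bound on green edges incident with a set of $n/100$ vertices minus the target $c/50\cdot n/100$. The function name and the sample values of $c$ and $\epsilon$ are assumptions made for illustration.

```python
def green_edge_margin(c, eps=0.01):
    """(Lower bound on green edges) - (target), both divided by n.

    Positive values mean the inequality used in the proof of Lemma 2.2 holds.
    """
    incident = ((1 / 100 - 1 / 1000) * c / 4) / 2  # edges incident with the set
    dark_blue = (1 + eps) * c / 4000               # at most this many dark blue edges
    light_blue = 100                               # at most 100n light blue edges
    target = (c / 50) / 100                        # c/50 * n/100
    return incident - dark_blue - light_blue - target


margin_large = green_edge_margin(2e5)
margin_small = green_edge_margin(1000)
```

The margin is negative at $c=1000$ and positive at $c=2\times 10^{5}$, so with these constants "sufficiently large" means $c$ of the order of $10^{5}$.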

2.6 Pósa Rotations

We say that a path/cycle $P$ in $\Gamma^{*}$ is compatible if for every $\left\{v,w\right\}\in M^{*}$ either $P$ contains the edge $\left\{v,w\right\}$ or $V(P)\cap\left\{v,w\right\}=\emptyset$ . Our aim therefore is to show that w.h.p. $\Gamma^{*}$ contains a compatible hamilton cycle. Suppose that $\Gamma^{*}$ and hence $\Gamma_{b}^{*}$ is not Hamiltonian and that $P=(v_{1},v_{2},\ldots,v_{s})$ is a longest compatible path in both $\Gamma^{*}$ and $\Gamma_{b}^{*}$ . If $\left\{v_{s},v_{i}\right\}\in E(\Gamma^{*})$ and $v_{i}\in V_{1}$ then the path $P^{\prime}=(v_{1},v_{2},\ldots,v_{i},v_{s},v_{s-1},\ldots,v_{i+1})$ is said to be obtained from $P$ by an acceptable rotation with $v_{1}$ as the fixed endpoint. We also call $v_{i}$ the pivot vertex and the edges $\{v_{s},v_{i}\},\{v_{i},v_{i+1}\}$ the pivot edges. Observe that since $P$ is compatible and $\left\{v_{i},v_{i+1}\right\}\notin M^{*}$ (since $v_{i}\in V_{1}$ ) then $P^{\prime}$ is also compatible. Let $END_{b}^{*}(P,v_{1})$ be the set of vertices that are endpoints of paths that are obtainable from $P$ by a sequence of acceptable rotations with $v_{1}$ as the fixed endpoint. Then, for $v\in END_{b}^{*}(P,v_{1})$ we let $END_{b}^{*}(P_{v},v)$ be defined similarly. Here $P_{v}$ is a path with endpoints $v_{1},v$ obtainable from $P$ by acceptable rotations.

Arguing as in the proof of Pósa’s lemma we see that $|N_{b}(END^{*}_{b}(P,v_{1}))|\leq 2|END_{b}^{*}(P,v_{1})|$ . Indeed, assume otherwise. Then there exist vertices $v_{i},u$ such that $u\in END_{b}^{*}(P,v_{1})$ , $v_{i}\in N_{b}(u)\subseteq V_{1}$ , $v_{i-1},v_{i+1}\notin END_{b}^{*}(P,v_{1})$ and the edge $\{u,v_{i}\}$ can be used by an acceptable rotation with $v_{1}$ as the fixed endpoint that “rotates out” $u$ . Any such rotation will create a path with either $v_{i-1}$ or $v_{i+1}$ as a new endpoint, say $v_{i-1}$ . Now $v_{i}\in V_{1}$ and so the rotation will be acceptable and hence $v_{i-1}\in END_{b}^{*}(P,v_{1})$ resulting in a contradiction.
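A rotation and the resulting endpoint set can be sketched as below. This is a simplified illustration under stated assumptions: the function names and the toy graph are introduced here, paths are 0-indexed Python sequences, the original endpoint is counted as a member of the endpoint set, and compatibility with $M^{*}$ is handled only through the requirement that the pivot lies in $V_{1}$.

```python
def rotate(path, i):
    """Rotation using the edge {path[-1], path[i]}: keep path[:i+1], reverse the rest."""
    return path[: i + 1] + path[i + 1:][::-1]


def rotation_endpoints(adj, path, V1):
    """Endpoints reachable from `path` by rotations with path[0] fixed and pivots in V1."""
    start = tuple(path)
    seen = {start}
    stack = [start]
    ends = {start[-1]}
    while stack:
        p = stack.pop()
        last = p[-1]
        # Candidate pivots: every path vertex except the endpoint and its predecessor.
        for i in range(len(p) - 2):
            if p[i] in adj[last] and p[i] in V1:
                q = rotate(p, i)
                if q not in seen:
                    seen.add(q)
                    stack.append(q)
                    ends.add(q[-1])
    return ends


# Path 1-2-3-4-5 plus the chord {5, 2}.
adj = {1: {2}, 2: {1, 3, 5}, 3: {2, 4}, 4: {3, 5}, 5: {4, 2}}
rotated = rotate([1, 2, 3, 4, 5], 1)
ends = rotation_endpoints(adj, [1, 2, 3, 4, 5], V1={1, 2, 3, 4, 5})
blocked = rotation_endpoints(adj, [1, 2, 3, 4, 5], V1={1, 3, 4, 5})
```

Using the chord $\{5,2\}$ with pivot 2 turns the path into $(1,2,5,4,3)$, so vertex 3 becomes an endpoint; if vertex 2 is not in $V_{1}$ the rotation is not acceptable and the endpoint set stays $\{5\}$.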

Lemma 2.3.

W.h.p. for every path $P$ of maximal length in $\Gamma_{b}^{*}$ and an endpoint $v$ of $P$ we have that $|END_{b}^{*}(P_{v},v)|\geq n/4$ .

Observe that the underlying graph in Lemma 2.1 is $\Gamma_{b}$ and so we can not apply it directly to obtain Lemma 2.3. In addition $\Gamma_{b}^{*}$ is not a subgraph of $G_{n,p}$ , since the edges in $M^{*}$ that are added correspond to paths in $G_{n,p}$ .

Proof.

We will show that $S=END_{b}^{*}(P_{v},v)$ satisfies (i), (ii) and (iii) of Lemma 2.1. For this let $R=R(P_{v},v)$ be the set of pivot points and $E_{R}=E_{R}(P)$ be the set of pivot edges. It is shown in [6] and also in [7] that if $S$ is the set of endpoints created by Pósa rotations (see Section 2.6) then $E_{R}$ spans a connected subgraph on $S\cup R$ that consists of at least $|S|+|R\setminus S|+1$ edges.

The key observation is that if $v$ is the pivot vertex of an acceptable rotation then, by definition, we have that $v\in V_{1}$ . Consequently $R\subseteq V_{1}$ (i.e. $R\subseteq N_{b}(S)$ ) and every edge in $E_{R}$ belongs to $E(\Gamma_{b})\subseteq E(G_{n,p})$ . This would not necessarily have been true if $R\cap V_{2}^{*}\neq\emptyset$ . Finally, $(N_{b}(S)\setminus R):S$ spans at least $|N_{b}(S)\setminus R|$ edges in $\Gamma_{b}$ . Hence $N_{b}(S)\cup S$ is connected in $\Gamma_{b}$ and spans at least $(|S|+|R\setminus S|+1)+|N_{b}(S)\setminus R|=|S|+|N_{b}(S)|+1$ edges. This verifies conditions (ii) and (iii) of Lemma 2.1. Condition (i) is satisfied by the discussion preceding Lemma 2.3. ∎

From Lemma 2.3 we see that w.h.p. $|END_{b}^{*}(P_{v},v)|\geq n/4$ for all $v\in END_{b}^{*}(P,v_{1})$ . We let

[TABLE]

2.7 Coloring argument

We use a modification of a double counting argument that was first used in [14]. The specific version is from [15]. Given a two edge-colored $\Gamma^{*}$ , we choose for each $v\in V_{1}$ , an incident edge $\xi_{v}=\left\{v,\eta_{v}\right\}$ where $\eta_{v}\in V_{1}\cup V_{2}^{*}$ . We re-color $\xi_{v}$ blue if it is not already colored blue. There are at most $\Pi=\prod_{v\in V_{1}}d(v)$ choices for $\boldsymbol{\xi}=(\xi_{v},v\in V_{1})$ .

For a graph $\Gamma$ , $\Gamma=\Gamma^{*}$ or $\Gamma_{b}^{*}$ , we let $\ell(\Gamma)$ denote the length of the longest compatible path in $\Gamma$ . We indicate that $\Gamma$ has a compatible Hamilton cycle by $\ell(\Gamma)=|V(\Gamma)|$ .

We now let $a(\boldsymbol{\xi},\Gamma^{*}_{g})=1$ if the following hold:

H1

$\Gamma_{b}^{*}$ is not Hamiltonian.

H2

$\ell(\Gamma_{b}^{*})=\ell(\Gamma^{*})$ .

H3

$|N_{b}(S)|\geq 2|S|$ for all $S\subseteq V(\Gamma^{*}),|S|\leq n/4$ .

We observe first that if $\Gamma^{*}$ is not Hamiltonian and H2 holds then there exists $\boldsymbol{\xi}$ such that $a(\boldsymbol{\xi},\Gamma^{*}_{g})=1$ . Indeed, let $P=(v_{1},v_{2},\ldots,v_{r})$ be a longest path in $\Gamma^{*}$ . Then we simply let $\xi_{v_{i}}$ be the edge $\left\{v_{i},v_{i+1}\right\}$ for $1\leq i<r$ . It follows that if $\Phi$ denotes the number of choices for $\Gamma^{*}_{g}$ and $\pi_{\bar{H}}$ is the probability that $\Gamma^{*}$ is not Hamiltonian, then

[TABLE]

where the $o(1)$ term accounts for failure of the high probability events that we have identified so far.

On the other hand, we have as stated in (C3) above, that $\Gamma^{*}_{g}$ is distributed as a random graph chosen uniformly from graphs with degree sequence $D^{*}_{g}$ . Hence

[TABLE]

where $\pi_{b}$ is defined as follows: let $P$ be some longest path in $\Gamma^{*}_{b}$ . Then $\pi_{g}$ is the probability that a random realization of $\Gamma_{g}^{*}$ does not include a pair $\left\{x,y\right\}$ where $y\in END_{b}^{*}(P,x)$ . We will argue below that

[TABLE]

Lemma 2.2 implies that at least $n/4-n/100$ out of the at least $n/4$ vertices in $END_{b}^{*}(P)$ have $d_{\Gamma^{*}_{g}}(v)\geq c/50$ . Also, for such $v$ the set $END_{b}^{*}(P_{v},v)\cup\{v\}$ is of size at least $n/4$ and so has total degree at least $cn/250$ . Thus from (18), it follows that

[TABLE]

The Arithmetic-Geometric-mean inequality implies that

[TABLE]

It then follows that for sufficiently large $c$

[TABLE]

and this completes the proof of (12).

Proof of (17): This is an exercise in the use of the configuration model of Bollobás [5]. Let $W=[2M_{g}]$ where $M_{g}$ is the number of green edges and let $W_{1},W_{2},\ldots,W_{N}$ be a partition of $W$ where $|W_{v}|=d_{\Gamma^{*}_{g}}(v),v\in V_{1}$. The elements of $W$ will be referred to as configuration points or just as points. A configuration $F$ is a partition of $W$ into $M_{g}$ pairs. Next define $\psi:W\to[N]$ by $x\in W_{\psi(x)}$. Given $F$, we let $\gamma(F)$ denote the (multi)graph with vertex set $V_{1}$ and an edge $\left\{\psi(x),\psi(y)\right\}$ for all $\left\{x,y\right\}\in F$. We say that $\gamma(F)$ is simple if it has no loops or multiple edges. Suppose that we choose $F$ at random. The properties of $F$ that we need are

P1. If $G_{1},G_{2}\in{\mathcal{G}}_{{\bf d}_{g}}$ then $\operatorname{\bf Pr}(\gamma(F)=G_{1}\mid\gamma(F)\text{ is simple})=\operatorname{\bf Pr}(\gamma(F)=G_{2}\mid\gamma(F)\text{ is simple})$.

P2. $\operatorname{\bf Pr}(\gamma(F)\text{ is simple})=\Omega(1)$.

These are well established properties of the configuration model, see for example Chapter 11 of [19]. Note that P2 uses the fact that w.h.p. $G_{V_{1},M}^{\delta\geq 3}$ (and hence $\Gamma_{g}^{*}$ ) has an exponential tail, as shown for example in [17]. Given all this, in the context of the configuration model, (17) is a simple consequence of a random pairing of $W$ . The $O(1)$ factor is $1/\operatorname{\bf Pr}(\gamma(F)\text{ is simple})$ and bounds the effect of the conditioning. We take the square root to account for the possibility that $w\in END_{b}^{*}(P_{v},v)$ and $v\in END_{b}^{*}(P_{w},w)$ .
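The random pairing behind P1 and P2 can be illustrated with a minimal Python sketch. It assumes a small, fixed degree sequence rather than the degree sequence of $\Gamma_{g}^{*}$, and the function names (`random_configuration`, `is_simple`, `estimate_simple_probability`) are illustrative and not taken from the paper.

```python
import random
from collections import Counter


def random_configuration(degrees, rng):
    """Pair up the configuration points uniformly at random.

    degrees[v] is the size of the cell W_v. The returned list holds the
    edges {psi(x), psi(y)} of the multigraph gamma(F).
    """
    points = [v for v, d in enumerate(degrees) for _ in range(d)]
    if len(points) % 2:
        raise ValueError("the degree sum must be even")
    # Reading a uniform shuffle off in consecutive pairs gives a uniform pairing.
    rng.shuffle(points)
    return [(points[i], points[i + 1]) for i in range(0, len(points), 2)]


def is_simple(edges):
    """True if gamma(F) has no loops and no multiple edges."""
    seen = Counter(frozenset(e) for e in edges)
    no_loops = all(len(e) == 2 for e in seen)
    no_repeats = all(count == 1 for count in seen.values())
    return no_loops and no_repeats


def estimate_simple_probability(degrees, trials, seed=0):
    """Monte Carlo estimate of Pr(gamma(F) is simple) for a fixed degree sequence."""
    rng = random.Random(seed)
    hits = sum(is_simple(random_configuration(degrees, rng)) for _ in range(trials))
    return hits / trials
```

For a bounded degree sequence such as `[3] * 8`, the estimate returned by `estimate_simple_probability` stays bounded away from zero, which is the behaviour that P2 asserts for the degree sequences arising in the proof.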

3 Proof of Theorem 1.3

For $v\in C_{2}$ we let $\phi(v)=\phi(T)/|\upsilon_{0}(T)|$ if $v\in\upsilon_{0}(T)$ for some $T\in\mathcal{T}$ and $\phi(v)=0$ otherwise. (Recall that $\upsilon_{0}(T)=V(T)\setminus V_{2}$ .) Thus

[TABLE]

Hence (1) can be rewritten as,

[TABLE]

Let $k_{1}=k_{1}(\epsilon,c)$ be the smallest positive integer such that

[TABLE]

Note that for large $c$ , we have

[TABLE]

For $v\in C_{2}$ let $G_{v}$ be the graph consisting of (i) the vertices of $G$ that are within distance $k_{1}$ from $v$ and (ii) a copy of $K_{3,3}$ where every vertex in the $k_{1}$ neighborhood of $v$ is adjacent to each vertex of the same one part of the bipartition. We consider the algorithm for the construction of $\Gamma_{L}$ on $G_{v}$ and let $C_{2,v},\Gamma_{v},V_{1,v},V_{2,v},S_{L,v},\upsilon_{0,v}(T)$ be the corresponding sets/quantities.

For a tree $T\in S_{L,v}$ let $f(T)$ be equal to $|T|$ minus the maximum number of vertices that can be covered by a set of vertex disjoint paths with endpoints in $V_{2,v}$ (we allow paths of length 0). For $v\in C_{2}$, if $v$ belongs to some tree $T\in S_{L,v}$ set $f(v)=f(T)/|\upsilon_{0,v}(T)|$, otherwise set $f(v)=0$.
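To make the definition of $f(T)$ concrete, here is a brute-force Python sketch for very small trees. It is not the procedure used in the paper: it simply tries every subset of tree edges, keeps those whose components are paths, and counts the components whose endpoints all lie in the given set playing the role of $V_{2,v}$. The name `uncovered_count` and the input encoding are assumptions made for illustration.

```python
from itertools import combinations


def uncovered_count(tree_edges, vertices, v2):
    """Brute-force f(T): |T| minus the maximum number of vertices covered by
    vertex-disjoint paths whose endpoints all lie in v2.

    A single vertex of v2 counts as a path of length 0. The running time is
    exponential in the number of edges, so this is only for tiny trees.
    """
    vertices = list(vertices)
    best = 0
    for r in range(len(tree_edges) + 1):
        for chosen in combinations(tree_edges, r):
            deg = {v: 0 for v in vertices}
            parent = {v: v for v in vertices}

            def find(v):
                while parent[v] != v:
                    v = parent[v]
                return v

            for a, b in chosen:
                deg[a] += 1
                deg[b] += 1
                parent[find(a)] = find(b)
            if max(deg.values()) > 2:
                continue  # some component of the chosen forest is not a path
            components = {}
            for v in vertices:
                components.setdefault(find(v), []).append(v)
            covered = 0
            for comp in components.values():
                ends = [v for v in comp if deg[v] <= 1]
                if all(v in v2 for v in ends):
                    covered += len(comp)
            best = max(best, covered)
    return len(vertices) - best
```

For the path $a$, $b$, $c$ with both end vertices in the second argument set, the whole path is covered and the function returns 0. For a star whose centre is the only vertex in that set, only the centre is covered and the function returns the number of leaves.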

For $v\in C_{2}$ let $t(v)=1$ if $v\in V_{1}$ or if $v\in S_{L}$ and in $\Gamma_{L}$ , $v$ lies in a component with at most $k_{1}-2$ vertices that are not connected to $V_{1}$ in $G$ . Set $t(v)=0$ otherwise. Observe that if $t(v)=1$ then $\phi(v)=f(v)$ . Otherwise $|\phi(v)-f(v)|\leq 1$ .

By repeating the arguments used to prove (1.1) and (9) it follows that if $t(v)=0$ then $v$ lies on a component $C$ of size at most $\log n$ . In addition at least $|V(C)|/3$ vertices in $V(C)$ are not adjacent to any vertex outside $V(C)$ . Thus the expected number of vertices $v$ satisfying $t(v)=0$ is bounded by

[TABLE]

A vertex $v\in[n]$ is good if the $i$th level of its BFS neighborhood has size at most $3c^{i}k_{1}/\varepsilon$ for every $i\leq k_{1}$ and it is bad otherwise. Because the expected size of the $i$th neighborhood is $\approx c^{i}$ we have by the Markov inequality that $v$ is bad with probability at most $\approx\varepsilon/3k_{1}$ and so the expected number of bad vertices is bounded by $\varepsilon n/2$. Thus

[TABLE]
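The notion of a good vertex can be checked directly on a sampled graph. The following Python sketch samples $G_{n,p}$ with $p=c/n$ by a simple quadratic-time loop, computes BFS level sizes, and applies the threshold $3c^{i}k_{1}/\varepsilon$ at each level $i\leq k_{1}$. The helper names (`sparse_gnp`, `bfs_level_sizes`, `is_good`) and the parameter values in the usage note are illustrative assumptions.

```python
import random
from collections import deque


def sparse_gnp(n, c, rng):
    """Adjacency lists of G_{n,p} with p = c/n (simple quadratic-time sampler)."""
    p = c / n
    adj = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    return adj


def bfs_level_sizes(adj, root, depth):
    """Sizes of the BFS levels 1, ..., depth around root."""
    dist = {root: 0}
    sizes = [0] * (depth + 1)
    queue = deque([root])
    while queue:
        u = queue.popleft()
        if dist[u] == depth:
            continue  # do not explore beyond the requested depth
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sizes[dist[w]] += 1
                queue.append(w)
    return sizes[1:]


def is_good(adj, v, c, k1, eps):
    """True if level i has size at most 3 * c**i * k1 / eps for every i <= k1."""
    sizes = bfs_level_sizes(adj, v, k1)
    return all(s <= 3 * c**i * k1 / eps for i, s in enumerate(sizes, start=1))
```

With, say, `adj = sparse_gnp(400, 3.0, random.Random(1))`, counting the vertices for which `is_good(adj, v, 3.0, 2, 0.5)` fails gives an empirical count of bad vertices to compare with the bound $\varepsilon n/2$.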

Let $\mathcal{H}_{\varepsilon}$ be the set of BFS neighborhoods that are good, i.e. whose $i$th levels are of size at most $3c^{i}k_{1}/\varepsilon$ for every $i\leq k_{1}$. Every element of $\mathcal{H}_{\varepsilon}$ corresponds to a pair $(H,o_{H})$ where $H$ is a graph and $o_{H}$ is a distinguished vertex of $H$ that is considered to be the root. Also for $v\in C_{2}$ let $G(N_{k_{1}}(v))$ be the subgraph induced by the $k_{1}$th neighborhood of $v$. For $(H,o_{H})\in\mathcal{H}_{\varepsilon}$ let $int(H)$ be the set of vertices in the first $k_{1}-1$ neighborhoods of $o_{H}$ and let $Aut(H,o_{H})$ be the number of automorphisms of $H$ that fix $o_{H}$. Note that each good vertex $v$ is associated with a pair $(H,o_{H})\in{\mathcal{H}}_{\varepsilon}$ from which we can compute $f(v)$, since $f(v)=f(o_{H})$. Thus, if now $M=|E(C_{2})|,N=|C_{2}|$,

[TABLE]

where $\rho_{H,o_{H}}$ is the probability that $(G(N_{k_{1}}(v)),v)=(H,o_{H})$ in $C_{2}$. We show in Section 3.1 that

[TABLE]

where $f_{k}$ is defined in (25) below and $\lambda$ satisfies (26) below.

Finally observe that with the exception of the $o(1)$ term, all the terms in (21) are independent of $n$ . We let

[TABLE]

Then for a fixed $c$, we see that $f_{\varepsilon}(c)$ is monotone increasing as $\varepsilon\to 0$. This is simply because ${\mathcal{H}}_{\varepsilon}$ grows. Furthermore, $f_{\varepsilon}(c)\leq 1$ and so the limit $f(c)=\lim_{\varepsilon\to 0}f_{\varepsilon}(c)$ exists. This verifies part (a) of Theorem 1.3. For part (b), we prove the following (see (36)).

Lemma 3.1.

[TABLE]

Proof.

To prove this we show that if $\nu(H)$ is the number of copies of $H$ in $C_{2}$ then $H\in{\mathcal{H}}_{\varepsilon}$ implies that

[TABLE]

The inequality follows from a version of Azuma’s inequality (see (36)), and the lemma follows from taking a union bound over

[TABLE]

graphs $H$ . Note also that the $o(n)$ term in (21) is bounded by the same $e^{O((1/\varepsilon)^{2+2\log c/c})}$ term times the number of cycles of length at most $2k_{1}$ in $G$ . The probability that this exceeds $n^{1/2}$ is certainly at most the RHS of (24). We will give details of our use of the Azuma inequality in Section 3.1. ∎

Part (b) of Theorem 1.3 follows by letting $\varepsilon\to 0$ and from the Borel-Cantelli lemma.

3.1 A Model of $C_{2}$

It is known that, given $M,N$ and up to relabeling vertices, $C_{2}$ is distributed as $G_{N,M}^{\delta\geq 2}$. The random graph $G_{N,M}^{\delta\geq 2}$ is chosen uniformly from ${\mathcal{G}}_{N,M}^{\delta\geq 2}$, which is the set of graphs with vertex set $[N]$, $M$ edges and minimum degree at least two.

3.1.1 Random Sequence Model

We must now take some time to explain the model we use for $G_{N,M}^{\delta\geq 2}$. We use a variation on the pseudo-graph model of Bollobás and Frieze [9] and Chvátal [10]. Given a sequence ${\bf x}=(x_{1},x_{2},\ldots,x_{2M})\in[N]^{2M}$ of $2M$ integers between 1 and $N$ we can define a (multi)-graph $G_{{\bf x}}=G_{\bf x}(N,M)$ with vertex set $[N]$ and edge set $\{(x_{2i-1},x_{2i}):1\leq i\leq M\}$. The degree $d_{\bf x}(v)$ of $v\in[N]$ is given by

[TABLE]

If ${\bf x}$ is chosen randomly from $[N]^{2M}$ then $G_{{\bf x}}$ is close in distribution to $G_{N,M}$ . Indeed, conditional on being simple, $G_{{\bf x}}$ is distributed as $G_{N,M}$ . To see this, note that if $G_{{\bf x}}$ is simple then it has vertex set $[N]$ and $M$ edges. Also, there are $M!2^{M}$ distinct equally likely values of ${\bf x}$ which yield the same graph.
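The count of $M!2^{M}$ sequences per simple graph can be verified by exhaustive enumeration for tiny parameters. The Python sketch below builds $G_{\bf x}$ from a sequence and tallies, for each simple graph, how many sequences produce it. The function names are illustrative, and `degrees_from_sequence` assumes the natural reading of the degree formula, namely that $d_{\bf x}(v)$ is the number of positions $j$ with $x_{j}=v$.

```python
from collections import Counter
from itertools import product
from math import factorial


def multigraph_from_sequence(x):
    """Edges of G_x: consecutive entries (x_{2i-1}, x_{2i}) form the i-th edge."""
    return [(x[i], x[i + 1]) for i in range(0, len(x), 2)]


def degrees_from_sequence(x, n_vertices):
    """Assumed degree rule: d_x(v) is the number of positions j with x_j = v."""
    counts = Counter(x)
    return [counts[v] for v in range(1, n_vertices + 1)]


def simple_graph_key(x):
    """The edge set of G_x if it is simple, otherwise None."""
    edges = [frozenset(e) for e in multigraph_from_sequence(x)]
    has_loop = any(len(e) == 1 for e in edges)
    has_repeat = len(set(edges)) < len(edges)
    return None if has_loop or has_repeat else frozenset(edges)


def sequences_per_simple_graph(n_vertices, m_edges):
    """Enumerate [N]^{2M} and count the sequences yielding each simple graph."""
    tally = Counter()
    for x in product(range(1, n_vertices + 1), repeat=2 * m_edges):
        key = simple_graph_key(x)
        if key is not None:
            tally[key] += 1
    return tally
```

For $N=4$ and $M=2$ every simple graph receives exactly `factorial(2) * 2**2` sequences, since the $M$ edges can be ordered in $M!$ ways and each edge written in 2 orientations.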

Our situation is complicated by there being a lower bound of 2 on the minimum degree. So we let

[TABLE]

Let $G_{\bf x}$ be the multi-graph $G_{\bf x}$ for ${\bf x}$ chosen uniformly from $[N]^{2M}_{\delta\geq 2}$ . It is clear then that conditional on being simple, $G_{\bf x}$ has the same distribution as $G_{N,M}^{\delta\geq 2}$ . It is important therefore to estimate the probability that this graph is simple. For this and other reasons, we need to have an understanding of the degree sequence $d_{\bf x}$ when ${\bf x}$ is drawn uniformly from $[N]^{2M}_{\delta\geq 2}$ . Let

[TABLE]

for $k\geq 0$ .

Lemma 3.2.

Let ${\bf x}$ be chosen randomly from $[N]^{2M}_{\delta\geq 2}$ . Let $Z_{j},j=1,2,\ldots,N$ be independent copies of a truncated Poisson random variable $\mathcal{P}$ , where

[TABLE]

Here ${\lambda}$ satisfies

[TABLE]

Then $\{d_{\bf x}(j)\}_{j\in[N]}$ is distributed as $\{Z_{j}\}_{j\in[N]}$ conditional on $Z=\sum_{j\in[N]}Z_{j}=2M$.

Proof.

This can be derived as in Lemma 4 of [2]. ∎
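Lemma 3.2 suggests a direct way to simulate the degree sequence for small $N$. The Python sketch below rests on assumptions, because the displayed formulas are not reproduced here: it takes $f_{k}(\lambda)=\sum_{i\geq k}\lambda^{i}/i!$, takes $\mathcal{P}$ to have probabilities $\lambda^{t}/(t!f_{2}(\lambda))$ for $t\geq 2$ (consistent with the product appearing in the explanation of (34)), and chooses $\lambda$ so that the mean $\lambda f_{1}(\lambda)/f_{2}(\lambda)$ equals $2M/N$. All function names are illustrative.

```python
import math
import random


def f(k, lam, terms=200):
    """Assumed form f_k(lam) = sum_{i >= k} lam^i / i!, summed as a series."""
    term = lam**k / math.factorial(k)
    total = 0.0
    for i in range(k, k + terms):
        total += term
        term *= lam / (i + 1)
    return total


def truncated_mean(lam):
    """Mean of a Poisson(lam) variable conditioned on being at least 2."""
    return lam * f(1, lam) / f(2, lam)


def solve_lambda(avg_degree, tol=1e-12):
    """Bisection for lam with truncated_mean(lam) = avg_degree, avg_degree > 2."""
    if avg_degree <= 2:
        raise ValueError("the average degree must exceed 2")
    lo, hi = 0.0, avg_degree  # truncated_mean(lam) > lam, so the root is below avg_degree
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if truncated_mean(mid) < avg_degree:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def sample_truncated_poisson(lam, rng, norm=None):
    """Inverse-CDF sample with Pr(t) = lam^t / (t! f_2(lam)) for t >= 2."""
    norm = f(2, lam) if norm is None else norm
    u = rng.random() * norm
    t, term = 2, lam**2 / 2
    while u > term and term > 0:
        u -= term
        t += 1
        term *= lam / t
    return t


def sample_degree_sequence(n_vertices, m_edges, rng):
    """Rejection sampling: independent copies conditioned on the sum being 2M."""
    lam = solve_lambda(2 * m_edges / n_vertices)
    norm = f(2, lam)
    while True:
        d = [sample_truncated_poisson(lam, rng, norm) for _ in range(n_vertices)]
        if sum(d) == 2 * m_edges:
            return d
```

The rejection loop in `sample_degree_sequence` is only practical for small $N$, since the chance that the sum equals $2M$ decays like $N^{-1/2}$, in line with the local limit behaviour discussed after the lemma.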

It follows from (14) and (26) and the fact that $f_{1}({\lambda})/f_{2}({\lambda})\to 1$ as $c\to\infty$ that for large $c$ ,

[TABLE]

We note that the variance $\sigma^{2}$ of $\mathcal{P}$ is given by

[TABLE]

Furthermore,

[TABLE]

This is an example of a local central limit theorem. See for example, (5) of [2] or (3) of [17]. It follows by repeated application of (28) and (29) that if $k=O(1)$ and $d_{1}^{2}+\cdots+d_{k}^{2}=o(N)$ then

[TABLE]

Let $\nu_{\bf x}(s)$ denote the number of vertices of degree $s$ in $G_{\bf x}$ .

Lemma 3.3.

Suppose that $\log N=O((N{\lambda})^{1/2})$ . Let ${\bf x}$ be chosen randomly from $[N]^{2M}_{\delta\geq 2}$ . Then as in equation (7) of [2], we have that with probability $1-o(N^{-10})$ ,

[TABLE]

We can now show that $G_{\bf x}$, ${\bf x}\in[N]^{2M}_{\delta\geq 2}$, is a good model for $G_{N,M}^{\delta\geq 2}$. For this we only need to show that

[TABLE]

Again, this follows as in [2].

Given a tree $H$ with $k$ vertices of degrees $z_{1},z_{2},...,z_{k}$ and a fixed vertex $v$ we see that if $\rho_{H}$ is the probability that $G(N_{k_{1}}(v))=H$ in $G_{{\bf x}}$ then we have

[TABLE]

Explanation for (34): We use (30) to obtain the probability that the degrees of $[k]$ are $d_{1},\ldots,d_{k}$ . This explains the product $\prod_{i=1}^{k}\frac{\lambda^{d_{i}}}{d_{i}!f_{2}(\lambda)}$ . Implicit here is that $d_{i}=O(\log n)$ , from (32). The contribution to the degree sum $D$ for $D\geq 2k\log n$ can therefore be shown to be negligible. We use the fact that $k$ is small to argue that w.h.p. $H$ is induced. We choose the vertices, other than $v$ in $\binom{N}{k-1}$ ways and then $\frac{(k-1)!}{Aut(H,o_{H})}$ counts the number of copies of $H$ in $K_{k}$ . We then choose the place in the sequence to put these edges in $\binom{M}{k-1}2^{k-1}(k-1)!$ ways. Finally note that the probability the $z_{i}$ occurrences of the $i$ th vertex are as claimed is asymptotically equal to $\frac{d_{i}(d_{i}-1)\cdots(d_{i}-z_{i}+1)}{(2M)^{z_{i}}}$ and this explains the factor $\prod_{i=1}^{k}\frac{d_{i}!}{(d_{i}-z_{i})!}\frac{1}{(2M)^{2k-2}}$ .

Explanation for (35): We use the identity

[TABLE]

It only remains to verify (24). It follows from the above that ${\bf E}(\nu(H)\mid M,N)=\Omega(N)$. We first condition on a degree sequence ${\bf x}$ satisfying (31). We then work in the associated configuration model. We can generate a configuration $F$ as a permutation of the multi-set $\left\{d_{i}\times i:i\in[N]\right\}$. Interchanging two elements in a permutation can only change $\nu(H)$ by $O(1)$. We can therefore apply Azuma’s inequality to show that

[TABLE]

(Specifically we can use Lemma 11 of Frieze and Pittel [21] or Section 3.2 of McDiarmid [23].) This verifies (24).

4 Summary and open problems

We have derived an expression for the length of the longest path in $G_{n,p}$ that holds for large $c$ w.h.p. It would be interesting to have a more algebraic expression. Also, we could no doubt make this proof algorithmic, by using the arguments of Frieze and Haber [18]. It would be more interesting to do the analysis for small $c>1$ . Applying the coupling of McDiarmid [22] we see that the random digraph $D_{n,p},p=c/n$ contains a path at least as long as that given by the R.H.S. of (6). It should be possible to improve this, just as Krivelevich, Lubetzky and Sudakov [20] did for the earlier result of [16].

Bibliography

[1] M. Ajtai, J. Komlós and E. Szemerédi, The longest path in a random graph, Combinatorica 1 (1981) 1-12.
[2] J. Aronson, A.M. Frieze and B.G. Pittel, Maximum matchings in sparse random graphs: Karp-Sipser re-visited, Random Structures and Algorithms 12 (1998) 111-178.
[3] M. Bayati, D. Gamarnik and P. Tetali, Combinatorial approach to the interpolation method and scaling limits in sparse random graphs, The Annals of Probability 41 (2013) 4080-4115.
[4] B. Bollobás, Long paths in sparse random graphs, Combinatorica 2 (1982) 223-228.
[5] B. Bollobás, A probabilistic proof of an asymptotic formula for the number of labeled regular graphs, European Journal of Combinatorics 1 (1980) 311-316.
[6] B. Bollobás, C. Cooper, T.I. Fenner and A.M. Frieze, On Hamilton cycles in sparse random graphs with minimum degree at least $k$, Journal of Graph Theory 34 (2000) 42-59.
[7] A.M. Frieze and B. Pittel, On a sparse random graph with minimum degree three: Likely Pósa's sets are large, Journal of Combinatorics 4 (2013) 123-156.
[8] B. Bollobás, T.I. Fenner and A.M. Frieze, Long cycles in sparse random graphs, Graph theory and combinatorics, Proceedings of Cambridge Combinatorial Conference in honour of Paul Erdős (1984) 59-64.
