A scaling limit for the length of the longest cycle in a sparse random graph
Michael Anastos, Alan Frieze

TL;DR
This paper investigates the asymptotic behavior of the longest cycle in sparse random graphs, establishing a limiting function for its normalized length as the graph size grows, especially for large average degrees.
Contribution
It introduces a new limiting function for the longest cycle length in sparse random graphs and provides explicit formulas for initial polynomial coefficients.
Findings
For sufficiently large average degree c, the length of the longest cycle divided by n converges almost surely to a limit f(c).
The same asymptotic applies to the longest path in the graph.
Michael Anastos: Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany, email: [email protected]
Alan Frieze: Department of Mathematical Sciences, Carnegie Mellon University, Pittsburgh PA, U.S.A., email: [email protected]; the author is supported in part by NSF Grant DMS1363136
Abstract
We discuss the length $L_{c,n}$ of the longest cycle in a sparse random graph $G_{n,p}$, $p=c/n$, $c$ constant. We show that for large $c$ there exists a function $f(c)$ such that $L_{c,n}/n\to f(c)$ a.s. The function $f(c)=1-\sum_{k=1}^{\infty}p_k(c)e^{-kc}$ where $p_k$ is a polynomial in $c$. We are only able to explicitly give the values $p_1,p_2$, although we could in principle compute any $p_k$. We see immediately that the length of the longest path is also asymptotic to $f(c)n$ w.h.p.
1 Introduction
There are several basic questions that can be asked in the context of a class of graphs. E.g. what is the chromatic number? Is the graph Hamiltonian? Another such basic question is the following: how long is the longest cycle? In this paper we study this question in relation to the sparse random graph for a constant . Thus, let denote the length of the longest cycle in the random graph . Erdős [11] conjectured that if then w.h.p. where is independent of . This was proved by Ajtai, Komlós and Szemerédi [1] and in a slightly weaker form by de la Vega [25] who proved that if then . See also Suen [24]. Although this answered Erdős’s question it only gives us a lower bound for the length of the longest cycle. Bollobás [4] realised that for large one could find a large path/cycle w.h.p. by concentrating on a large subgraph with large minimum degree and demonstrating Hamiltonicity. In this way he showed that . This was then improved by Bollobás, Fenner and Frieze [8] to and then by Frieze [16] to where as . This last result is optimal up to the value of , as there are w.h.p. vertices of degree 0 or 1.
The basic open question to this point is as to whether or not there exists a function such that w.h.p. the where as . And what is . In this paper we establish the existence of for large and give a method of computing it to arbitrary accuracy. We note that this is one case of a fundamental extremal random variable where the existence of a scaling limit has not previously been shown and which does not appear to be susceptible to the interpolation method as in Bayati, Gamarnik and Tetali [3].
Let and let . We will assume throughout that is sufficiently large. Let denote the 2-core of . By this we mean that part of the giant component consisting of vertices that are in at least one cycle. The longest cycle in is contained in and the length of the longest path in differs from this by w.h.p. This will be for two reasons. The first reason is that we will establish a Hamiltonian subgraph of that contains the longest path in and the second reason for this is that w.h.p. the giant component of consists of plus a forest of trees with maximum diameter .
As in the papers [4], [8] and [16], we consider a process that builds a large Hamiltonian subgraph. We construct a sequence of sets and their induced subgraphs . Suppose now that we have constructed , . We construct from via one of two cases:
Construction of :
Case a: If there is that has exactly one or two neighbors in , then we add to to make .
Case b: If there is a vertex that has at most two neighbors in then we define to be plus plus the neighbors of in .
is the set we end up with when there are no more vertices to add. We note that is well-defined and does not depend on the order of adding vertices. Indeed, suppose we have two distinct outcomes and . Assume without loss of generality that there exists which is the smallest index such that . Then, . If was added in Case a as the neighbor of then and has at most two neighbors in . This contradicts the fact that . Suppose then that is added in Case b. If then it has at most two neighbors in and hence it has at most two neighbors in . This contradicts the fact that . If is the neighbor of then we get the same contradiction. It follows that and vice versa, by the same reasoning.
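The two cases can be sketched in code. The following is a minimal illustration, assuming (the symbols are not legible in this copy) that "neighbors" in both cases means neighbors outside the current set, and that the update is always "add the vertex together with its outside neighbors". The function name `peel`, the `adj` dictionary format and the toy graph are illustrative choices, not taken from the paper.

```python
def peel(adj, s0=(), order=None):
    """Grow S by Case a / Case b until neither applies; return the final set.

    adj maps each vertex to the set of its neighbors (assumed symmetric).
    """
    s = set(s0)
    order = list(adj) if order is None else list(order)
    changed = True
    while changed:
        changed = False
        for v in order:
            outside = adj[v] - s              # neighbors of v not yet in S
            if v in s and 1 <= len(outside) <= 2:
                s |= outside                  # Case a: absorb the outside neighbors
                changed = True
            elif v not in s and len(outside) <= 2:
                s |= outside | {v}            # Case b: absorb v and its outside neighbors
                changed = True
    return s


def example_graph():
    """K6 on {0,...,5} plus a vertex 6 of degree 2 joined to 0 and 1."""
    adj = {v: {u for u in range(6) if u != v} for v in range(6)}
    adj[6] = {0, 1}
    adj[0].add(6)
    adj[1].add(6)
    return adj


adj = example_graph()
forward = peel(adj, order=sorted(adj))
backward = peel(adj, order=sorted(adj, reverse=True))
print(forward, backward)
```

On this toy graph both scan orders stop at the same set, in line with the order-independence argument above, and every vertex left outside the set keeps at least three neighbors outside it.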
We will argue below in Section 1.1 that w.h.p. the graph induced by is a forest plus a few small components. Each tree in will w.h.p. have at most vertices. For a tree component let
[TABLE]
Notation 1: Let denote the set of trees in . For a tree let be the set of vertex disjoint path packings of where we allow only paths whose start and end vertices have neighbors in . Here we allow paths of length 0, so that a single vertex with neighbors in counts as a path. For let be the number of vertices in that are not covered by . Let and denote a set of paths that leaves vertices of uncovered i.e. satisfies .
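For small trees the quantity in Notation 1 can be found by exhaustive search. The sketch below is a brute-force illustration only, assuming the input really is a tree (so any edge subset of maximum degree at most 2 is a union of vertex disjoint paths) and that `attach` lists the tree vertices that have neighbors outside the tree. The name `min_uncovered` and the two test trees are illustrative.

```python
from itertools import combinations


def min_uncovered(vertices, edges, attach):
    """Fewest vertices missed by vertex-disjoint paths with all endpoints in attach.

    A single vertex of attach counts as a path of length 0.
    """
    best = len(vertices)
    for r in range(len(edges) + 1):
        for chosen in combinations(edges, r):
            deg = {v: 0 for v in vertices}
            for u, w in chosen:
                deg[u] += 1
                deg[w] += 1
            if any(d > 2 for d in deg.values()):
                continue                      # not a union of paths
            if any(d == 1 and v not in attach for v, d in deg.items()):
                continue                      # some path ends at a forbidden vertex
            uncovered = sum(1 for v, d in deg.items() if d == 0 and v not in attach)
            best = min(best, uncovered)
    return best


# Star with three leaves: centre 0, leaves 1, 2, 3 (the leaves are attachment vertices).
star = min_uncovered(range(4), [(0, 1), (0, 2), (0, 3)], {1, 2, 3})

# Three paths of length two sharing an endpoint: centre 0, middles 1-3, leaves 4-6.
spider_edges = [(0, 1), (0, 2), (0, 3), (1, 4), (2, 5), (3, 6)]
spider = min_uncovered(range(7), spider_edges, {4, 5, 6})
print(star, spider)
```

The search is exponential in the number of edges, so it is only meant for checking small cases such as the seven-vertex tree discussed in Section 2.1.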
If then we write if as .
We will prove
Theorem 1.1.
Let where is a sufficiently large constant. Then w.h.p.
[TABLE]
The size of is well-known. Let be the unique solution of in . Then w.h.p. (see e.g. [19], Lemma 2.16),
[TABLE]
Equation (4.5) of Erdős and Rényi [12] tells us that
[TABLE]
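As a numerical sanity check, the sketch below compares two ways of computing the quantity just introduced. It assumes, since the displayed equations are not legible in this copy, that the defining equation is $xe^{-x}=ce^{-c}$ with $x\in(0,1)$ and that the Erdős–Rényi expansion is the classical series $x=\sum_{k\ge 1}\frac{k^{k-1}}{k!}(ce^{-c})^k$. The function names are illustrative.

```python
import math


def conjugate_root(c, tol=1e-14):
    """Bisection for the root x in (0, 1) of x*exp(-x) = c*exp(-c), for c > 1."""
    target = c * math.exp(-c)
    lo, hi = 0.0, 1.0                 # t*exp(-t) is increasing on [0, 1]
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid * math.exp(-mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def tree_series(c, terms=80):
    """Partial sum of sum_{k>=1} k^(k-1)/k! * (c*exp(-c))^k, computed via logarithms."""
    log_z = math.log(c) - c
    return sum(
        math.exp((k - 1) * math.log(k) - math.lgamma(k + 1) + k * log_z)
        for k in range(1, terms + 1)
    )


for c in (2.0, 3.0, 5.0):
    print(c, conjugate_root(c), tree_series(c))
```

Under these assumptions the two columns should agree to many digits; the series converges geometrically because $ce^{-c}<1/e$ for $c>1$.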
We will argue below that w.h.p., as grows,
[TABLE]
We therefore have the following improvement to the estimate in [16].
Corollary 1.2.
W.h.p., as grows,
[TABLE]
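The leading terms of such an estimate can be checked numerically. The sketch below assumes that the 2-core estimate (2) has the standard form $(1-x)(1-x/c)\,n$ with $xe^{-x}=ce^{-c}$, and that the leading terms in (6) are $1-(c+1)e^{-c}-c^{2}e^{-2c}$; both forms are assumptions here because the displays are not legible in this copy. Expanding the product with $x=ce^{-c}+c^{2}e^{-2c}+\cdots$ gives exactly these two terms, and the code checks that the remainder is of order $c^{3}e^{-3c}$.

```python
import math


def conjugate_root(c, iterations=200):
    """Fixed-point iteration x <- z*exp(x) with z = c*exp(-c).

    Starting from 0 the iterates increase to the root of x*exp(-x) = z in (0, 1);
    convergence is fast when c is large.
    """
    z = c * math.exp(-c)
    x = 0.0
    for _ in range(iterations):
        x = z * math.exp(x)
    return x


def two_core_fraction(c):
    """Assumed limiting fraction of vertices in the 2-core."""
    x = conjugate_root(c)
    return (1 - x) * (1 - x / c)


def two_term_approximation(c):
    """Assumed leading terms: 1 - (c+1)e^{-c} - c^2 e^{-2c}."""
    return 1 - (c + 1) * math.exp(-c) - c ** 2 * math.exp(-2 * c)


for c in (6.0, 8.0, 10.0):
    gap = abs(two_core_fraction(c) - two_term_approximation(c))
    print(c, gap, c ** 3 * math.exp(-3 * c))
```

The last two printed columns should be of the same order of magnitude for each value of c.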
Note the term which accounts for vertices of degree 0 or 1. In principle we can compute more terms than what is given in (6). We claim next that there exists some function such that the sum in (1) is concentrated around . In other words, the sum in (1) has the form w.h.p.
Theorem 1.3.
- (a) There exists a function such that for any , there exists such that for ,
[TABLE]
- (b)
[TABLE]
We will prove Theorem 1.3 in Section 3.
1.1 Structure of :
We first bound the size of . We need the following lemma on the density of small sets.
Lemma 1.4.
W.h.p., every set of size at most contains less than edges in .
Proof.
The expected number of sets invalidating the claim can be bounded by
[TABLE]
∎
Now consider the construction of . Let be the set of the vertices with degree less than and let . If we start with and run the process for constructing then we will produce the same as if we had started with . This is because, as we have shown, the order of adding vertices does not matter. Now w.h.p. there are at most vertices of degree at most in (see for example Theorem 3.3 of [19]) and so .
Now suppose that the process runs for another rounds. Then has at least edges and at most vertices. This is because round adds at most three new vertices to and the vertices that take the role of have degree at least and all of their neighbors will be in . If reaches then
[TABLE]
So, by Lemma 1.4, we can assert that w.h.p. the process runs for less than rounds and,
[TABLE]
We note the following properties of . Let
[TABLE]
Then,
- G1: Each vertex has no neighbors in .
- G2: Each has at least neighbors in .
Given the definition of , for we can express as
[TABLE]
We will now show that w.h.p. each component of satisfies
[TABLE]
We will prove that for and each component spanned by ,
[TABLE]
Here is taken to be the number of vertices in with no neighbors in . Taking in (10) yields (9). We proceed by induction on .
and so for , (10) is satisfied by every component spanned by . Suppose that at step , (10) is satisfied by every component spanned by .
At step , assume that invokes either Case a or Case b. In both cases $S_{\ell+1}=S_{\ell}\cup\big(\{v\}\cup N(v)\big)$. The addition of the new vertices into could merge components into one component while adding at most vertices. Hence . In addition every vertex that contributed to , now contributes towards . Also has neighbors outside but no neighbors outside . The inductive hypothesis implies that for . Thus,
[TABLE]
and so (10) continues to hold for all the components spanned by .
We next show that w.h.p., only a small component can satisfy (9). We consider in the context of in which case will have at least vertices with no neighbors outside . So, the expected number of components of size that satisfy this condition is at most
[TABLE]
if is large and .
So, we can assume that all components are of size at most . Then the expected number of vertices on components that are not trees is bounded by
[TABLE]
Markov’s inequality implies that w.h.p. such components span at most vertices.
Notation 2: For , let be the matching on obtained by replacing each path of of length at least 1 by an edge and let . Let denote the internal vertices of the paths and and . We let be the subgraph of induced by . We also let be the bipartite graph with vertex partition and all edges . Finally let and .
2 Proof of Theorem 1.1
The RHS of (1), modulo the number of vertices that are spanned by non-tree components in , is clearly an upper bound on the largest cycle in . Any cycle must omit at least vertices from each . On the other hand, as we show, w.h.p. there is a cycle that spans (see Notation 1). The length of is equal to the RHS of (1). Equivalently, we show that
[TABLE]
2.1 Proof of (5)
We are not able at this time to give a simple estimate of as a function of . We will have to make do with (5). On the other hand, can be approximated to within arbitrary accuracy, using the argument in Section 3.
We work in . Observe that must have a vertex of degree three in order that . The smallest such tree has seven vertices and consists of three paths of length two with a common endpoint. (If is a star of degree 3 for example, it can be covered by a path of length 2 that covers the central vertex and a path of length 0. Here we are using that every vertex in must have degree at least 2, hence every vertex of of degree 1 belongs to and has neighbors in .) Therefore, in ,
[TABLE]
At the first line we used that every tree that contributes to must satisfy . In addition (9) states that . We obtain (5) from (13).
2.2 Structure of
Suppose now that and that contains edges. The construction of does not involve the edges inside , but we do know that has minimum degree at least . The distribution of will be that of subject to this degree condition, viz. the random graph which is sampled uniformly from the set , the set of graphs with vertex set , edges and minimum degree at least . This is because we can replace by any graph in without changing . By the same token, we also know that each has at least random neighbors in . We have that
[TABLE]
where . The bound on follows from (2) and (8) and the bound on follows from the fact that in ,
[TABLE]
2.3 Partitioning/Coloring
We will use the edge coloring argument of Fenner and Frieze [14] to verify (12). In this section we describe how to color edges.
We color most of the edges of light blue, dark blue or green. We denote the resultant blue and green subgraphs by respectively (an edge is blue if it is either dark or light blue). We later show that the blue graph has expansion properties while the green graph has suitable randomness.
Every vertex independently chooses neighbors in and we color the chosen edges light blue. Then we color every edge in light blue. Thereafter we independently color (re-color) every edge of dark blue with probability . Finally we color green all the uncolored edges that are contained in . (Some of the edges of will remain uncolored and play no significant role in the proof.)
The above coloring satisfies the following properties:
- (C1) Every vertex in is joined to at least vertices in by a blue edge.
- (C2) Every dark blue edge appears independently with probability .
- (C3) Given the degree sequence of , every graph with vertex set and degree sequence is equally likely to be .
We can justify C3 as follows: Amending by replacing by any other graph with vertex set and the same degree sequence and executing our construction of will result in the same set and sets . So, each possible has the same set of extensions to and as such is equally likely.
Now given we color the edges in as follows. Every edge in that exists in inherits its color from the coloring in . Every edge in is colored blue. We let be the blue and the green subgraphs of . Observe that , hence satisfies property as well.
2.4 Expansion of
We wish to estimate the probability that small sets have relatively few neighbors in the graph . For we let
[TABLE]
We have slightly abused notation here since is implicitly defined in both and .
It is shown in [6] and also in [7] that if is the set of endpoints created by Pósa rotations (see Section 2.6) then is connected and contains at least two distinct cycles, hence at least edges. Hence the condition (iii) in the following lemma.
Lemma 2.1.
W.h.p. there does not exist of size such that (i) , (ii) is connected in and (iii) spans at least edges in .
Proof.
Assume that the above fails for some set .
Case 1: .
Let . We will suppose first that contains at least vertices of degree at least 100. In this case has cardinality at most and contains at least edges, contradicting Lemma 1.4.
On the other hand, if there are at least vertices in of degree at most 99 then there are at least vertices of degree at most 99 in a connected subgraph of size . In addition that subgraph spans at least . But the probability of this occurring in is at most
[TABLE]
This completes the proof for Case 1.
Case 2: .
The particular values for the sets condition . To get round this, we describe a larger event in that (a) occurs as a consequence of there being a set with small expansion and (b) only occurs with probability . This event involves an arbitrary choice for etc.
Let and , that is and are the neighborhood of inside and outside of respectively. Then the following event must hold. There exist such that, where and ,
- (i) .
- (ii) , where is from (8).
- (iii) No vertex in is connected to a vertex in by a dark blue edge.
- (iv) spans at least edges (at least $s+t+1$ in fact).
Thus,
[TABLE]
At the 5th line we used and . Hence
[TABLE]
∎
2.5 The Degrees of the Green Subgraph
Lemma 2.2.
W.h.p. at least vertices in have green degree at least . In addition every set of size at least has total green degree at least .
Proof.
At most edges are colored light blue and thereafter the Chernoff bounds imply that w.h.p. at most edges are colored dark blue, for some arbitrarily small positive . The probability that a vertex has degree less than in is bounded by . Azuma’s inequality or the Chebyshev inequality can be employed to show that w.h.p. there are at most vertices of degree less than . Therefore every set of vertices is incident with at least edges, and hence with at least green edges. Thus in every set of vertices of size at least there exists a vertex that is incident to green edges, proving the first part of the lemma.
It follows that w.h.p. every set of size has total green degree at least
[TABLE]
∎
2.6 Pósa Rotations
We say that a path/cycle in is compatible if for every either contains the edge or . Our aim therefore is to show that w.h.p. contains a compatible Hamilton cycle. Suppose that and hence is not Hamiltonian and that is a longest compatible path in both and . If and then the path is said to be obtained from by an acceptable rotation with as the fixed endpoint. We also call the pivot vertex and the edges the pivot edges. Observe that since is compatible and (since ) then is also compatible. Let be the set of vertices that are endpoints of paths that are obtainable from by a sequence of acceptable rotations with as the fixed endpoint. Then, for we let be defined similarly. Here is a path with endpoints obtainable from by acceptable rotations.
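The rotation operation itself is easy to state in code. The sketch below shows a plain Pósa rotation on a path $(x_0,\ldots,x_k)$ using an edge $\{x_k,x_i\}$, and collects the endpoints reachable with $x_0$ fixed. It deliberately ignores the extra "acceptable" and "compatible" restrictions of this section, whose exact conditions are not legible in this copy; the function names and the toy graph are illustrative.

```python
def rotate(path, i):
    """Rotate (x_0, ..., x_k) at pivot x_i using the edge {x_k, x_i}.

    The result is x_0, ..., x_i, x_k, x_{k-1}, ..., x_{i+1}, with new endpoint x_{i+1}.
    """
    return path[: i + 1] + path[:i:-1]


def endpoint_set(adj, path):
    """All endpoints reachable by repeated rotations with path[0] held fixed."""
    seen_paths = {tuple(path)}
    stack = [list(path)]
    ends = {path[-1]}
    while stack:
        p = stack.pop()
        last = p[-1]
        for i in range(len(p) - 2):           # pivots x_0, ..., x_{k-2}
            if p[i] in adj[last]:
                q = rotate(p, i)
                if tuple(q) not in seen_paths:
                    seen_paths.add(tuple(q))
                    stack.append(q)
                    ends.add(q[-1])
    return ends


# Path 0-1-2-3-4 together with the extra edge {1, 4}.
adj = {0: {1}, 1: {0, 2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
print(rotate([0, 1, 2, 3, 4], 1))
print(endpoint_set(adj, [0, 1, 2, 3, 4]))
```

The search stores every path it visits, so it is intended only for small examples.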
Arguing as in the proof of Pósa’s lemma we see that . Indeed, assume otherwise. Then there exist vertices such that , , and the edge can be used by an acceptable rotation with as the fixed endpoint that “rotates out” . Any such rotation will create a path with either or as a new endpoint, say . Now and so the rotation will be acceptable and hence resulting in a contradiction.
Lemma 2.3.
W.h.p. for every path of maximal length in and an endpoint of we have that .
Observe that the underlying graph in Lemma 2.1 is and so we cannot apply it directly to obtain Lemma 2.3. In addition is not a subgraph of , since the edges in that are added correspond to paths in .
Proof.
We will show that satisfies (i), (ii) and (iii) of Lemma 2.1. For this let be the set of pivot points and be the set of pivot edges. It is shown in [6] and also in [7] that if is the set of endpoints created by Pósa rotations (see Section 2.6) then spans a connected subgraph on that consists of at least edges.
The key observation is that if is the pivot vertex of an acceptable rotation then, by definition, we have that . Consequently (i.e. ) and every edge in belongs to . This would not have necessarily been true if . Finally, spans at least edges in . Hence is connected in and spans at least edges. This verifies conditions (ii) and (iii) of Lemma 2.1. Condition (i) is satisfied by the discussion preceding Lemma 2.3. ∎
From Lemma 2.3 we see that w.h.p. for all . We let
[TABLE]
2.7 Coloring argument
We use a modification of a double counting argument that was first used in [14]. The specific version is from [15]. Given a two edge-colored , we choose for each , an incident edge where . We re-color blue if it is not already colored blue. There are at most choices for .
For a graph , or , we let denote the length of the longest compatible path in . We indicate that has a compatible Hamilton cycle by .
We now let if the following hold:
- H1: is not Hamiltonian.
- H2: .
- H3: for all .
We observe first that if is not Hamiltonian and H2 holds then there exists such that . Indeed, let be a longest path in . Then we simply let be the edge for . It follows that if denotes the number of choices for and is the probability that is not Hamiltonian, then
[TABLE]
where the term accounts for failure of the high probability events that we have identified so far.
On the other hand, we have as stated in (C3) above, that is distributed as a random graph chosen uniformly from graphs with degree sequence . Hence
[TABLE]
where is defined as follows: let be some longest path in . Then is the probability that a random realization of does not include a pair where . We will argue below that
[TABLE]
Lemma 2.2 implies that at least out of the at least vertices in have . Also, for such the set is of size at least and so has total degree at least . Thus from (18), it follows that
[TABLE]
The Arithmetic-Geometric-mean inequality implies that
[TABLE]
It then follows that for sufficiently large
[TABLE]
and this completes the proof of (12).
Proof of (17): This is an exercise in the use of the configuration model of Bollobás [5]. Let where is the number of green edges and let be a partition of where . The elements of will be referred to as configuration points or just as points. A configuration is a partition of into pairs. Next define by . Given , we let denote the (multi)graph with vertex set and an edge for all . We say that is simple if it has no loops or multiple edges. Suppose that we choose at random. The properties of that we need are
- P1: If then .
- P2: .
These are well established properties of the configuration model, see for example Chapter 11 of [19]. Note that P2 uses the fact that w.h.p. (and hence ) has an exponential tail, as shown for example in [17]. Given all this, in the context of the configuration model, (17) is a simple consequence of a random pairing of . The factor is and bounds the effect of the conditioning. We take the square root to account for the possibility that and .
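For readers unfamiliar with the configuration model, the following sketch generates a random configuration for a given degree sequence and tests whether the resulting multigraph is simple. It is a generic illustration of the model, not the specific conditioning used in the proof of (17); the function names and the sample degree sequence are illustrative.

```python
import random
from collections import Counter


def random_configuration(degrees, rng):
    """Pair up the configuration points uniformly at random.

    Vertex v owns degrees[v] points. The returned list holds one (u, w) pair of
    vertices per matched pair of points, so loops and repeated edges can occur.
    """
    points = [v for v, d in enumerate(degrees) for _ in range(d)]
    if len(points) % 2:
        raise ValueError("degree sum must be even")
    rng.shuffle(points)                       # a uniform shuffle gives a uniform pairing
    return [(points[i], points[i + 1]) for i in range(0, len(points), 2)]


def is_simple(edges):
    """True if the multigraph has no loops and no multiple edges."""
    keys = [tuple(sorted(e)) for e in edges]
    no_loops = all(u != w for u, w in keys)
    return no_loops and len(set(keys)) == len(keys)


degrees = [3, 3, 2, 2, 2]
edges = random_configuration(degrees, random.Random(0))
realised = Counter(v for e in edges for v in e)   # a loop contributes 2 to its vertex
print(edges, is_simple(edges))
```

Whatever the outcome of the pairing, each vertex ends up with exactly its prescribed degree, which is the feature the argument relies on.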
3 Proof of Theorem 1.3
For we let if for some and otherwise. (Recall that .) Thus
[TABLE]
Hence (1) can be rewritten as,
[TABLE]
Let be the smallest positive integer such that
[TABLE]
Note that for large , we have
[TABLE]
For let be the graph consisting of (i) the vertices of that are within distance from and (ii) a copy of where every vertex in the neighborhood of is adjacent to each vertex of the same one part of the bipartition. We consider the algorithm for the construction of on and let be the corresponding sets/quantities.
For a tree let be equal to minus the maximum number of vertices that can be covered by a set of vertex disjoint paths with endpoints in (we allow paths of length 0). For , if belongs to some tree set , otherwise set .
For let if or if and in , lies in a component with at most vertices that are not connected to in . Set otherwise. Observe that if then . Otherwise .
By repeating the arguments used to prove (1.1) and (9) it follows that if then lies on a component of size at most . In addition at least vertices in are not adjacent to any vertex outside . Thus the expected number of vertices satisfying is bounded by
[TABLE]
A vertex is good if the th level of its BFS neighborhood has size at most for every and it is bad otherwise. Because the expected size of the neighborhood is we have by the Markov inequality that is bad with probability at most and so the expected number of bad vertices is bounded by . Thus
[TABLE]
Let be the set of BFS neighborhoods that are good i.e. whose th levels are of size at most for every . Every element of corresponds to a pair where is a graph and is a distinguished vertex of , that is considered to be the root. Also for let be the subgraph induced by the neighborhood of . For let be the set of vertices incident to the first neighborhoods of and let be the number of automorphisms of that fix . Note that each good vertex is associated with a pair from which we can compute , since . Thus, if now ,
[TABLE]
where is the probability in . We show in Section 3.1 that
[TABLE]
where is defined in (25) below and satisfies (26) below.
Finally observe that with the exception of the term, all the terms in (21) are independent of . We let
[TABLE]
Then for a fixed , we see that is monotone increasing as . This is simply because grows. Furthermore, and so the limit exists. This verifies part (a) of Theorem 1.3. For part (b), we prove, (see (36)),
Lemma 3.1.
[TABLE]
Proof.
To prove this we show that if is the number of copies of in then implies that
[TABLE]
The inequality follows from a version of Azuma’s inequality (see (36)), and the lemma follows from taking a union bound over
[TABLE]
graphs . Note also that the term in (21) is bounded by the same term times the number of cycles of length at most in . The probability that this exceeds is certainly at most the RHS of (24). We will give details of our use of the Azuma inequality in Section 3.1. ∎
Part (b) of Theorem 1.3 follows by letting and from the Borel-Cantelli lemma.
3.1 A Model of
It is known that given that, up to relabeling vertices, is distributed as . The random graph is chosen uniformly from which is the set of graphs with vertex set , edges and minimum degree at least two.
3.1.1 Random Sequence Model
We must now take some time to explain the model we use for . We use a variation on the pseudo-graph model of Bollobás and Frieze [9] and Chvátal [10]. Given a sequence of integers between 1 and we can define a (multi)-graph with vertex set and edge set . The degree of is given by
[TABLE]
If is chosen randomly from then is close in distribution to . Indeed, conditional on being simple, is distributed as . To see this, note that if is simple then it has vertex set and edges. Also, there are distinct equally likely values of which yield the same graph.
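The sequence model can be illustrated on a tiny case. The sketch below assumes the usual reading of the model: a sequence $x=(x_1,\ldots,x_{2m})$ over $\{1,\ldots,n\}$ gives edges $\{x_{2i-1},x_{2i}\}$, and the degree of a vertex is its number of occurrences in $x$. It then counts by brute force how many sequences produce one fixed simple graph, which under this reading should equal $m!\,2^m$. The function names are illustrative.

```python
from collections import Counter
from itertools import product
from math import factorial


def sequence_graph(x):
    """Edges (x_1, x_2), (x_3, x_4), ... and the degree (occurrence count) of each vertex."""
    edges = [(x[i], x[i + 1]) for i in range(0, len(x), 2)]
    return edges, Counter(x)


def edge_set(x):
    """Sorted list of unordered edges, used to compare the graphs of two sequences."""
    edges, _ = sequence_graph(x)
    return sorted(tuple(sorted(e)) for e in edges)


# Count the sequences in {1,2,3}^4 that give the simple graph with edges 12 and 23.
target = [(1, 2), (2, 3)]
count = sum(1 for x in product((1, 2, 3), repeat=4) if edge_set(x) == target)
print(count, factorial(2) * 2 ** 2)
```

Each simple graph with $m$ edges arises from the same number of sequences (orderings of the edges times orientations of each edge), which is why conditioning on simplicity gives the uniform distribution.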
Our situation is complicated by there being a lower bound of 2 on the minimum degree. So we let
[TABLE]
Let be the multi-graph for chosen uniformly from . It is clear then that conditional on being simple, has the same distribution as . It is important therefore to estimate the probability that this graph is simple. For this and other reasons, we need to have an understanding of the degree sequence when is drawn uniformly from . Let
[TABLE]
for .
Lemma 3.2.
Let be chosen randomly from . Let be independent copies of a truncated Poisson random variable , where
[TABLE]
Here satisfies
[TABLE]
Then is distributed as conditional on .
Proof.
This can be derived as in Lemma 4 of [2]. ∎
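A small numerical sketch of the truncated Poisson variable may help. It assumes the form used in [2] for minimum degree two, namely $\Pr(Z=k)=\lambda^k/(k!\,(e^{\lambda}-1-\lambda))$ for $k\ge 2$, with $\lambda$ chosen so that the mean $\lambda(e^{\lambda}-1)/(e^{\lambda}-1-\lambda)$ equals the average degree. The displays are not legible in this copy, so this form is an assumption, and the function names are illustrative.

```python
import math


def mean_truncated_poisson(lam):
    """Mean of a Poisson(lam) variable conditioned on being at least 2."""
    return lam * math.expm1(lam) / (math.expm1(lam) - lam)


def solve_lambda(avg_degree, tol=1e-12):
    """Bisection for lam > 0 with mean_truncated_poisson(lam) = avg_degree > 2."""
    lo, hi = 1e-9, 1.0
    while mean_truncated_poisson(hi) < avg_degree:
        hi *= 2                               # the mean is increasing in lam
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mean_truncated_poisson(mid) < avg_degree:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def pmf(k, lam):
    """P(Z = k) = lam^k / (k! * (e^lam - 1 - lam)) for k >= 2, and 0 otherwise."""
    if k < 2:
        return 0.0
    return math.exp(k * math.log(lam) - math.lgamma(k + 1)) / (math.expm1(lam) - lam)


lam = solve_lambda(4.0)
total = sum(pmf(k, lam) for k in range(2, 200))
mean = sum(k * pmf(k, lam) for k in range(2, 200))
print(lam, total, mean)
```

The printed total and mean should be close to 1 and to the requested average degree respectively.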
It follows from (14) and (26) and the fact that as that for large ,
[TABLE]
We note that the variance of is given by
[TABLE]
Furthermore,
[TABLE]
This is an example of a local central limit theorem. See for example, (5) of [2] or (3) of [17]. It follows by repeated application of (28) and (29) that if and then
[TABLE]
Let denote the number of vertices of degree in .
Lemma 3.3.
Suppose that . Let be chosen randomly from . Then as in equation (7) of [2], we have that with probability ,
[TABLE]
We can now show that is a good model for . For this we only need to show that
[TABLE]
Again, this follows as in [2].
Given a tree with vertices of degrees and a fixed vertex we see that if is the probability that in then we have
[TABLE]
Explanation for (34): We use (30) to obtain the probability that the degrees of are . This explains the product . Implicit here is that , from (32). The contribution to the degree sum for can therefore be shown to be negligible. We use the fact that is small to argue that w.h.p. is induced. We choose the vertices, other than in ways and then counts the number of copies of in . We then choose the place in the sequence to put these edges in ways. Finally note that the probability the occurrences of the th vertex are as claimed is asymptotically equal to and this explains the factor .
Explanation for (35): We use the identity
[TABLE]
It only remains to verify (24). It follows from the above that . We first condition on a degree sequence x satisfying (31). We then work in the associated configuration model. We can generate a configuration as a permutation of the multi-set . Interchanging two elements in a permutation can only change by . We can therefore apply Azuma’s inequality to show that
[TABLE]
(Specifically we can use Lemma 11 of Frieze and Pittel [21] or Section 3.2 of McDiarmid [23].) This verifies (24).
4 Summary and open problems
We have derived an expression for the length of the longest path in that holds for large w.h.p. It would be interesting to have a more algebraic expression. Also, we could no doubt make this proof algorithmic, by using the arguments of Frieze and Haber [18]. It would be more interesting to do the analysis for small . Applying the coupling of McDiarmid [22] we see that the random digraph contains a path at least as long as that given by the R.H.S. of (6). It should be possible to improve this, just as Krivelevich, Lubetzky and Sudakov [20] did for the earlier result of [16].
References
- [1] M. Ajtai, J. Komlós and E. Szemerédi, The longest path in a random graph, Combinatorica 1 (1981) 1-12.
- [2] J. Aronson, A.M. Frieze and B.G. Pittel, Maximum matchings in sparse random graphs: Karp-Sipser re-visited, Random Structures and Algorithms 12 (1998) 111-178.
- [3] M. Bayati, D. Gamarnik and P. Tetali, Combinatorial approach to the interpolation method and scaling limits in sparse random graphs, The Annals of Probability 41 (2013) 4080-4115.
- [4] B. Bollobás, Long paths in sparse random graphs, Combinatorica 2 (1982) 223-228.
- [5] B. Bollobás, A probabilistic proof of an asymptotic formula for the number of labeled regular graphs, European Journal on Combinatorics 1 (1980) 311-316.
- [6] B. Bollobás, C. Cooper, T.I. Fenner and A.M. Frieze, On Hamilton cycles in sparse random graphs with minimum degree at least $k$, Journal of Graph Theory 34 (2000) 42-59.
- [7] A.M. Frieze and B. Pittel, On a sparse random graph with minimum degree three: Likely Posa’s sets are large, Journal of Combinatorics 4 (2013) 123-156.
- [8] B. Bollobás, T.I. Fenner and A.M. Frieze, Long cycles in sparse random graphs, Graph theory and combinatorics, Proceedings of Cambridge Combinatorial Conference in honour of Paul Erdős (1984) 59-64.
