Upper tail bounds for cycles
Abigail Raz

TL;DR
This paper improves bounds on the probability of large deviations for cycle counts in random graphs, closing a gap in the upper tail estimates for cycles in Erdős–Rényi graphs.
Contribution
It refines the upper tail bounds for cycle counts in G_{n,p}, removing the logarithmic gap in the exponent for a broad range of p.
Findings
Established tight upper tail bounds for cycles in G_{n,p}.
Extended the validity of lower bounds to a wider p-range.
Provided precise asymptotics for cycle count deviations.
Abigail Raz, Department of Mathematics, Rutgers University, Piscataway NJ.
Abstract
This paper examines bounds on upper tails for cycle counts in $G_{n,p}$. For a fixed graph $H$ define $\xi_H$ to be the number of copies of $H$ in $G_{n,p}$. It is a much studied and surprisingly difficult problem to understand the upper tail of the distribution of $\xi_H$, for example, to estimate
\begin{equation*} \mathbb{P}(\xi_H > 2 \mathbb{E}\xi_H). \end{equation*}
The best known result for general $H$ and $p$ is due to Janson, Oleszkiewicz, and Ruciński, who, in 2004, proved
\begin{align}\tag{1} \exp[-O_{H, \eta}(M_H(n,p) \ln(1/p))]&<\mathbb{P}(\xi_H > (1+\eta)\mathbb{E} \xi_H)\\&<\exp[-\Omega_{H, \eta}(M_{H}(n,p))].\nonumber \end{align}
Thus they determined the upper tail up to a factor of $\ln(1/p)$ in the exponent. There has since been substantial work to improve these bounds for particular $H$ and $p$. We close the gap for cycles, up to a constant in the exponent. Here the lower bound in (1) is the truth for -cycles when .
1 Introduction
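Let $G = G_{n,p}$ be the usual (Erdős-Rényi) random graph. A copy of $H$ in $G$ is a subgraph of $G$ isomorphic to $H$. It is a much-studied question to estimate, for $\eta > 0$ and $\xi_H$ the number of copies of $H$ in $G$,
\begin{equation}\tag{2} \mathbb{P}(\xi_H > (1+\eta)\mathbb{E}\xi_H). \end{equation}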
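For small parameters the upper tail of a cycle count can be explored empirically. The following is a minimal Python sketch, not part of the paper's argument: it samples $G_{n,p}$, counts copies of a cycle of a given length by brute force, and compares with the expectation $\frac{n(n-1)\cdots(n-l+1)}{2l}p^l$. All function names are illustrative assumptions, and the enumeration is only feasible for very small $n$.

```python
import itertools
import math
import random


def sample_gnp(n, p, rng):
    """Return the edge set of one sample of G(n, p) on vertices 0..n-1."""
    return {
        frozenset(pair)
        for pair in itertools.combinations(range(n), 2)
        if rng.random() < p
    }


def count_cycle_copies(n, edges, length):
    """Count subgraphs isomorphic to the cycle of the given length (>= 3)."""
    count = 0
    for combo in itertools.combinations(range(n), length):
        first, rest = combo[0], combo[1:]
        # Start every cycle at its smallest vertex; each undirected cycle
        # then appears as exactly two orderings, one per direction.
        for perm in itertools.permutations(rest):
            if perm[0] > perm[-1]:
                continue  # skip the reversed traversal
            order = (first,) + perm
            if all(
                frozenset((order[i], order[(i + 1) % length])) in edges
                for i in range(length)
            ):
                count += 1
    return count


def expected_cycle_copies(n, p, length):
    """Expected number of copies: n(n-1)...(n-length+1) / (2*length) * p^length."""
    return math.perm(n, length) / (2 * length) * p ** length


def upper_tail_frequency(n, p, length, eta, trials, seed=0):
    """Empirical frequency of the event {count > (1 + eta) * expectation}."""
    rng = random.Random(seed)
    threshold = (1 + eta) * expected_cycle_copies(n, p, length)
    hits = sum(
        count_cycle_copies(n, sample_gnp(n, p, rng), length) > threshold
        for _ in range(trials)
    )
    return hits / trials
```

Such a simulation only sees deviations of moderate probability; the exponentially small tails studied here are far out of its reach, which is part of why the question requires the analytic tools developed below.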
To avoid irrelevancies we will always assume , where (see [20, pg. 56])
[TABLE]
(So in the case of cycles we assume .) Then is a threshold for (see [20, Theorem 3.4]). For smaller (and bounded ) the quantity in (2) is (see [20, Theorem 3.9] for a start).
Investigation of the distribution of $\xi_H$ began in 1960 with Erdős and Rényi [10]. In the case of triangles it is easy to see that the upper tail is lower bounded by (since this is the probability that $G_{n,p}$ contains a complete graph on, say, vertices). This is, usually, much bigger than the naive guess, , a first indication that the problem is hard. In fact, not much was known about the upper tail until 2000 when Vu proved the first exponential tail bound in [22]. More information on what was known prior to 2002 can be found in [13]. A breakthrough occurred in 2004 when, in [15], Kim and Vu showed, using the “polynomial concentration method” of [14], that when $H$ is a triangle and ,
[TABLE]
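The clique lower bound for triangles mentioned above can be made concrete by a short calculation. The following sketch assumes, purely for illustration, a planted clique on $m = 2np$ vertices (treated as an integer, with $np$ large); the specific choice of $m$ is an assumption and not taken from the text.

```latex
\begin{align*}
\binom{m}{3} &\approx \frac{(2np)^3}{6} = 8\cdot\frac{n^3p^3}{6}
  \approx 8\,\mathbb{E}\xi_{K_3} > 2\,\mathbb{E}\xi_{K_3},\\
\mathbb{P}(\text{a fixed $m$-set spans a clique}) &= p^{\binom{m}{2}}
  \ge p^{m^2/2} = \exp\left[-2n^2p^2\ln(1/p)\right].
\end{align*}
```

So under this assumption a single clique already forces more than twice the expected number of triangles, at a cost whose exponent has order $n^2p^2\ln(1/p)$.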
The Kim-Vu bound for triangles was vastly extended by Janson, Oleszkiewicz, and Ruciński in 2004. To state their result we require the following definition:
[TABLE]
(As usual is fractional independence number (see e.g. [4]) and is maximum degree.)
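Theorem 1.1 ([19, Theorem 1.2]). For any $H$ and $\eta > 0$,
\begin{align}\tag{3} \exp[-O_{H, \eta}(M_H(n,p) \ln(1/p))]&<\mathbb{P}(\xi_H > (1+\eta)\mathbb{E} \xi_H)\\&<\exp[-\Omega_{H, \eta}(M_{H}(n,p))].\nonumber \end{align}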
(Note, is not quite the quantity used in [19], but as shown in their Theorem 1.5, the two quantities are equivalent up to a constant factor; so the difference is irrelevant here.)
Thus they determined the probability in (2) up to a factor of $\ln(1/p)$ in the exponent for constant $\eta$. This remains the best result for general $H$ and $p$. The first progress towards closing the gap was made by Chatterjee in [5] and DeMarco and Kahn in [9] who independently closed it for triangles, showing that, for , the lower bound is the truth (up to the constant in the exponent). DeMarco and Kahn also gave the order of the exponent for smaller $p$ where the lower bound in (3) (namely ) is no longer the answer. Later, in [8], DeMarco and Kahn closed the gap for -cliques, showing that (for , , and )
[TABLE]
When $H$ is a “strictly balanced” graph and $p$ is small (), Warnke, in [23], used a combinatorial sparsification idea based on the BK inequality [3, 18] to close the gap, improving on work in [22, 21]. There was a breakthrough in 2016 when Chatterjee and Dembo introduced a “nonlinear large deviation” framework [6]. This has been used to close the gap for general $H$ and large $p$ (i.e. ) [6, 16]. Recently this technique was used, in [7], by Cook and Dembo to close the gap, including determining the correct constant in the exponent, for cycles when (among other results). Additionally, outside of the large deviation framework, Warnke and Šileikis, in [17], recently determined the correct upper tail bound for stars (including in the case where rather than a constant).
Here we settle the question for cycles (i.e. the order of magnitude of the exponent), where, with the -cycle denoted ,
[TABLE]
Formally, letting be the number of copies of in we prove:
Theorem 1.2.
For any fixed , , and ,
[TABLE]
We are most interested in the range where , so essentially when . As in [9], it is convenient to work with an -partite version of the random graph. Let be the random -partite graph on vertices where the vertex set is the disjoint union of -sets, say , and whenever and for some (all subscripts), these choices made independently. There are no edges between other pairs or within a . We always take to be a vertex of . A copy of in is any subgraph, with vertices isomorphic to . Note these are not all of the subgraphs of isomorphic to since we demand each vertex of the cycle is in a different . We denote the number of copies of in by . A copy of the path (denoted ) is any path isomorphic to (i.e. for ). We use to denote both copies of and copies of , since it will always be clear which interpretation is intended. We show the following bound.
Theorem 1.3.
For any fixed , , and ,
[TABLE]
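The partite model described before Theorem 1.3 is easy to simulate. Below is a minimal Python sketch, under the assumption that the parts are $V_1,\dots,V_l$ of size $n$ each and that edges appear independently with probability $p$ only between consecutive parts (indices mod $l$). It counts the cycles that use exactly one vertex from each part by extending paths part by part. The names `sample_partite` and `count_partite_cycles` are hypothetical.

```python
import random


def sample_partite(n, l, p, rng):
    """Sample edges between consecutive parts V_i and V_{i+1} (indices mod l).

    adj[i][x] is the set of vertices y of V_{i+1} adjacent to x in V_i;
    vertices inside each part are labelled 0..n-1.
    """
    return [
        [{y for y in range(n) if rng.random() < p} for _ in range(n)]
        for _ in range(l)
    ]


def count_partite_cycles(adj, n, l):
    """Count cycles (x_1, ..., x_l) with x_i in V_i, for l >= 3."""
    total = 0
    for start in range(n):
        # walks[v] = number of paths from `start` in V_1 to v in the current part
        walks = {start: 1}
        for i in range(l - 1):
            extended = {}
            for v, weight in walks.items():
                for y in adj[i][v]:
                    extended[y] = extended.get(y, 0) + weight
            walks = extended
        # Close the cycle with an edge from V_l back to `start` in V_1.
        total += sum(w for v, w in walks.items() if start in adj[l - 1][v])
    return total
```

Because the parts are disjoint and $l \geq 3$, every walk counted this way is automatically a cycle, so no extra check for repeated vertices is needed. The expected count in this model is $n^lp^l$.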
That Theorem 1.3 implies Theorem 1.2 is likely well known and an easy generalization from the case which can be found in [9]. However, for completeness we will still give the general argument.
Proposition 1.4.
Theorem 1.3 implies Theorem 1.2.
This is proved in Section 2. The rest of the paper is organized as follows. Section 3 gives notation and states the two main assertions that give Theorem 1.3. These are proved in Sections 5-7, with Section 4 devoted to preliminaries.
2 Reduction
For completeness we give the proof of Proposition 1.4, following [9].
Proof of Proposition 1.4.
We first claim that it is enough to prove Proposition 1.4 for . Assuming we know Proposition 1.4 for we show it still holds when . Given and , we may assume is large (formally ). So, for example,
[TABLE]
Therefore,
[TABLE]
Note the second inequality holds since is a multiple of .
Now to prove Proposition 1.4 when let be as in Theorem 1.2, and set . We can choose by first choosing on and then selecting a uniform equipartition , and setting
[TABLE]
Note that, for any possible value of
[TABLE]
where . On the other hand, letting
[TABLE]
we have
[TABLE]
Combining (5) and (6) gives . We also have, by Theorem 1.3,
[TABLE]
Additionally, we know
[TABLE]
Here the final inequality holds since and, as we showed, is always at most . Since , Theorem 1.2 follows. ∎
3 Main Lemmas
Recall that we always take to be a vertex in ; indices are always written ; and copy of , copy of were defined just before the statement of Theorem 1.3. We use to denote the set of copies of in . Additionally, we abusively use just cycle for “copy of ” and full path for “copy of ”. As usual , , , and is the maximum degree in (we also use and ). Let
[TABLE]
We will abusively refer to as the degree of . For disjoint we use (resp. ) for the set of edges with one end in (resp. one end in each of ).
Much of the set-up that follows is borrowed from or inspired by [9]. Set and (so the exponent in (4) is ). For simplicity set and
[TABLE]
Note that for a fixed and , Theorem 1.2 is covered by Theorem 1.1. For us it is convenient to pick . Of course, the partite version (Theorem 1.3) was not considered in [19], but it is not too hard to get this from Theorem 1.1:
Proposition 3.1.
For Theorem 1.3 follows from Theorem 1.1.
This will be proved at the end of the section.
In view of Proposition 3.1, we may assume for the proof of Theorem 1.3 that
[TABLE]
We may also assume: — so also — is (fixed but) small (since (4) becomes weaker as grows); given and , is large (formally, ); and, say,
[TABLE]
(since for smaller , Theorem 1.3 is trivial for an appropriate ). We say that an event occurs with large probability (w.l.p.) if its probability is at least for some fixed and small enough . We write “” for “w.l.p. ”. Note that, assuming (9), an intersection of events that hold w.l.p. also holds w.l.p.
Let and let be the number of full paths with endpoints and in which each vertex is in the appropriate .
The next two assertions imply Theorem 1.3:
[TABLE]
[TABLE]
We prove (10) in Section 5 and (11) in Section 7. In Section 6 we prove that
[TABLE]
which will be used in the proof of (11).
We now give the proof of Proposition 3.1. To do so we require the following tail bound due to Janson ([12]; see also [20, Theorem 2.14]).
Lemma 3.2.
Let be a set of size and the random subset of in which each element is included with probability (independent of the other choices). Assume is a family of non-empty subsets of , and for each let . Additionally, let . Define
[TABLE]
Then for ,
[TABLE]
Proof of Proposition 3.1.
Let be as in Theorem 1.3 and regard as a subgraph of . Set , , and ; thus is the number of cycles in that are not of the form . Then . We first use Lemma 3.2 to show
[TABLE]
To apply Lemma 3.2 we take to be the set of cycles in not of the form (so each is the edge set of a particular cycle). Note that when we have . Furthermore, the number of pairs of cycles sharing exactly edges is at most (for some constants ). Thus we have
[TABLE]
since . Lemma 3.2, with , gives
[TABLE]
Furthermore, we claim that for any
[TABLE]
provided and are such that . This is because occurrence of the event on the l.h.s. implies occurrence of one of the events on the r.h.s. ; namely, if
[TABLE]
then
[TABLE]
Therefore, for any we can select and such that
[TABLE]
where the second inequality holds by (13) and the third by Theorem 1.1. ∎
4 Preliminaries
To prove (10) and (11) we need the following preliminaries, where is used for a random variable with the binomial distribution . The first two of these are standard large deviation bounds; see e.g. [1, Theorem A.1.12], [20, Theorem 2.1(a)] and [2, Lemma 8.2]. The others are applications of Lemma 4.1 that we will use repeatedly.
Lemma 4.1.
For any , , , and we have,
[TABLE]
When and (which is what we have when our binomial random variable is or ) and we use for the right hand side of (15); that is,
[TABLE]
First note that for any () we have,
[TABLE]
Of course this is unnecessarily weak when is not close to 1 (as was the first bound in (4.1)), but is often enough for our purposes and will be used repeatedly below. It will also be useful to have the following upper bound on when (recall was defined before (7)):
[TABLE]
To show the first inequality holds note that and (see (8)) imply and
[TABLE]
Again implies giving the first inequality in (18):
[TABLE]
The second inequality in (18) follows easily from the combination of and the fact that is not extremely small (see (9)).
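To get a feel for how binomial large deviation bounds of this kind behave, one can compare an exact binomial tail with a textbook Chernoff-type bound. The sketch below uses the standard inequality $\mathbb{P}(B(m,q) \geq Kmq) \leq (e/K)^{Kmq}$ for $K > 1$; this is an assumed stand-in chosen for illustration and is not claimed to be the exact form of Lemma 4.1. The function names are hypothetical.

```python
import math


def binomial_upper_tail(m, q, threshold):
    """Exact value of P(B(m, q) >= threshold)."""
    k0 = max(0, math.ceil(threshold))
    return sum(
        math.comb(m, k) * q ** k * (1 - q) ** (m - k) for k in range(k0, m + 1)
    )


def chernoff_bound(m, q, K):
    """Textbook bound (e/K)^(K*m*q) on P(B(m, q) >= K*m*q), valid for K > 1."""
    mu = m * q
    return math.exp(-K * mu * math.log(K / math.e))


def compare(m, q, ratios):
    """Return (K, exact tail, bound) for each deviation ratio K."""
    return [
        (K, binomial_upper_tail(m, q, K * m * q), chernoff_bound(m, q, K))
        for K in ratios
    ]
```

For $K$ close to 1 the bound exceeds 1 and is vacuous, matching the remark above that such estimates are unnecessarily weak for small deviations; they become useful once $K$ is a large constant.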
Lemma 4.2.
Suppose . Let be independent Bernoullis, , and . Then for any and ,
[TABLE]
The last two lemmas are the basis for much of what follows. Lemma 4.4 in particular may be regarded as the main idea for Sections 5 and 6; it allows us to bound sums of atypically large degrees, which we then use to bound the number of cycles that include vertices of “large” degree (in Section 5) and the number of full paths without vertices of “large” degree (in Section 6).
Lemma 4.3.
For and any ,
[TABLE]
The first, ad hoc value is for use in Section 6 while the second will be used throughout. Convenient bounds for the second expression in (19) are
[TABLE]
Proof of Lemma 4.3.
Let and . We let because later it will be helpful to have . We can enforce this lower bound on because if then
[TABLE]
Without loss of generality, let . We show
[TABLE]
Write for the left hand side of (21). We first assume . Since the ’s () are independent copies of , two applications of Lemma 4.1 give
[TABLE]
The third inequality holds since , so .
Now assume . Recall from (17) that we always have
[TABLE]
So,
[TABLE]
implies
[TABLE]
On the other hand (9) gives
[TABLE]
The last inequality uses the fact that is minimized at and (as we may assume). Hence
[TABLE]
where the second inequality uses (and Lemma 4.1) and the (very crude) third inequality uses which follows from (22) and (9). ∎
Lemma 4.4.
For and any ,
[TABLE]
and
[TABLE]
There is nothing special about here; it is simply a value that will work for our purposes. The reason for the particular — and not very important — lower bound on will appear following (26).
Proof.
First we show (23). To slightly lighten the notation we fix and set
[TABLE]
We partition (where ), with
[TABLE]
It suffices to show
[TABLE]
Lemma 4.1 (using just (17)) gives
[TABLE]
Thus, for any ,
[TABLE]
For (26) we note that , so .
On the other hand, for (25) it is enough to show
[TABLE]
for some constant (not depending on ), where we sum over satisfying
[TABLE]
Here we can just bound the number of terms in (27) by the trivial
[TABLE]
while (in view of (28)) (26) bounds the individual summands in (27) by
[TABLE]
Moreover, the lemma’s lower bound on (or the weaker ) implies . So the left hand side of (27) is at most
[TABLE]
as desired.
To show (24) we now let . As before, we partition (where ) with
[TABLE]
It suffices to show
[TABLE]
[TABLE]
Thus, for any ,
[TABLE]
((30) follows from , in this case a very weak consequence of our assumed lower bound on .)
For (29) it is enough to show
[TABLE]
for some constant (not depending on ) where we sum over satisfying
[TABLE]
Again we can just bound the number of terms in (31) by the trivial
[TABLE]
while (in view of (32)) (30) bounds the individual summands by
[TABLE]
Again since the lemma’s lower bound on (or the weaker ) implies , the left hand side of (31) is at most
[TABLE]
as desired. ∎
We will also make use of the fact that for any , , and ,
[TABLE]
To see this let , and notice that
[TABLE]
Thus is maximized at , where it equals the r.h.s. of (33).
5 Proof of (10)
We first rule out very small , showing that when
[TABLE]
[TABLE]
so that (10) is vacuously true. For (34), with (and any vertex), Lemma 4.1 (and the union bound) give
[TABLE]
But for (which is the same as ), the r.h.s. of (35) is (note that (8) implies and the initial disappears because (9) makes a large multiple of ). Therefore for the remainder of the proof of (10) we may assume that
[TABLE]
We say has large degree if and intermediate degree if . We classify the cycles appearing in (10) according to the positions of their large and intermediate vertices. For disjoint , say is of type if
[TABLE]
and say a set of vertices is of type if each of its members is. We consider various possibilities for , always requiring that all vertices under discussion are of the given type. To begin note that since we are in (10) we have .
A little preview may be helpful. In each case we are trying to show that the size of the set of cycles in question is small relative to , so would like the number of possibilities for to be, in geometric average, somewhat less than . For example, for we do much better than this using Lemma 4.3, which, recall, bounds the number of ’s of such large degree by (or but here the is minor). On the other hand, for we have only the naive bound , which is clearly unaffordable. To control the number of such we rely on first selecting some (or ) and then bounding the number of choices for by (or ). If then given we simply use as a bound on the number of choices for . However if, for example, and we require Lemma 4.4 to bound the choices for (with ).
We now consider cycles of type . Here the absence of intermediate vertices will allow us to relax our assumption that there is at least one vertex of degree at least ; we will only need to assume that there is at least one vertex of degree at least . Let
[TABLE]
with subscripts interpreted. Note that implies only when . Here and in the future we will tend to somewhat abusively omit “w.l.p.” in situations where this is clearly what is meant. We will bound:
(i) for , the number of possibilities for ;
(ii) for , the number of possibilities for ;
(iii) given the choices in (ii), the number of possibilities for vertices of the cycle not chosen in (i) and (ii).
Note that the number of vertices chosen in (iii) is . The reason for treating in (ii) rather than (i) is (roughly) that it is through these vertices that we control the number of choices for the vertices that follow them (the ’s of (ii)). For (i) we just recall that Lemma 4.3 bounds the number of choices for (of large degree) by ; so the total number of possibilities in (i) is at most
[TABLE]
For as in (ii), the number of possibilities for is at most
[TABLE]
with the inequality given by Lemma 4.4. Thus the total number of possibilities in (ii) is at most
[TABLE]
Finally, we may choose the ’s in (iii) in an order for which each is chosen before (either because is chosen in (ii), or because precedes in our order; e.g. we can use any cyclic order that begins with an for which — if then , so all vertices were chosen in (i)). But since , the number of choices for given is at most .
Combining the above bounds we find that, for a given , the number of cycles of type is at most
[TABLE]
(using (7) for the last inequality). So, since there are fewer than possibilities for ,
[TABLE]
Next we consider cycles of type with . We may assume (at the cost of a negligible factor of in our eventual bound) that , and that is an index for which (which exists since we are in (10); again, we will pay a factor of for the choice of .) We further define
[TABLE]
We split into cases based on whether and/or . First assume and . We will bound:
(i) the number of possibilities for ;
(ii) the number of possibilities for ;
(iii) for the number of possibilities for ;
(iv) for , the number of possibilities for ;
(v) for , the number of possibilities for ;
(vi) given the choices in (ii), (iv), and (v), the number of possibilities for vertices of the cycle not chosen in (i)-(v).
For (i) we just recall that Lemma 4.3 bounds the number of choices for by
[TABLE]
For (ii) the number of possibilities for is bounded by
[TABLE]
where the second inequality is given by Lemma 4.4.
For (iii), Lemma 4.3 bounds the number of choices for (of intermediate or large degree) by ; so the number of possibilities in (iii) is at most
[TABLE]
For as in (iv), the number of possibilities for is at most
[TABLE]
with the inequality given by Lemma 4.4. Thus the number of possibilities in (iv) is at most
[TABLE]
Similarly, the total number of possibilities in (v) is at most
[TABLE]
Finally, for (vi) we choose the remaining ’s with in increasing order (of their indices) and those with in decreasing order. In the first case, when we come to the number of possibilities is at most (since ), and similarly in the second case this number is at most since . Thus, the number of possibilities in (vi) is at most
[TABLE]
Combining the above bounds we find that, for a given and , the number of cycles of type is at most
[TABLE]
where the second inequality uses (33).
Now we assume , but . In this case (i), (iii), (iv), and (v) and their respective bounds all remain the same. However, now we replace (ii) with
(ii′) the number of possibilities for .
This is because will be selected in either (i), (iii), or (iv). Our new (ii*′*) is bounded by
[TABLE]
where the inequality comes from Lemma 4.4. Additionally, in (vi) there are now vertices left to choose. Thus our bound for (vi) becomes
[TABLE]
Combining these bounds with our previous bounds for (i) and (iii)-(v) we find that, for a given and , the number of cycles of type is at most
[TABLE]
where the second bound is again given by (33).
The argument for , is essentially identical to the preceding one, so we will not discuss it further.
It remains to consider the case when we have both and . Again, there is no change in (i) and (iii)-(v) and we replace (ii), in this case, by
(ii′′) the number of possibilities for
(since and will be among the vertices chosen in (i) and (iii)-(v)). By Lemma 4.3 the number of possibilities here (i.e. for ) is at most
[TABLE]
Additionally, in (vi) we are now selecting vertices; so, our bound becomes
[TABLE]
Again, combining bounds, we find that the number of cycles of type is at most
[TABLE]
So to recap, we have shown that, for any given , (where we assume and ) there are at most
[TABLE]
cycles of type .
Since there are fewer than choices for and the assumptions on and only cost a factor of , there are at most
[TABLE]
cycles of all types with ; recalling (see (37)) that we showed the same bound for the number of cycles of types (with ), we have the desired bound, , on the l.h.s. of (10).
∎
6 Proof of (12)
For the rest of our discussion we may ignore bad vertices, meaning those of degree at least , since cycles involving such vertices are excluded from (12). (Recall we are calling the degree of .)
What’s really going on here is as follows. We think of choosing after all other edges have been specified. The number of cycles (again, avoiding bad vertices) is then
[TABLE]
(recall is the number of full paths with endpoints and in which there are no bad vertices). Given , this is a weighted sum of independent binomials with expectation
[TABLE]
to which we may hope to apply the large deviation bound in Lemma 4.2. In this section we give a good (w.l.p.) bound on the sum in (39) (namely (12)). Once we have this, the only difficulty is that some of the “weights” may be too large to support finishing via the lemma. We will handle this difficulty in Section 7.
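The two-stage exposure described above can be sketched in code. The following is a minimal Python illustration under simplifying assumptions: vertices in each part are labelled $0,\dots,n-1$, all edges except those between $V_l$ and $V_1$ are given, and the exclusion of bad vertices is ignored. The names `path_counts`, `cycles_after_exposing_closing_edges` and `conditional_expectation` are hypothetical.

```python
import random


def path_counts(adj, n, l):
    """w[x][y] = number of full paths from x in V_1 to y in V_l.

    adj[i][u] is the set of neighbours in V_{i+1} of u in V_i, for i = 0..l-2;
    the closing edges between V_l and V_1 are not part of adj.
    """
    w = [[0] * n for _ in range(n)]
    for x in range(n):
        walks = {x: 1}
        for i in range(l - 1):
            extended = {}
            for v, count in walks.items():
                for y in adj[i][v]:
                    extended[y] = extended.get(y, 0) + count
            walks = extended
        for y, count in walks.items():
            w[x][y] = count
    return w


def cycles_after_exposing_closing_edges(w, n, p, rng):
    """Expose each closing edge with probability p; it contributes w[x][y] cycles."""
    return sum(w[x][y] for x in range(n) for y in range(n) if rng.random() < p)


def conditional_expectation(w, p):
    """Expected number of cycles given the path counts: p times the total weight."""
    return p * sum(map(sum, w))
```

Once the weights `w[x][y]` are fixed, the cycle count is a weighted sum of independent Bernoulli variables, which is the structure a bound such as Lemma 4.2 requires. The difficulty flagged in the text corresponds to a few entries of `w` being much larger than the rest.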
To prove (12) we first consider full paths in which each of has degree at most . There are at most
[TABLE]
such paths.
Now all the paths left to consider must have some (where ) such that . To count the number of such paths we split the argument based on . First assume
[TABLE]
(This is not a tight bound for either argument, but it is a convenient cut-off.) Given (41) we know
[TABLE]
for all (see (16) for the definition of ), so in applications of Lemma 4.3 we are always using the second value of (namely, ). Additionally since Lemma 4.4 applies. As in Section 5 we classify paths according to the positions of vertices with . For , say is of type if
[TABLE]
and say a set of vertices is of type if each of its members is either of type or in . Note we have already shown that there are at most
[TABLE]
full paths of type , so we now assume . Let be the smallest element of and let
[TABLE]
We will bound:
(i) for , the number of possibilities for ;
(ii) for , the number of possibilities for ;
(iii) given the choices in (ii), the number of possibilities for vertices of the path not chosen in (i) and (ii).
For i as in (i) we recall that by Lemma 4.3 the number of ’s of degree at least is at most . So, the total number of possibilities in (i) is at most
[TABLE]
For as in (ii), the number of possibilities for is at most
[TABLE]
with the inequality given by Lemma 4.4. Thus the total number of possibilities in (ii) is at most
[TABLE]
Finally for (iii) we choose the remaining ’s with in increasing order (of the indices). When we come to we know , so given there are at most choices for . If then we have selected all the vertices in the path. If not, then we next select . Since we are ignoring vertices of degree at least we know that given there are at most ways to select . If then we are done, and if not then we select the ’s with in decreasing order (of the indices). Since , given there are at most choices for . Thus, the number of possibilities in (iii) is at most
[TABLE]
Combining (42), (43), and the appropriate bound from (44) we find that, for a given , there are at most
[TABLE]
full paths of type (where the first inequality uses (33)). Since there are fewer than possibilities for there are at most
[TABLE]
full paths of type other than . Together with our earlier bound on the number of full paths of type this bounds the total number of full paths (without vertices of degree at least ) by
[TABLE]
as desired.
When
[TABLE]
we first note that we have a better bound on (the maximum degree) than . For (45) Lemma 4.1 with (and any vertex) gives
[TABLE]
using and absorbing the initial into the exponent (since (9) gives ). Thus, .
Given , let be minimal with . We first bound the number of cycles containing at least one with . Lemma 4.3 says there are at most such vertices (in all of ). Once such a vertex has been specified there are at most
[TABLE]
ways to select the remaining vertices in a full path containing . So, w.l.p. we have at most
[TABLE]
full paths containing at least one as above. (The quite weak follows from the lower and upper bounds on in (9) and (45), respectively.)
Now we count paths in which every vertex has degree at most and at least one vertex has degree at least (recalling that we have already treated those violating either condition). Say is of type if
[TABLE]
and let . We say the type of a path is the largest for which contains a vertex of type . Lemma 4.3 gives
[TABLE]
Note we have already bounded the number of full paths of type where . For smaller we think of specifying a path of type by choosing
(i) some of type , and then
(ii) the remaining vertices of the path.
Here the bounds are easy: the number of possibilities in (i) is at most
[TABLE]
and the number of possibilities in (ii) is at most
[TABLE]
since, given the choice in (i), we may order the remaining choices so that each new vertex is drawn from the at most neighbors of some vertex chosen earlier. Thus the number of full paths of type is bounded by
[TABLE]
Summing over we find that w.l.p. there are at most
[TABLE]
full paths of all types up to (where the inequality follows easily from our choice of — see (7)). Adding (48) to the numbers of full paths with all degrees at most and those of type for ((40) and (46)) we find that w.l.p. there are at most
[TABLE]
full paths (with all vertices of degree at most ). So, regardless of , we have
[TABLE]
as desired.
∎
7 Proof of (11)
As explained at the start of Section 6 we want to use (12) and finish via Lemma 4.2, but some ’s may be too large to support this. To handle this difficulty we introduce the notion of a “heavy path” below. We then set
[TABLE]
and show
[TABLE]
[TABLE]
It will turn out that we need different definitions of “heavy path”, depending on . Either of these will say that the number of non-heavy paths, say , joining any satisfies
[TABLE]
(Recall .) We will return to the definitions of heavy path and the proof of (50) in Subsections 7.1 and 7.2; here we assume (51) and give the easy proof of (49).
As suggested above this is a straightforward application of Lemma 4.2. Let and . Then with
[TABLE]
and the indicator of the event we have
[TABLE]
In addition, recalling (12), we have
[TABLE]
Hence Lemma 4.2 with gives
[TABLE]
as desired.
7.1 Proof of (50) when
For we say is heavy if
[TABLE]
and is a heavy path if is heavy. (Note that here we have .) So, in this case the notion of heavy depends only on the endpoints of the path. Note that this definition trivially implies (51).
A brief indication of why we need two definitions of a heavy path may be helpful. In the present case (i.e. ) we bound the number of cycles for which is a heavy path by first bounding the number of ’s (and similarly ’s) that are in heavy paths. To do this we show that for to be in a heavy path there must be some for which is “large”, and we use this necessary condition to bound the number of ’s in heavy paths.
Let
[TABLE]
Thus every cycle, , considered in this section must have and . We first bound and , and then use this to bound . A necessary condition for is
[TABLE]
To see this, fix and recall that for every vertex under discussion in (11). Thus, we know that for any there are at most paths . To pick to complete such a path with we require . Thus if for all then for any ,
[TABLE]
(Here the middle inequality comes from (33) with and .) So in order to bound it suffices to bound the number of ’s satisfying (52).
Since , Lemma 4.1 (with , , and ) gives
[TABLE]
Note that (see (8)) implies , so
[TABLE]
Thus,
[TABLE]
The initial disappears since implies .
Next we show that w.l.p. and are at most . The lemma will be stated in more generality as we will use it again after (7.1).
Lemma 7.1.
If and is a random subset of in which each is included independently with probability at most then .
Proof.
Here we apply Lemma 4.1 with , and . Note that since we know, say, ; so Lemma 4.1 gives
[TABLE]
∎
Hence .
We next show that for any
[TABLE]
We use (53) to bound (and again after (7.1)). To prove (53) we assume and are of the appropriate sizes and apply Lemma 4.1 with , , and . Note that , and, generously, . Also, since , we have . So for a given and of the appropriate size Lemma 4.1 gives
[TABLE]
Simply taking the union bound with the first sum over all possible and the next two over all we have
[TABLE]
It is easy to see (using and ) that for we have
[TABLE]
So (54) is, for example, at most . Therefore w.l.p.
[TABLE]
as desired. Specifically we have (w.l.p.)
[TABLE]
We next want to bound the number of full paths between and . For let
[TABLE]
We first bound the number of full paths such that at least one vertex in the path is not in the appropriate . Fixing , , and an index we bound the number of full paths with . Since for all under consideration, there are at most
[TABLE]
ways to choose with and
[TABLE]
ways to choose with . To complete the path we must have . Since we assume , there are at most choices for . Thus there are at most
[TABLE]
paths from to with .
If then we instead bound the number of choices for by
[TABLE]
and the number of ways to choose with by
[TABLE]
To complete the path we must have . Again, as we are assuming , there are at most choices for . So, there are at most
[TABLE]
paths from to with .
Now summing over , there are at most paths using at least one vertex outside of , and combining this with (56) bounds the number of cycles as in (50) (with some vertex outside of ) by
[TABLE]
The only cycles left to count are those with for all . We first bound . Lemma 4.1 with , , and (and the union bound) gives, for any ,
[TABLE]
As before, implies the r.h.s. of (58) is at most
[TABLE]
Hence,
[TABLE]
Again the initial disappears since implies . Given (7.1) Lemma 7.1 gives . Assuming this, (55) gives
[TABLE]
for all .
To finish the proof (for ) we use the following lemma due to Shearer [11]. We will use this lemma again when . To state it we require the following definition. (Recall a hypergraph on is simply a collection — possibly with repeats — of subsets of .)
For a hypergraph on the vertex set and , the trace of on is defined to be
[TABLE]
Lemma 7.2.
Suppose is a hypergraph on and is another hypergraph on such that every vertex in belongs to at least edges of . Then
[TABLE]
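In its usual form, the trace and Shearer's inequality read as follows. The symbols $\mathcal{H}$, $\mathcal{F}$, $V$, and $t$ are assumed notation for this sketch and need not match the paper's own.

```latex
% Trace of a hypergraph \mathcal{H} on a subset F of the vertex set V
% (notation assumed for illustration):
\[
  \mathrm{Tr}(\mathcal{H}, F) \;=\; \{\, E \cap F \;:\; E \in \mathcal{H} \,\}.
\]
% Shearer's lemma: if every vertex of V lies in at least t edges of \mathcal{F}, then
\[
  |\mathcal{H}| \;\le\; \prod_{F \in \mathcal{F}} \bigl|\mathrm{Tr}(\mathcal{H}, F)\bigr|^{1/t}.
\]
```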
To apply Lemma 7.2 here, let be the hypergraph on whose edges are the vertex sets of cycles using only vertices in . So is the number of cycles using only vertices in . Let be the hypergraph on with edges . Thus each vertex belongs to exactly two edges of . Furthermore
[TABLE]
Thus Lemma 7.2 gives
[TABLE]
Combining this with (57) gives (50) (for ).
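The way Shearer's lemma turns edge counts into a cycle count can be checked numerically on a toy instance. The sketch below is an illustration under assumed conventions, not the paper's construction: the layered vertex sets, the function name `layered_cycle_check`, and all parameters are hypothetical. Cycles take one vertex from each of `num_layers` disjoint layers, the covering hypergraph consists of the unions of cyclically consecutive layers (so each vertex lies in exactly two of its edges), and the trace of a cycle on such a union is a single edge between the two layers. Shearer's lemma with exponent 1/2 then bounds the number of cycles by the product of the square roots of the edge counts.

```python
import itertools
import math
import random


def layered_cycle_check(layer_size=4, num_layers=5, edge_prob=0.5, seed=0):
    """Count cycles with one vertex per layer and compare with the Shearer bound.

    Hypothetical toy setup: layers 0..num_layers-1 are disjoint vertex sets of
    equal size, and edges only join layer i to layer (i + 1) % num_layers.
    """
    rng = random.Random(seed)
    # edges[i] holds pairs (a, b): vertex a of layer i is adjacent to
    # vertex b of layer (i + 1) % num_layers.
    edges = [set() for _ in range(num_layers)]
    for i in range(num_layers):
        for a in range(layer_size):
            for b in range(layer_size):
                if rng.random() < edge_prob:
                    edges[i].add((a, b))

    # Brute-force count of cycles x_0, ..., x_{l-1} with x_i in layer i.
    cycles = 0
    for choice in itertools.product(range(layer_size), repeat=num_layers):
        if all(
            (choice[i], choice[(i + 1) % num_layers]) in edges[i]
            for i in range(num_layers)
        ):
            cycles += 1

    # Each trace on (layer i) union (layer i+1) is an edge between the two
    # layers, so Shearer's lemma with t = 2 gives the product below.
    bound = math.prod(math.sqrt(len(e)) for e in edges)
    return cycles, bound


if __name__ == "__main__":
    cycles, bound = layered_cycle_check()
    print(f"cycles = {cycles}, Shearer bound = {bound:.2f}")
```

The brute-force enumeration is only feasible for small layers; it is there to confirm the inequality, not to count cycles efficiently.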
7.2 Proof of (50) when
For we need the following definitions for and
[TABLE]
That is, if, for some , has at least 5 neighbors in that are “directly reachable” from . We say a path is heavy if for some . Note (as promised) we still have (51), since
[TABLE]
(Again recall .)
In this section we are bounding the number of cycles containing at least one vertex in some . To do this we fix and bound the number of cycles with .
We first observe that
[TABLE]
(where, as usual, is the maximum degree in .) For (63) Lemma 4.1 with (and any vertex), together with the union bound, gives
[TABLE]
So we may assume , whence, for any and ,
[TABLE]
Note that (since ).
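The pattern used for the maximum degree, a binomial tail estimate for one vertex followed by a union bound over all vertices, can be sketched numerically. The code below does not reproduce Lemma 4.1; it assumes the standard estimate P(Bin(m, q) >= k) <= (e m q / k)^k, and the function names and parameter values are hypothetical.

```python
import math


def binomial_upper_tail(m, q, k):
    """Exact P(Bin(m, q) >= k), obtained by summing the probability mass function."""
    return sum(
        math.comb(m, j) * q**j * (1 - q) ** (m - j) for j in range(k, m + 1)
    )


def tail_bound(m, q, k):
    """Standard estimate P(Bin(m, q) >= k) <= C(m, k) q^k <= (e*m*q/k)^k for 1 <= k <= m.

    Assumed stand-in for the paper's Lemma 4.1, whose exact form may differ.
    """
    return (math.e * m * q / k) ** k


def max_degree_union_bound(n, p, k):
    """Union bound over the n vertices of G(n, p) for P(max degree >= k), capped at 1."""
    return min(1.0, n * tail_bound(n - 1, p, k))


if __name__ == "__main__":
    m, q = 200, 0.01
    for k in (5, 10, 20):
        print(k, binomial_upper_tail(m, q, k), tail_bound(m, q, k))
    print(max_degree_union_bound(1000, 0.005, 40))
```

The estimate is only useful once k is a constant factor above the mean m q; below that it exceeds 1 and carries no information.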
We next show
[TABLE]
Here, for a given , we may think of — which does not depend on edges involving — as given. Then for a given we have (using (64))
[TABLE]
so applying Lemma 4.1 with and bounds the r.h.s. of (66) by
[TABLE]
Another application of Lemma 4.1, with , , and now gives (65):
[TABLE]
We may thus assume from now on that .
Given we bound the number of cycles with . This requires the following definitions (for ):
[TABLE]
(Note we are reading subscripts.)
Thus if and only if some cycle containing meets . We also set
[TABLE]
To bound the number of cycles involving some we need a bound on , but will actually bound the (larger) quantity
[TABLE]
As elsewhere the point here is to retain some independence; given , and do not depend on . Thus, having specified we may think of first exposing the edges of not involving — thus determining and — at which point is just a binomial to which we may apply Lemma 4.1. Note, however, that will not be independent of the choice of , so we will need to take a union bound over possibilities for .
We will show
[TABLE]
The eventual punchline here will be an application of Lemma 7.2 (Shearer’s Lemma) similar to the one in Section 7.1. This is the reason for the which, in applying the lemma, will be raised to the power .
Note that for all we have (very crudely in most cases)
[TABLE]
We apply Lemma 4.1 with
[TABLE]
A little checking (using ) confirms that, for example,
[TABLE]
Thus for specified and Lemma 4.1 gives
[TABLE]
and summing over possibilities for and (recalling that we have ) gives (67):
[TABLE]
Here for the final bound we use that and is small enough (see (7)).
To apply Lemma 7.2 here let be the hypergraph on where each edge is the vertex set of a cycle using only vertices in . Again let be the hypergraph on with edges . Thus each vertex belongs to exactly two edges of . Furthermore, (67) says
[TABLE]
Thus Lemma 7.2 gives
[TABLE]
as desired. So, summing over choices for , there are fewer than cycles using some . ∎
References
- [1] N. Alon and J. H. Spencer “The Probabilistic Method” New York: Wiley, 2015
- [2] J. Beck and W. Chen “Irregularities of Distribution” Cambridge: Cambridge Univ. Pr., 1987
- [3] J. van den Berg and H. Kesten “Inequalities with Applications to Percolation and Reliability” In Journal of Applied Probability 22.3, 1985, pp. 556–569
- [4] B. Bollobás “Modern Graph Theory” New York: Springer, 1998
- [5] S. Chatterjee “The missing log in large deviations for triangle counts” In Random Structures & Algorithms 40, 2011, pp. 437–451
- [6] S. Chatterjee and A. Dembo “Nonlinear large deviations” In Advances in Mathematics 299, 2016, pp. 396–450
- [7] N. Cook and A. Dembo “Large deviations of subgraph counts for sparse Erdős-Rényi graphs” In arXiv e-prints, 2018, arXiv:1809.11148 [math.PR]
- [8] R. De Marco and J. Kahn “Tight upper tail bounds for cliques” In Random Structures & Algorithms 41.4, 2012, pp. 469–487 URL: https://onlinelibrary.wiley.com/doi/abs/10.1002/rsa.20440
