Upper tail bounds for Stars

Matas \v{S}ileikis; Lutz Warnke

arXiv:1901.10637·math.PR·April 6, 2021·Electron. J. Comb.


TL;DR

This paper derives nearly optimal exponential bounds for the upper tail probabilities of the number of r-armed stars in a binomial random graph, extending previous results to a broader range of deviations.

Contribution

It establishes exponential bounds for the upper tail probabilities of star counts in G_{n,p} that are best possible up to constant factors in the exponent, solving a problem posed by Janson and Ruciński and confirming a conjecture by DeMarco and Kahn (both in the special case of stars K_{1,r}).

Findings

01

Derived exponential bounds are tight up to constant factors.

02

Extended the upper tail analysis beyond constant \epsilon to deviations \epsilon \ge n^{-\alpha}.

03

Confirmed conjecture and solved open problem for star counts.

Abstract

For r \ge 2, let X be the number of r-armed stars K_{1,r} in the binomial random graph G_{n,p}. We study the upper tail \Pr(X \ge (1+\epsilon)\E X), and establish exponential bounds which are best possible up to constant factors in the exponent (for the special case of stars K_{1,r} this solves a problem of Janson and Ruciński, and confirms a conjecture by DeMarco and Kahn). In contrast to the widely accepted standard for the upper tail problem, we do not restrict our attention to constant \epsilon, but also allow for \epsilon \ge n^{-\alpha} deviations.

Equations

p^{O_{H,\varepsilon}(n^{2}p^{r})}\leq{\mathbb{P}}(X_{H}\geq(1+\varepsilon){\mathbb{E}}X_{H})\leq e^{-\Omega_{H,\varepsilon}(n^{2}p^{r})},

p^{O_{r,\varepsilon}(n^{1+1/r}p)}\leq{\mathbb{P}}(X_{K_{1,r}}\geq(1+\varepsilon){\mathbb{E}}X_{K_{1,r}})\leq e^{-\Omega_{r,\varepsilon}(n^{1+1/r}p)}.

p^{O_{r,\varepsilon}(\max\{n^{1+1/r}p,\;n^{2}p^{r}\})}\leq{\mathbb{P}}(X_{K_{1,r}}\geq(1+\varepsilon){\mathbb{E}}X_{K_{1,r}})\leq e^{-\Omega_{r,\varepsilon}(\max\{n^{1+1/r}p,\;n^{2}p^{r}\})}.

-\log{\mathbb{P}}(X\geq(1+\varepsilon)\mu)=\Theta_{r,\varepsilon}\bigl(\Phi\bigr)\quad\text{ with }\quad\Phi:=\min\Bigl\{\mu,\;\max\bigl\{\mu^{1/r},\mu/n^{r-1}\bigr\}\log(1/p)\Bigr\}.

-\log{\mathbb{P}}(X\geq(1+\varepsilon)\mu)=\Theta_{r,\xi}\bigl(\Phi(\varepsilon)\bigr),

\Phi(\varepsilon):=\min\Bigl\{\varphi(\varepsilon)\mu^{2}/\sigma^{2},\;\max\bigl\{(\varepsilon\mu)^{1/r},(\varepsilon\mu)/n^{r-1}\bigr\}\log(e/p)\Bigr\}.

-\log{\mathbb{P}}(X\geq\mu+t)=\Theta_{r,\xi}\bigl(\Psi(t)\bigr)\quad\text{ with }\quad\Psi(t):=\min\Bigl\{t^{2}/\sigma^{2},\;M(t)\log(e/p)\Bigr\}.

-\log{\mathbb{P}}(X\geq\mu+t)=\Theta_{r,\xi}\bigl(M(t)\log(e/p)\bigr)\quad\text{ with }\quad M(t):=\max\Bigl\{t^{1/r},t/n^{r-1}\Bigr\}.

N_{D_{j}}<\frac{\beta M}{D_{j}}\quad\text{ for all }j\in\mathbb{N}=\{0,1,\dots\},

\begin{split}M=M(t)&:=\max\bigl\{t^{1/r},\>t/n^{r-1}\bigr\},\\ D_{j}=D_{j}(D)&:=2^{j}D.\end{split}

X\leq X_{D}+\sum_{0\leq j<J}\bigl(2N_{D_{j}}D_{j}\cdot 4D_{j}^{r-1}\bigr)\leq X_{D}+8\beta M\cdot\sum_{\begin{subarray}{c}j\in\mathbb{N}:D_{j}\leq{\overline{M}}\end{subarray}}D_{j}^{r-1}.

X-X_{D}\leq 8\beta M\cdot 2{\overline{M}}^{r-1}\leq 16\beta\cdot\min\{M^{r},Mn^{r-1}\}\leq t/2,

{\mathbb{P}}(Z_{C}\geq\mu+t)\leq\exp\left(-\frac{\varphi(t/\mu)\mu}{C}\right)\leq\exp\biggl(-\frac{t^{2}}{2C(\mu+t)}\biggr).

{\mathbb{P}}(X_{D}\geq\mu+t/2)\leq\exp\left(-\frac{\varphi(t/\mu)\mu}{16D^{r-1}}\right)\leq\exp\left(-\frac{\min\{t^{2}/\mu,t\}}{48D^{r-1}}\right).

\max_{\beta\in{\mathcal{J}}}|\{\alpha\in{\mathcal{J}}:\alpha\cap\beta\neq\varnothing\}|\leq r\cdot 2\binom{\lfloor{D}\rfloor}{r-1}\leq\frac{2rD^{r-1}}{(r-1)!}\leq 4D^{r-1}=:C.

\varphi(x/2)\geq\varphi(x)/4\quad\text{ and }\quad x^{2}\geq\varphi(x)\geq\min\{x,x^{2}\}/3.

{\mathbb{P}}(X_{D}\geq\mu+t/2)\leq{\mathbb{P}}(Z_{C}\geq\mu+t/2)\leq\exp\left(-\frac{\varphi(t/\mu)\mu}{4C}\right)\leq\exp\left(-\frac{\min\{t,t^{2}/\mu\}}{12C}\right),

\bigl(e^{3}np/D\bigr)^{D}\leq n^{-8}

{\mathbb{P}}(N_{D_{j}}\geq x)\leq\frac{1}{n^{3}}\left(\frac{np}{e\lceil{D_{j}}\rceil}\right)^{xD_{j}/2}\mathbbm{1}_{\{D_{j}\leq n\}}.

{\mathbb{P}}(N_{D_{j}}\geq x)\leq n^{\lceil{x}\rceil}\binom{n}{\lceil{D_{j}}\rceil}^{\lceil{x}\rceil}p^{\lceil{x}\rceil\lceil{D_{j}}\rceil}\leq\left(n\left(\frac{enp}{\lceil{D_{j}}\rceil}\right)^{\lceil{D_{j}}\rceil}\right)^{\lceil{x}\rceil}

\left(\frac{enp}{\lceil{D_{j}}\rceil}\right)^{\lceil{D_{j}}\rceil}=\left(\frac{e^{3}np}{\lceil{D_{j}}\rceil}\right)^{\lceil{D_{j}}\rceil/2}\left(\frac{np}{e\lceil{D_{j}}\rceil}\right)^{\lceil{D_{j}}\rceil/2}\leq\left(\frac{e^{3}np}{D}\right)^{D/2}\cdot\left(\frac{np}{e\lceil{D_{j}}\rceil}\right)^{\lceil{D_{j}}\rceil/2}\leq n^{-4}\left(\frac{np}{e\lceil{D_{j}}\rceil}\right)^{\lceil{D_{j}}\rceil/2}.

A:=\max\{e^{4},8/\gamma\},\quad s:=\log(e/p^{\gamma}),\quad\text{ and }\quad D:=A\cdot\max\Bigl\{1,\frac{\min\{\mu^{1/r},n\}}{s^{1/(r-1)}}\Bigr\}.

dn^{r+1}p^{r}\leq\mu\leq n^{r+1}p^{r}.

{\mathbb{P}}(X\geq(1+\varepsilon)\mu)\leq{\mathbb{P}}(X_{D}\geq\mu+\varepsilon\mu/2)+{\mathbb{P}}(\neg{\mathcal{T}}(\beta,D,\varepsilon\mu)).

\frac{np}{eD}=\frac{np^{1-\gamma}e^{-s}}{D}\leq\mathbbm{1}_{\{p\leq n^{-1/(1-\gamma)}\}}\frac{e^{-s}}{A}+\mathbbm{1}_{\{p>n^{-1/(1-\gamma)}\}}\frac{e^{-s}}{A\min\{d^{1/r}n^{1/r}p^{2\gamma},\;p^{2\gamma-1}\}}\leq\frac{e^{-s}}{A}\leq e^{-s},

\bigl(e^{3}np/D\bigr)^{D}\leq(p^{\gamma}/e)^{D}\leq\mathbbm{1}_{\{p\leq n^{-1}\}}n^{-A\gamma}+\mathbbm{1}_{\{p>n^{-1}\}}e^{-Anp^{1-\gamma}}\leq n^{-8}.

{\mathbb{P}}(\neg{\mathcal{T}}(\beta,D,\varepsilon\mu))\leq\sum_{j\in\mathbb{N}}{\mathbb{P}}(N_{D_{j}}\geq\beta M/D_{j})\leq n\cdot\frac{1}{n^{3}}\left(\frac{np}{eD}\right)^{\beta M/2}\leq\frac{1}{n^{2}}\cdot e^{-\beta Ms/2}.

\begin{split}{\mathbb{P}}(X\geq(1+\varepsilon)\mu)&\leq\exp\left(-\frac{\min\{\varepsilon,\varepsilon^{2}\}\mu}{48D^{r-1}}\right)+\frac{1}{n^{2}}\exp\left(-\frac{\beta Ms}{2}\right)\\ &\leq(1+n^{-2})\cdot\exp\Bigl(-c\underbrace{\min\{\varepsilon,\>\varepsilon^{2},\>\varepsilon^{1/r}\}}_{=:\zeta}\underbrace{\min\bigl\{\mu,\;\max\bigl\{\mu^{1/r},\>\mu/n^{r-1}\bigr\}s\bigr\}}_{=:\Pi}\Bigr).\end{split}

{\mathbb{P}}(X\geq(1+\varepsilon)\mu)\leq\frac{1}{1+\varepsilon}=1-\frac{\varepsilon}{1+\varepsilon}\leq\exp\left(-\frac{\varepsilon}{1+\varepsilon}\right)\leq\exp\Bigl(-\tfrac{c}{2}\min\{\varepsilon,1\}\zeta\Pi\Bigr).

N_{D_{j}}


Full text

Upper tail bounds for Stars

Matas Šileikis and Lutz Warnke

Matas Šileikis: Department of Theoretical Computer Science, Institute of Computer Science of the Czech Academy of Sciences, 182 07 Prague, Czech Republic. E-mail: [email protected]. With institutional support RVO:67985807. Research supported by the Czech Science Foundation, grant number GJ16-07822Y.

Lutz Warnke: School of Mathematics, Georgia Institute of Technology, Atlanta GA 30332, USA. E-mail: [email protected]. Research partially supported by NSF Grant DMS-1703516 and a Sloan Research Fellowship. Part of the work was done while the author was a member of the Department of Pure Mathematics and Mathematical Statistics, University of Cambridge.

(29 January 2019)

Abstract

For $r\geq 2$ , let $X$ be the number of $r$ -armed stars $K_{1,r}$ in the binomial random graph $G_{n,p}$ . We study the upper tail ${\mathbb{P}}(X\geq(1+\varepsilon){\mathbb{E}}X)$ , and establish exponential bounds which are best possible up to constant factors in the exponent (for the special case of stars $K_{1,r}$ this solves a problem of Janson and Ruciński, and confirms a conjecture by DeMarco and Kahn). In contrast to the widely accepted standard for the upper tail problem, we do not restrict our attention to constant $\varepsilon$ , but also allow for $\varepsilon\geq n^{-\alpha}$ deviations.

1 Introduction

The study of (the distribution of) small subgraphs in the binomial random graph $G_{n,p}$ is one of the most fundamental and influential problems in the theory of random graphs. Starting with the seminal work of Erdős and Rényi [11] from 1960, the early results for the number $X_{H}$ of copies of $H$ in $G_{n,p}$ concerned the threshold of appearance (i.e., when ${\mathbb{P}}(X_{H}>0)\to 1$ ) and the range of edge-probabilities $p$ for which $X_{H}$ is asymptotically normal; these basic features were eventually resolved in the 1980s by Bollobás [2] and Ruciński [25]. Later the focus changed to finer details of the distribution of $X_{H}$ , and the lower tail ${\mathbb{P}}(X_{H}\leq(1-\varepsilon){\mathbb{E}}X_{H})$ was studied intensively in the late 1980s (often for the special case $\varepsilon=1$ ). This led to the discovery of Janson’s inequality [13, 14, 24], which gives exponential bounds for ${\mathbb{P}}(X_{H}\leq(1-\varepsilon){\mathbb{E}}X_{H})$ that are best possible up to constant factors in the exponent (cf. the recent work of Janson and Warnke [20]).

Since the early 1990s the ‘infamous’ upper tail ${\mathbb{P}}(X_{H}\geq(1+\varepsilon){\mathbb{E}}X_{H})$ has remained an important challenge, providing a well-known testbed for concentration inequalities (see, e.g., [16]). After polynomial bounds around 1990 by Spencer [29] and exponential bounds in the late 1990s via the Kim–Vu polynomial concentration method [21, 30], in 2002 Janson, Oleszkiewicz and Ruciński [17] obtained a breakthrough: via a moment based method they obtained exponential estimates for ${\mathbb{P}}(X_{H}\geq(1+\varepsilon){\mathbb{E}}X_{H})$ that, for constant $\varepsilon$ , are best possible up to logarithmic factors in the exponent (see also [9, 19] for extensions to random hypergraphs, and arithmetic progressions in random subsets of integers). The upper tail problem of closing the aforementioned logarithmic gap has remained open during the last decade, and only recently this has been settled for cliques $K_{r}$ by DeMarco and Kahn [6, 7] (see also Chatterjee [3] for $r=3$ ), and for arithmetic progressions by Warnke [31]. Modern large deviation theory also gives partial results [4, 23, 1, 8] for large edge-probabilities $p\geq n^{-\delta_{H}}$ (this restriction sidesteps some difficulties of the upper tail problem).

In this paper we solve the upper tail problem for $r$ -armed stars $K_{1,r}$ , and as a conceptual novelty we will also allow $\varepsilon$ to depend on $n$ (i.e., do not restrict our attention to constant $\varepsilon$ , as usual). The casual reader might suspect that tail estimates for $r$ -armed stars are essentially trivial, but this is only true for $r=1$ (where $X_{K_{1,1}}=|E(G_{n,p})|$ since $K_{1,1}=K_{2}$ ). To put this into context, Janson, Oleszkiewicz and Ruciński [17] proved that for $r$ -regular graphs $H$ , such as cliques $K_{r+1}$ , the upper tail satisfies

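$$p^{O_{H,\varepsilon}(n^{2}p^{r})}\leq{\mathbb{P}}(X_{H}\geq(1+\varepsilon){\mathbb{E}}X_{H})\leq e^{-\Omega_{H,\varepsilon}(n^{2}p^{r})},$$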

where the subscripts in $O_{H,\varepsilon}$ and $\Omega_{H,\varepsilon}$ indicate that the implicit constants may depend on $H$ and $\varepsilon$. They also highlighted $K_{1,r}$ (with $r\geq 2$) as a key example where the form of the exponent is more complicated, i.e., has different expressions for different ranges of $p$. This surprising intricacy is further manifested by the history of the infamous upper tail problem. Namely, Vu [30] argued in 2000 that his general results were essentially unimprovable due to $r$-armed stars, for which he obtained bounds of the form

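$$p^{O_{r,\varepsilon}(n^{1+1/r}p)}\leq{\mathbb{P}}(X_{K_{1,r}}\geq(1+\varepsilon){\mathbb{E}}X_{K_{1,r}})\leq e^{-\Omega_{r,\varepsilon}(n^{1+1/r}p)}.\tag{1}$$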

However, Janson, Oleszkiewicz and Ruciński [17] later discovered that the upper tail behaviour of $K_{1,r}$ is more delicate (the lower bound in (1) is not always correct), and obtained bounds of the more involved form

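$$p^{O_{r,\varepsilon}(\max\{n^{1+1/r}p,\;n^{2}p^{r}\})}\leq{\mathbb{P}}(X_{K_{1,r}}\geq(1+\varepsilon){\mathbb{E}}X_{K_{1,r}})\leq e^{-\Omega_{r,\varepsilon}(\max\{n^{1+1/r}p,\;n^{2}p^{r}\})}.\tag{2}$$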

In words, for stars the form of the upper tail changes around $p\approx n^{-1/r}$ , which is an intriguing phenomenon (that does not occur for cliques). In fact, a recent conjecture of DeMarco and Kahn [7, 28] for general $H$ suggests that in (2) the ‘correct’ exponent involves yet another term $\mu:={\mathbb{E}}X_{K_{1,r}}=\Theta_{r}(n^{r+1}p^{r})$ , see (3). However, despite some partial results [26, 27, 33], the quest for matching bounds in (1)–(2) remained open.

1.1 Main results

Our first basic result settles the upper tail problem of $r$-armed stars for constant $\varepsilon$, by closing the existing $\log(1/p)$ gap in the exponent for all $p\in(0,1]$. In particular, (3) below confirms¹ Conjecture 10.1 of DeMarco and Kahn [7] in the special case $H=K_{1,r}$. For subgraph counts this is the first example of a sharp upper tail estimate where, for constant $\varepsilon$, the form of the exponent undergoes multiple phases (i.e., has more than two different expressions for different ranges of $p$).

¹ Using Corollary 1.8 in [17] and the discussion of Remark 8.3 in [17] it is not difficult to check that the special case $H=K_{1,r}$ of Conjecture 10.1 in [7] indeed reduces to (3) with $\varepsilon=1$; see also equation (4.27) in [27] and Remark 2 in [26].

Theorem 1 (Upper tail problem for constant $\varepsilon$ ).

Given $r\geq 2$ , let $X=X_{r,n,p}$ be the number of copies of $K_{1,r}$ in $G_{n,p}$ . Set $\mu:={\mathbb{E}}X$ . For $p\in(0,1]$ and $\varepsilon>0$ satisfying $1\leq(1+\varepsilon)\mu\leq X_{r,n,1}$ we have

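$$-\log{\mathbb{P}}(X\geq(1+\varepsilon)\mu)=\Theta_{r,\varepsilon}\bigl(\Phi\bigr)\quad\text{ with }\quad\Phi:=\min\Bigl\{\mu,\;\max\bigl\{\mu^{1/r},\mu/n^{r-1}\bigr\}\log(1/p)\Bigr\}.\tag{3}$$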

Note that the assumption $(1+\varepsilon)\mu\leq X_{r,n,1}$ is necessary (since $X>X_{r,n,1}$ is impossible), and that the assumption $(1+\varepsilon)\mu\geq 1$ is natural (since otherwise ${\mathbb{P}}(X\geq(1+\varepsilon)\mu)={\mathbb{P}}(X\geq 1)=1-{\mathbb{P}}(X=0)$ holds). We now motivate the intricate form of the exponent in (3) for $\varepsilon=1$. First, Poisson approximation heuristics suggest that ${\mathbb{P}}(X\geq 2\mu)\approx e^{-\Theta(\mu)}$ for small edge-probabilities $p$. Second, it turns out that $m=\Theta_{r}(\max\{\mu^{1/r},\mu/n^{r-1}\})$ appropriately clustered² edges $F$ suffice to create $2\mu$ copies of $K_{1,r}$, which implies ${\mathbb{P}}(X\geq 2\mu)\geq{\mathbb{P}}(F\subseteq G_{n,p})\geq p^{m}=e^{-m\log(1/p)}$. Intuitively, Theorem 1 confirms that the more likely of these two mechanisms (the one with larger probability) controls the upper tail behaviour for constant $\varepsilon$.

² For example, complete bipartite graphs $K_{y,z}$ with suitable $y=\Theta_{r}(\min\{\mu^{1/r},n\})$ and $z=\Theta_{r}(\mu/y^{r})$ suffice: they contain $\geq z\binom{y}{r}=\Theta_{r}(zy^{r})=\Theta_{r}(\mu)$ stars and $yz=\Theta_{r}(\mu/y^{r-1})=\Theta_{r}(\max\{\mu^{1/r},\mu/n^{r-1}\})$ edges; see Lemma 14 for more details.
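To make the competition between the two mechanisms concrete, the following minimal sketch evaluates the exponent $\Phi=\min\{\mu,\max\{\mu^{1/r},\mu/n^{r-1}\}\log(1/p)\}$ from (3) and reports which term attains it. The function names `mu_stars` and `phi_regime` are illustrative and not from the paper; the formula $\mu=n\binom{n-1}{r}p^{r}$ is the standard expectation (choose a centre, then $r$ leaves), and all implicit constants are ignored.

```python
from math import comb, log


def mu_stars(n: int, p: float, r: int) -> float:
    """Expected number of K_{1,r} in G_{n,p}: n centres, C(n-1, r) leaf sets, p^r each."""
    return n * comb(n - 1, r) * p**r


def phi_regime(n: int, p: float, r: int):
    """Return (Phi, name of the term attaining it), ignoring implicit constants."""
    mu = mu_stars(n, p, r)
    root_term = mu ** (1 / r)        # mu^{1/r}
    dense_term = mu / n ** (r - 1)   # mu / n^{r-1}
    clustered = max(root_term, dense_term) * log(1 / p)
    if mu <= clustered:
        return mu, "mu"
    name = "mu^(1/r)" if root_term >= dense_term else "mu/n^(r-1)"
    return clustered, name


if __name__ == "__main__":
    # Illustrative parameters only: n = 1000, r = 2, three edge-probabilities.
    for p in (1e-4, 1e-2, 0.5):
        value, name = phi_regime(1000, p, 2)
        print(f"p={p:g}: Phi ~ {value:.3g}, attained by {name}")
```

For these illustrative parameters each of the three expressions is attained once, which matches the "multiple phases" described above.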

Our second result determines the correct dependence of the stars upper tail on $\varepsilon$ , up to constant factors in the exponent (this contrasts Theorem 1 above, where the implicit constants may depend on $\varepsilon$ ). In particular, (4) below solves Problem 6.1 of Janson and Ruciński [18] in the special case $H=K_{1,r}$ . For subgraph counts this is the first example where, for $p$ bounded away from one, the order of the large deviation rate function $-\log{\mathbb{P}}(X\geq(1+\varepsilon)\mu)$ is determined for $\varepsilon=\varepsilon(n)$ of form $\varepsilon\geq n^{-\alpha}$ (the assumption $\Phi(\varepsilon)\geq 1$ is natural, since it ensures that we are dealing with exponentially small probabilities).

Theorem 2 (Upper tail problem for $\varepsilon=\varepsilon(n)\geq n^{-\alpha}$ ).

Given $r\geq 2$ , let $X=X_{r,n,p}$ be the number of copies of $K_{1,r}$ in $G_{n,p}$ . Set $\mu:={\mathbb{E}}X$ , $\sigma^{2}:=\operatorname{Var}X$ , and $\varphi(x):=(1+x)\log(1+x)-x$ . Given $\xi\in(0,1)$ there is $\alpha=\alpha(r)>0$ such that, for $p\in(0,1-\xi]$ and $\varepsilon\geq n^{-\alpha}$ satisfying $\Phi(\varepsilon)\geq 1$ and $1\leq(1+\varepsilon)\mu\leq X_{r,n,1}$ , we have

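$$-\log{\mathbb{P}}(X\geq(1+\varepsilon)\mu)=\Theta_{r,\xi}\bigl(\Phi(\varepsilon)\bigr),\tag{4}$$

with

$$\Phi(\varepsilon):=\min\Bigl\{\varphi(\varepsilon)\mu^{2}/\sigma^{2},\;\max\bigl\{(\varepsilon\mu)^{1/r},(\varepsilon\mu)/n^{r-1}\bigr\}\log(e/p)\Bigr\}.\tag{5}$$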

Remark 3.

The variance satisfies $\sigma^{2}=\Theta_{r}((1-p)\mu(1+(np)^{r-1}))$ ; see, e.g., Lemma 3.5 in [15]. Furthermore, if $\mu^{1-1/r}\geq\log n$ holds, then in (5) we can replace $\varphi(\varepsilon)\mu^{2}/\sigma^{2}$ by $(\varepsilon\mu)^{2}/\sigma^{2}$ ; see Lemma 12.

Conjecture 4 (Correct upper tail behaviour).

Theorem 2 remains valid without the assumption $\varepsilon\geq n^{-\alpha}$ .

We now motivate the somewhat unusual form of the exponent in (4). First, normal approximation heuristics³ suggest that ${\mathbb{P}}(X\geq(1+\varepsilon)\mu)\approx e^{-\Theta_{r}((\varepsilon\mu)^{2}/\sigma^{2})}$ for very small $\varepsilon$, and this sub-Gaussian tail is consistent with the $\varphi(\varepsilon)\mu^{2}/\sigma^{2}$ term in (5) since $\varphi(\varepsilon)=\Theta(\varepsilon^{2})$ as $\varepsilon\to 0$ (the function $\varphi$ is well-known from Chernoff bounds). Second, in $G_{n,p}$ we usually expect to have at least $(1-\varepsilon)\mu$ copies of $K_{1,r}$, say, so enforcing $2\varepsilon\mu$ extra copies via $m=\Theta_{r}(\max\{(\varepsilon\mu)^{1/r},(\varepsilon\mu)/n^{r-1}\})$ appropriately clustered⁴ edges $F$ should thus be enough to give a total of $(1+\varepsilon)\mu$ copies of $K_{1,r}$; this heuristic loosely suggests ${\mathbb{P}}(X\geq(1+\varepsilon)\mu)\geq\Omega_{r}(1)\cdot{\mathbb{P}}(F\subseteq G_{n,p})\geq\Omega_{r}(p^{m})\geq e^{-O_{r}(m\log(1/p))}$. Intuitively, Conjecture 4 predicts that the form of the upper tail is indeed determined by either sub-Gaussian or ‘clustered’ behaviour, and Theorem 2 confirms this for $\varepsilon=\varepsilon(n)\geq n^{-\alpha}$.

³ The same normal heuristic suggests that in (3) we should perhaps have used $\mu^{2}/\sigma^{2}$ instead of $\mu$, but it turns out that then the $\mu^{2}/\sigma^{2}$ term would only matter for $\Phi$ (i.e., determine the minimum) in a range of $p$ where $\mu^{2}/\sigma^{2}=\Theta_{r,\varepsilon}(\mu)$ holds.

⁴ As before, complete bipartite graphs $K_{y,z}$ with suitable $y=\Theta_{r}(\min\{(\varepsilon\mu)^{1/r},n\})$ and $z=\Theta_{r}(\varepsilon\mu/y^{r})$ suffice (see Lemma 14).
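The ‘clustered’ construction can be checked on a small instance. The sketch below builds a complete bipartite graph $K_{y,z}$ and counts its copies of $K_{1,r}$ through the degree sequence, confirming that it has $yz$ edges and at least $z\binom{y}{r}$ stars. The helper names are hypothetical, and the values of $y$, $z$ are arbitrary small numbers rather than the $\Theta_{r}(\cdot)$ choices of Lemma 14.

```python
from math import comb


def complete_bipartite(y: int, z: int) -> dict:
    """Adjacency sets of K_{y,z}: each of the z 'b'-vertices sees all y 'a'-vertices."""
    side_a = [("a", i) for i in range(y)]
    side_b = [("b", j) for j in range(z)]
    adj = {v: set() for v in side_a + side_b}
    for u in side_a:
        for v in side_b:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def count_stars(adj: dict, r: int) -> int:
    """Copies of K_{1,r} for r >= 2: a vertex of degree d is the centre of C(d, r) stars."""
    return sum(comb(len(nbrs), r) for nbrs in adj.values())


def count_edges(adj: dict) -> int:
    """Each edge is seen from both endpoints, hence the division by two."""
    return sum(len(nbrs) for nbrs in adj.values()) // 2


if __name__ == "__main__":
    y, z, r = 6, 3, 2  # arbitrary illustrative sizes
    graph = complete_bipartite(y, z)
    print("edges:", count_edges(graph), "stars:", count_stars(graph, r))
```

The stars centred on the $z$-side alone already give $z\binom{y}{r}$ copies, which is the lower bound used in the footnote; stars centred on the $y$-side only add to the count.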

Our third result approaches the upper tail problem from a conceptually slightly different perspective, studying ${\mathbb{P}}(X\geq\mu+t)$ for general deviations $t$ (this contrasts Theorems 1 and 2 above, where we focus on the large deviations range $t=\varepsilon\mu$ and then put restrictions on $\varepsilon$). For subgraph counts, inequality (6) below is the first example where, for moderately large edge-probabilities $p$, the order of $-\log{\mathbb{P}}(X\geq\mu+t)$ is completely resolved for all exponentially small deviations (where $t\geq\sigma$ is the natural target assumption). We complement this result with inequality (7) below, which is the first example where the order of $-\log{\mathbb{P}}(X\geq\mu+t)$ is resolved for nearly all deviations $t$ where the ‘clustered’ behaviour determines the exponent (here $t^{2}/\sigma^{2}\geq M(t)\log(e/p)$ is the natural target assumption for $\mu^{1-1/r}\geq\log n$; see (5), Remark 3, and Conjecture 4).

Theorem 5 (General upper tail bounds: moderate deviations and clustered regime).

Given $r\geq 2$ , let $X=X_{r,n,p}$ be the number of copies of $K_{1,r}$ in $G_{n,p}$ . Set $\mu:={\mathbb{E}}X$ and $\sigma^{2}:=\operatorname{Var}X$ . Given $\xi\in(0,1)$ , then the following holds whenever $p\in(0,1-\xi]$ and $1\leq\mu+t\leq X_{r,n,1}$ .

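(i) If $p\geq(\log n)/n$ and $t\geq\sigma$, then

$$-\log{\mathbb{P}}(X\geq\mu+t)=\Theta_{r,\xi}\bigl(\Psi(t)\bigr)\quad\text{ with }\quad\Psi(t):=\min\Bigl\{t^{2}/\sigma^{2},\;M(t)\log(e/p)\Bigr\}.\tag{6}$$

(ii) If $\mu\geq\xi$ and $t>0$ satisfies $t^{2}/\sigma^{2}\geq M(t)\log(e/p)\cdot(\log n)^{2r}$, then

$$-\log{\mathbb{P}}(X\geq\mu+t)=\Theta_{r,\xi}\bigl(M(t)\log(e/p)\bigr)\quad\text{ with }\quad M(t):=\max\Bigl\{t^{1/r},t/n^{r-1}\Bigr\}.\tag{7}$$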

By Remark 3, inequalities (6)–(7) provide further evidence for Conjecture 4 (and verify it for $p\geq(\log n)/n$ ).

1.2 Some comments

The main focus of this paper is upper bounds on the upper tail ${\mathbb{P}}(X\geq(1+\varepsilon)\mu)$. Developing [31, 33], our high-level proof strategy is based on the idea that (after ignoring certain ‘bad’ events with negligible probabilities) using combinatorial arguments we can find a ‘well-behaved’ subgraph $G_{0}\subseteq G_{n,p}$ in the sense that (i) the numbers of stars $K_{1,r}$ in $G_{0}$ and $G_{n,p}$ are approximately the same (differ by at most $\varepsilon\mu/2$, say), and (ii) the maximum degree of $G_{0}$ is ‘not too large’ (which intuitively helps for showing concentration of stars). Using modern Chernoff-like upper tail bounds, we then show that it is very unlikely to have a ‘bad’ subgraph $G^{\prime}\subseteq G_{n,p}$ with ‘not too large’ maximum degree and ‘many’ stars (at least $(1+\varepsilon/2)\mu$ many, say). Putting things together, the punchline is then that we can only have $X\geq(1+\varepsilon)\mu$ many stars if one of the discussed unlikely ‘bad’ events occurs, which (after some technical work) eventually gives the desired upper bounds on the upper tail ${\mathbb{P}}(X\geq(1+\varepsilon)\mu)$; see Section 2 for more details.

Finally, let us briefly compare our upper tail results for stars with very recent results from the large deviation theory literature, which are spearheaded by Chatterjee, Dembo, Lubetzky, Varadhan, Zhao, and many others (see, e.g., [5, 22, 4, 23, 10, 1, 8]). For general $H$ , these aim at determining the asymptotics of $-\log{\mathbb{P}}(X_{H}\geq(1+\varepsilon){\mathbb{E}}X_{H})$ for constant $\varepsilon$ and large edge-probabilities of form $p=\Theta(1)$ or $p\geq n^{-\delta_{H}}$ . For stars $H=K_{1,r}$ , inequality (4) from Theorem 2 is weaker in the sense that it only determines $-\log{\mathbb{P}}(X_{K_{1,r}}\geq(1+\varepsilon){\mathbb{E}}X_{K_{1,r}})$ up to constant factors, but it is stronger in the sense that it covers a much wider range of the parameters, including $\varepsilon=\varepsilon(n)\geq n^{-\alpha}$ and all $p=p(n)$ of interest. Obtaining such tail estimates with increased ranges of applicability is useful for combinatorial applications, where one is usually ‘willing to give up a little bit on the tail’, in particular on the ‘inessential numerical constants’ in the exponent (see [30, 18]). Furthermore, estimates of form (6)–(7) are also quite satisfactory from a concentration inequality perspective. Overall, we hope that our results stimulate more research into such estimates for other graphs $H$ .

1.3 Organization

In Section 2 we prove the upper bounds on the upper tail from Theorems 1, 2, and 5 (and discuss a simple extension). The corresponding (fairly routine) lower bounds are then established in Appendix A.

2 Upper bounds on the upper tail

In this section we establish the upper bounds on the upper tail ${\mathbb{P}}(X\geq(1+\varepsilon)\mu)$ from Theorems 1, 2, and 5. Our core argument has two strands. In the first combinatorial part we iteratively decrease the maximum degree of the random graph $G_{n,p}=G_{J}\supseteq\cdots\supseteq G_{0}$ by edge-deletion (the idea is to remove large stars $K_{1,D_{j}}$ with $D_{j}\gg r$ from $G_{j}$) until the final graph $G_{0}$ has sufficiently low maximum degree, say at most $D$. This degree bound allows us to estimate the number of stars $K_{1,r}$ in $G_{0}$ via a ‘well-behaved’ auxiliary random variable $X_{D}$. Taking into account the number of stars $K_{1,r}$ which are removed when passing from $G_{n,p}=G_{J}$ to $G_{0}$, this allows us (by means of a technical event ${\mathcal{T}}$) to approximate the number $X=X_{r,n,p}$ of copies of $K_{1,r}$ in $G_{n,p}$ using $X_{D}$ and several further auxiliary random variables $N_{D_{j}}$ (which intuitively bound the number of $K_{1,D_{j}}$ in $G_{n,p}$). In the second probabilistic part we then estimate the upper tails of these auxiliary variables using a concentration inequality of Warnke [31] and ad-hoc arguments (exploiting the careful definitions of the variables $X_{D}$ and $N_{D_{j}}$ given in Section 2.1). Putting things together, the core argument then proceeds roughly as follows: by the combinatorial part $X\geq(1+\varepsilon)\mu$ can only happen if at least one of the auxiliary variables $X_{D}$ or $N_{D_{j}}$ is ‘large’, and by the probabilistic part the probability of this ‘bad’ event is at most the desired ‘correct’ upper tail probability (for suitable choices of the degree constraint $D$ and other parameters).

In Section 2.1 we first illustrate this argument for the simpler setup of Theorem 1, and in Section 2.2 we then extend the argument to the more precise tail estimates of Theorems 2 and 5. Finally, in Section 2.3 we also briefly discuss a straightforward extension (to a certain sum of iid variables).

2.1 Core argument for Theorem 1

We start by introducing the main random variables and events for Theorem 1 (as we shall see, their careful definitions will facilitate the interplay between the combinatorial and probabilistic parts of our argument). For $x\geq 0$ , let $X_{x}$ denote the maximum number of copies of $K_{1,r}$ in any subgraph $H\subseteq G_{n,p}$ with maximum degree at most $x$ . For $y>0$ , let $N_{y}$ denote the maximum size of any collection of edge-disjoint $K_{1,\lceil{y}\rceil}$ in $G_{n,p}$ . For $\beta,D,t>0$ let ${\mathcal{T}}={\mathcal{T}}(\beta,D,t)$ denote the ‘technical’ event that

$$N_{D_{j}}<\frac{\beta M}{D_{j}}\quad\text{ for all }j\in\mathbb{N}=\{0,1,\dots\},\tag{8}$$

where we tacitly used the following convenient parametrization:

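$$\begin{split}M=M(t)&:=\max\bigl\{t^{1/r},\>t/n^{r-1}\bigr\},\\ D_{j}=D_{j}(D)&:=2^{j}D.\end{split}\tag{9}$$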

(In this subsection we shall only use $t=\varepsilon\mu$ ; working with general $t$ is convenient for the later extensions.)
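As a small numerical illustration of this parametrization, the sketch below computes $M(t)$, the dyadic degree levels $D_{j}=2^{j}D$, and the thresholds $\beta M/D_{j}$ that the event ${\mathcal{T}}(\beta,D,t)$ imposes on $N_{D_{j}}$. The function name `technical_thresholds` and the sample parameters are assumptions made for illustration only.

```python
def technical_thresholds(beta: float, D: float, t: float, n: int, r: int, levels: int):
    """Return M(t) and the rows (j, D_j, beta*M/D_j) for j = 0, ..., levels-1."""
    M = max(t ** (1 / r), t / n ** (r - 1))
    rows = []
    for j in range(levels):
        D_j = 2**j * D
        rows.append((j, D_j, beta * M / D_j))
    return M, rows


if __name__ == "__main__":
    # Illustrative values: t = 10^4, n = 1000, r = 2, beta = 1/32, D = 1.
    M, rows = technical_thresholds(beta=1 / 32, D=1.0, t=1e4, n=1000, r=2, levels=4)
    print("M =", M)
    for j, D_j, bound in rows:
        # N_{D_j} is a non-negative integer, so a bound <= 1 forces N_{D_j} = 0.
        note = "forces N = 0" if bound <= 1 else ""
        print(f"j={j}: D_j={D_j:g}, require N_D_j < {bound:.4g} {note}")
```

Since $N_{D_{j}}$ counts stars, the thresholds halve with every level and eventually drop to at most one, at which point the event ${\mathcal{T}}$ simply forbids any copy of $K_{1,\lceil D_{j}\rceil}$.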

The following combinatorial lemma is at the heart of our argument, and it intuitively states that $X\approx X_{D}$ whenever the event ${\mathcal{T}}={\mathcal{T}}(\beta,D,t)$ holds. Its proof is inspired by ideas developed in [31, 33], but contains several new ideas. For example, instead of iteratively sparsifying an auxiliary hypergraph (which encodes the edge-sets of all stars $K_{1,r}$ in $G_{n,p}$ ) we here iteratively sparsify the random graph $G_{n,p}$ itself. Furthermore, in order to obtain the correct tail behaviour, in inequality (8) we need to work with $M=\max\{t^{1/r},t/n^{r-1}\}$ instead of the simpler choice $M=t^{1/r}$ suggested by [31] (we achieve this by adding an extra degree bound to the argument, bounding the initial maximum degree by ${\overline{M}}=\min\{M,n\}$ instead of just $M$ ).

Lemma 6.

Given $\beta\in(0,1/32]$ and $D,t>0$ , the event ${\mathcal{T}}(\beta,D,t)$ implies $X_{D}\leq X\leq X_{D}+t/2$ .

The lower bound $X\geq X_{D}$ of Lemma 6 is trivial. For the upper bound the idea is to iteratively decrease the maximum degree of $G_{n,p}$ , yielding $G_{n,p}=G_{J}\supseteq\cdots\supseteq G_{0}$ . By bounding the number of $K_{1,r}$ which are removed when passing from $G_{j+1}$ to $G_{j}$ , this eventually allows us to estimate the total number of $K_{1,r}$ .

Proof of Lemma 6.

Define ${\overline{M}}:=\min\{M,n\}$. Let $J$ be the smallest integer $J\geq 0$ with $D_{J}\geq{\overline{M}}$. We set $G_{J}=G_{n,p}$ and inductively construct $G_{J}\supseteq\cdots\supseteq G_{0}$. Given $G_{j+1}$, $0\leq j\leq J-1$, let ${\mathcal{C}}_{j+1}$ be a maximal collection of edge-disjoint stars $K_{1,\lceil{D_{j}}\rceil}$ in $G_{j+1}$. We remove all edges from $G_{j+1}$ which are incident to a centre vertex of some star in ${\mathcal{C}}_{j+1}$, and denote the resulting graph by $G_{j}$.

Writing $\Delta_{j}=\Delta(G_{j})$ for the maximum degree of $G_{j}$, we claim that $\Delta_{j}\leq D_{j}$ for all $0\leq j\leq J$. For $G_{J}=G_{n,p}$ we use a case distinction. If $M\geq n$, then trivially $\Delta_{J}\leq n={\overline{M}}\leq D_{J}$. Otherwise $D_{J}\geq{\overline{M}}=M$, in which case (8) entails $N_{D_{J}}<\beta M/D_{J}\leq\beta<1$, so $G_{n,p}=G_{J}$ contains no $K_{1,\lceil{D_{J}}\rceil}$, and $\Delta_{J}\leq\lceil{D_{J}}\rceil-1\leq D_{J}$ follows. Further considering $G_{j}$ with $0\leq j\leq J-1$, we note that $\Delta_{j}\leq\lceil{D_{j}}\rceil-1\leq D_{j}$ by construction, because otherwise we could add another $K_{1,\lceil{D_{j}}\rceil}$ to ${\mathcal{C}}_{j+1}$ (contradicting maximality).
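The construction of $G_{J}\supseteq\cdots\supseteq G_{0}$ can be sketched in code. The version below is only an illustration under stated assumptions: it picks the maximal collection ${\mathcal{C}}_{j+1}$ greedily (the proof allows any maximal collection), it takes ${\overline{M}}$ as an input rather than deriving it from $t$, and it does not check the event ${\mathcal{T}}$. The names `random_graph`, `peel_once` and `peel` are hypothetical.

```python
import random
from math import ceil


def random_graph(n: int, p: float, seed: int = 0) -> dict:
    """Sample G_{n,p} as a dict of adjacency sets."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def max_degree(adj: dict) -> int:
    return max((len(nbrs) for nbrs in adj.values()), default=0)


def peel_once(adj: dict, k: int):
    """Greedily pick a maximal collection of edge-disjoint K_{1,k}, then delete
    every edge incident to a centre. Returns (new adjacency, number of stars)."""
    used = set()      # edges (frozensets) already assigned to a chosen star
    centres = set()
    stars = 0
    for v in adj:
        free = [u for u in adj[v] if frozenset((u, v)) not in used]
        while len(free) >= k:
            leaves, free = free[:k], free[k:]
            used.update(frozenset((u, v)) for u in leaves)
            centres.add(v)
            stars += 1
    new_adj = {
        v: set() if v in centres else {u for u in adj[v] if u not in centres}
        for v in adj
    }
    return new_adj, stars


def peel(adj: dict, D: float, M_bar: float):
    """Build G_J, ..., G_0 with D_j = 2^j * D and J minimal such that D_J >= M_bar."""
    J = 0
    while 2**J * D < M_bar:
        J += 1
    graphs = {J: adj}
    for j in range(J - 1, -1, -1):
        graphs[j], _ = peel_once(graphs[j + 1], ceil(2**j * D))
    return graphs, J


if __name__ == "__main__":
    G = random_graph(60, 0.3, seed=1)
    graphs, J = peel(G, D=2.0, M_bar=60)
    for j in range(J, -1, -1):
        print(f"j={j}: D_j={2**j * 2.0:g}, max degree={max_degree(graphs[j])}")
```

The greedy choice is maximal because, once a vertex has been processed, it has fewer than $k$ unused incident edges, and later steps can only use up more of them; this is the same maximality that yields $\Delta_{j}\leq\lceil D_{j}\rceil-1$ in the claim above.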

With $G_{J}\supseteq\cdots\supseteq G_{0}$ in hand, we now count the total number of copies of $K_{1,r}$ in $G_{n,p}=G_{J}$. Note that, given an edge $e=\{v_{1},v_{2}\}$ of $G_{j+1}$ with $0\leq j<J$, we can construct any $K_{1,r}$ in $G_{j+1}$ containing $e$ by first selecting a centre vertex $v_{c}\in\{v_{1},v_{2}\}$ and then $r-1$ additional neighbours of $v_{c}$. Hence, using $\Delta_{j+1}\leq D_{j+1}=2D_{j}$, in $G_{j+1}$ any edge is contained in at most $2\binom{\Delta_{j+1}}{r-1}\leq 2^{r}D_{j}^{r-1}/(r-1)!\leq 4D_{j}^{r-1}$ copies of $K_{1,r}$. Recalling the definition of $N_{D_{j}}$, note that, when passing from $G_{j+1}$ to $G_{j}$, we remove at most $N_{D_{j}}\Delta_{j+1}\leq 2N_{D_{j}}D_{j}$ edges. So, since $G_{0}$ contains at most $X_{D_{0}}=X_{D}$ copies of $K_{1,r}$, using (8) and $\max_{0\leq j<J}D_{j}\leq{\overline{M}}$ it follows that

[TABLE]

Recalling $D_{j}=2^{j}D$ and $r\geq 2$ , using ${\overline{M}}=\min\{M,n\}$ , $M=\max\{t^{1/r},\>t/n^{r-1}\}$ and $\beta\leq 1/32$ we infer

[TABLE]

which completes the proof. ∎

Applying Lemma 6 with $t=\varepsilon\mu$ , in the probabilistic part of the argument it remains to estimate ${\mathbb{P}}(X_{D}\geq\mu+\varepsilon\mu/2)$ and ${\mathbb{P}}(\neg{\mathcal{T}}(\beta,D,\varepsilon\mu))$ . We shall exploit the maximum degree constraint of $X_{D}$ via the following upper tail inequality of Warnke [31], which extends classical Chernoff bounds to random variables with ‘well-behaved dependencies’ (and allows us to go beyond the method of typical bounded differences [32]).

Theorem 7 (Corollary of [31, Theorem 9]).

Let $(\xi_{i})_{i\in{\mathfrak{S}}}$ be a finite family of independent random variables with $\xi_{i}\in\{0,1\}$ . Given a family ${\mathcal{I}}$ of subsets of ${\mathfrak{S}}$ , consider random variables $Y_{\alpha}:=\prod_{i\in\alpha}\xi_{i}$ with $\alpha\in{\mathcal{I}}$ , and suppose $\sum_{\alpha\in{\mathcal{I}}}{\mathbb{E}}Y_{\alpha}\leq\mu$ . Define $Z_{C}:=\max\sum_{\alpha\in{\mathcal{J}}}Y_{\alpha}$ , where the maximum is taken over all ${\mathcal{J}}\subseteq{\mathcal{I}}$ with $\max_{\beta\in{\mathcal{J}}}|\{\alpha\in{\mathcal{J}}:\alpha\cap\beta\neq\varnothing\}|\leq C$ . Set $\varphi(x):=(1+x)\log(1+x)-x$ . Then, for all $C,t>0$ ,

[TABLE]
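To get a feel for the function $\varphi(x)=(1+x)\log(1+x)-x$ appearing in Theorem 7, the following small Python sketch evaluates it and numerically checks two standard calculus estimates, namely $\min\{x,x^{2}\}/3\leq\varphi(x)\leq x^{2}/2$ for $x\geq 0$. These particular estimates are our own choice for illustration; we do not claim that they coincide with any displayed inequality of the paper. The helper names are hypothetical.

```python
import math


def phi(x):
    """phi(x) = (1 + x) * log(1 + x) - x for x >= 0."""
    return (1.0 + x) * math.log1p(x) - x


def phi_bounds_hold(xs):
    """Check min(x, x^2)/3 <= phi(x) <= x^2/2 on the given sample points."""
    for x in xs:
        lower = min(x, x * x) / 3.0
        upper = x * x / 2.0
        if not (lower <= phi(x) <= upper):
            return False
    return True
```

The quadratic regime (small $x$) corresponds to sub-Gaussian tails, whereas for large $x$ the function grows like $x\log x$, which is the source of the additional logarithmic factors in Chernoff-type bounds.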

The main observation is that, in every subgraph $H\subseteq G_{n,p}$ with maximum degree at most $D$ , any star $K_{1,r}$ shares edges with $O(D^{r-1})$ other stars. For $X_{D}$ this allows us to routinely apply Theorem 7 with Lipschitz-like parameter $C=O(D^{r-1})$ , making inequality (13) plausible. For Theorem 1 the crux is that our choice of $D$ will ensure $\mu/D^{r-1}=\Theta_{r}(\Phi)$ , so (13) suggests that $X_{D}\leq\mu+\varepsilon\mu/2$ fails with probability at most $e^{-\Omega_{r,\varepsilon}(\Phi)}$ .

Corollary 8.

For all $n\geq 1$ , $p\in(0,1]$ and $D,t>0$ we have

[TABLE]

Proof.

Let ${\mathcal{K}}_{1,r}(G)$ contain all edge-subsets of $G$ that are isomorphic to $K_{1,r}$ . Writing $Y_{\alpha}:=\mathbbm{1}_{\{{\alpha\subseteq E(G_{n,p})}\}}$ , there is a subgraph $H\subseteq G_{n,p}$ with maximum degree at most $\lfloor{D}\rfloor$ such that $X_{D}=\sum_{\alpha\in{\mathcal{J}}}Y_{\alpha}$ for ${\mathcal{J}}:={\mathcal{K}}_{1,r}(H)$ . Given $\beta\in{\mathcal{J}}$ , we construct all edge-intersecting stars $\alpha\in{\mathcal{J}}$ as in the proof of Lemma 6, and infer

[TABLE]

It follows that $X_{D}\leq Z_{C}$ , where $Z_{C}$ is defined as in Theorem 7 with ${\mathcal{I}}={\mathcal{K}}_{1,r}(K_{n})$ . It is well-known (and easy to check by calculus) that for $x\geq 0$ we have

[TABLE]

Putting things together, using Theorem 7 and (15) it follows that

[TABLE]

which completes the proof of (13) by choice of $C$ (see (14) above). ∎
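The combinatorial observation behind this proof (in a graph with maximum degree at most $D$, every star $K_{1,r}$ shares an edge with only $O(D^{r-1})$ stars) can be checked by brute force on small graphs. The sketch below is illustrative only and assumes $r\geq 2$; it compares the maximum overlap with the crude bound $2r\binom{D}{r-1}$, which follows from the per-edge count in the proof of Lemma 6 and is not claimed to be the exact constant used in (14). All function names are hypothetical.

```python
from itertools import combinations
from math import comb


def stars(adj, r):
    """All copies of K_{1,r} (r >= 2), each as a frozenset of its r edges."""
    out = []
    for centre, nbrs in adj.items():
        for leaves in combinations(sorted(nbrs), r):
            out.append(frozenset(frozenset((centre, u)) for u in leaves))
    return out


def max_overlap(adj, r):
    """Largest number of stars (itself included) sharing an edge with one star."""
    all_stars = stars(adj, r)
    return max(
        (sum(1 for a in all_stars if a & b) for b in all_stars),
        default=0,
    )


def crude_overlap_bound(D, r):
    """r edges per star, each lying in at most 2 * C(D, r-1) stars."""
    return 2 * r * comb(D, r - 1)
```

For instance, in the $6$-cycle (where $D=2$ and $r=2$) each star overlaps with exactly three stars, well below the crude bound of $8$.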

We shall estimate ${\mathbb{P}}(\neg{\mathcal{T}}(\beta,D,\varepsilon\mu))$ via a union bound argument and the following upper tail estimate for $N_{D_{j}}$ . The technical assumption (17) intuitively ensures that vertices with degree at least $D$ are unlikely. For Theorem 1 the crux is that our choice of $D$ will also ensure $np/(eD_{j})\leq p^{\Omega(1)}$ , so applications of inequality (18) with $x=\beta M/D_{j}$ suggest that ${\mathcal{T}}$ and thus (8) fails with probability at most $n\cdot n^{-3}p^{\Omega(M)}\leq n^{-2}p^{\Omega_{\varepsilon}(\Phi)}$ , say.

Lemma 9.

For all $n\geq 1$ , $p\in(0,1]$ , and $D>0$ satisfying

[TABLE]

the following holds. For all $x>0$ we have

[TABLE]

Proof.

As $\binom{m}{z}\leq(me/z)^{z}$ for all integers $m,z\geq 1$ , by exploiting the disjointness condition of $N_{D_{j}}$ we infer

[TABLE]

As the function $x\mapsto(e^{3}np/x)^{x}$ is decreasing for $x\geq e^{2}np$ , and (17) implies $\lceil{D_{j}}\rceil\geq D\geq e^{3}np$ , we deduce

[TABLE]

Plugging this into (19) readily establishes inequality (18), since trivially $N_{D_{j}}=0$ when $D_{j}>n$ . ∎
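The two elementary facts used in this proof, $\binom{m}{z}\leq(me/z)^{z}$ and the monotonicity of $x\mapsto(e^{3}np/x)^{x}$ for $x\geq e^{2}np$, can be sanity-checked numerically. The following Python sketch does so on a finite grid; it is a numerical illustration under our own choice of test ranges, not a proof, and the function names are hypothetical.

```python
import math


def binom_bound_holds(m, z):
    """Check C(m, z) <= (m * e / z)^z for integers m, z >= 1."""
    return math.comb(m, z) <= (m * math.e / z) ** z


def log_f(x, np_):
    """Logarithm of (e^3 * np / x)^x, i.e. x * (3 + log(np) - log(x))."""
    return x * (3.0 + math.log(np_) - math.log(x))


def is_decreasing_beyond_threshold(np_, steps=200, h=0.1):
    """Check on a grid that log_f decreases for x >= e^2 * np."""
    start = math.e ** 2 * np_
    values = [log_f(start + k * h, np_) for k in range(steps)]
    return all(a > b for a, b in zip(values, values[1:]))
```

The monotonicity reflects that the derivative of $x(3+\log(np)-\log x)$ equals $2+\log(np/x)$, which is non-positive exactly when $x\geq e^{2}np$.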

For the proof of the upper bound of Theorem 1 it remains to pick a suitable $D$, i.e., one which satisfies the technical assumption (17) and yields the ‘correct’ exponent in (13) and suitable applications of (18).

Proof of the upper bound in (3) of Theorem 1.

For concreteness, define $\beta:=1/32$ and $\gamma:=1/(16r)$ , as well as

[TABLE]

For later reference, we record that there is a constant $d=d(r)>0$ such that, for $n\geq n_{0}(r)$ ,

[TABLE]

By Lemma 6, the upper tail of the number $X=X_{r,n,p}$ of $K_{1,r}$ -copies satisfies

[TABLE]

Gearing up to bound ${\mathbb{P}}(\neg{\mathcal{T}}(\beta,D,\varepsilon\mu))$ via Lemma 9, using $e=p^{\gamma}e^{s}$ and inequality (20) together with the bound $s^{1/(r-1)}\leq s=1+\log p^{-\gamma}\leq p^{-\gamma}$ (as $1+x\leq e^{x}$ ) it follows that

[TABLE]

where here and below we shall always tacitly assume $n\geq n_{0}(r,d)$ whenever necessary. Since the above calculation also gives $D\geq Anp^{1-\gamma}$ , together with $D\geq A$ it follows that

[TABLE]

Applying a union bound argument, using estimates (18), $D_{j}=2^{j}D\geq D$ , and (22) it follows that

[TABLE]

Recalling (21) and the definition of $M=M(\varepsilon\mu)$ , by applying Corollary 8 with $t:=\varepsilon\mu$ it follows that there is a constant $c=c(\beta,A,\gamma,r)>0$ and suitable parameters $\zeta,\Pi>0$ such that

[TABLE]

We find the above upper tail estimate very satisfactory, but in the literature it has become standard to suppress multiplicative factors such as $1+n^{-2}$ in (24), which is straightforward when $c\zeta\Pi\geq 1$ holds (rescaling the exponent $c\zeta\Pi$ by a factor of $1/2$ , say). In the remaining case $1>c\zeta\Pi$ Markov’s inequality gives

[TABLE]

Finally, noting $s=\log(e/p^{\gamma})\geq\log(1/p^{\gamma})=\gamma\log(1/p)$ then establishes the upper bound in (3). ∎
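The elementary properties of the parameter $s=\log(e/p^{\gamma})=1+\gamma\log(1/p)$ used in this proof, namely $e=p^{\gamma}e^{s}$, $s^{1/(r-1)}\leq s\leq p^{-\gamma}$ and $s\geq\gamma\log(1/p)$, can be verified numerically. The sketch below assumes $r\geq 2$ and $p\in(0,1]$; the function names and the floating-point tolerances are our own choices.

```python
import math


def s_param(p, gamma):
    """s = log(e / p^gamma) = 1 + gamma * log(1/p) for p in (0, 1]."""
    return 1.0 + gamma * math.log(1.0 / p)


def s_identities_hold(p, r, tol=1e-9):
    """Check the identities for s with gamma = 1/(16 r), up to rounding."""
    gamma = 1.0 / (16 * r)
    s = s_param(p, gamma)
    return (
        abs(p ** gamma * math.exp(s) - math.e) < tol      # e = p^gamma * e^s
        and s ** (1.0 / (r - 1)) <= s + tol               # uses s >= 1
        and s <= p ** (-gamma) + tol                      # uses 1 + x <= e^x
        and s >= gamma * math.log(1.0 / p)
    )
```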

2.2 Extension of the argument to Theorem 2 and 5

We now extend the arguments from Section 2.1 to the upper bounds of Theorem 2 and 5. To obtain sub-Gaussian decay $\varphi(\varepsilon)\mu^{2}/\sigma^{2}$ in the exponent of tail-inequality (13) for $X_{D}$, in view of the well-known variance estimate $\sigma^{2}=\Theta_{r,\xi}((1+(np)^{r-1})\mu)$ from Remark 3 we here would like to pick $D=\Theta_{r,\xi}(1+np)$ for some range of $t=\varepsilon\mu$. However, this choice causes a major problem: in the key estimate (22) we can no longer win an extra log-factor (via $np/(eD)\leq e^{-s}$) when we bound the $N_{D_{j}}$ variables using (18) from Lemma 9. (Footnote 5: For $D=\Theta_{r,\xi}(1+np)$ another problem is that the technical assumption (17) of Lemma 9 then breaks when $np$ is close to one, which partially explains why in the upcoming Theorem 11 we shall exclude fairly small $t$ when $np\in(n^{-\gamma},\gamma\log n)$.) Our strategy for overcoming this obstacle is to refine the technical event ${\mathcal{T}}={\mathcal{T}}(\beta,D,t)$, by enforcing different upper bounds on $N_{D_{j}}$ when $D_{j}=2^{j}D$ is small (so that in the probabilistic arguments we automatically win an extra logarithmic factor, without destroying the combinatorial counting arguments from Lemma 6).

Turning to the details, for $\gamma,\beta,D,t>0$ let ${\mathcal{T}^{+}}={\mathcal{T}^{+}}(\gamma,\beta,D,t)$ denote the ‘technical’ event that

[TABLE]

where, in addition to the parameters $M=\max\{t^{1/r},t/n^{r-1}\}$ and $D_{j}=2^{j}D$ from (9), we tacitly used

[TABLE]

Lemma 10.

Given $\beta\in(0,1/64]$ and $\gamma,D,t>0$ , the event ${\mathcal{T}^{+}}(\gamma,\beta,D,t)$ implies $X_{D}\leq X\leq X_{D}+t/2$ .

Proof.

The proof of Lemma 6 carries over, except for the final inequalities (10)–(11) that bound $X$ from above. Recalling that ${\overline{M}}=\min\{M,n\}$ , by mimicking the argument leading to (10) we here obtain

[TABLE]

Recalling $D_{j}=2^{j}D$ and $r\geq 2$ , using $\beta\leq 1/64$ it then follows similarly to (10)–(11) that

[TABLE]

which completes the proof. ∎

We are now ready to prove the following slightly more general upper tail estimate for the number $X=X_{r,n,p}$ of $K_{1,r}$ -copies in $G_{n,p}$ , which (as we shall see) implies the upper bounds in Theorems 2 and 5.

Theorem 11 (Upper tail bounds: technical result).

Given $r\geq 2$ , let $X=X_{r,n,p}$ . Set $\mu:={\mathbb{E}}X$ , $\Lambda:=\mu(1+(np)^{r-1})$ , and $\varphi(x):=(1+x)\log(1+x)-x$ . Given $\gamma>0$ , suppose that either

[TABLE]

holds, where the parameters $M$ and $s$ are defined in (9) and (27). Then we have

[TABLE]

Proof.

Let $\beta:=1/64$. We distinguish the following three cases: (i) $np\geq\gamma\log n$, (ii) $np\leq n^{-\gamma}$, and (iii) $t^{2}/\mu\geq\mathbbm{1}_{\{{t\leq\min\{\mu,n^{r}\}}\}}\gamma\min\{t^{1/r}(\log n)^{r},Ms(\log n)^{r-1}\}$. Note that in all three cases we may assume $\gamma\leq 1/(16r)$, since decreasing $\gamma$ yields a less restrictive assumption. Furthermore, in case (iii) we may also assume that $n^{-\gamma}\leq np\leq\log n$ holds (otherwise case (i) or (ii) applies). For concreteness, define

[TABLE]

(We remark that in cases (i)–(ii) the simpler choice $D=A(1+np)$ suffices.) We defer the somewhat technical proofs of the following claims regarding Lemma 9: (a) assumption (17) holds, and (b) inequality (18) implies

[TABLE]

where here and below we shall again tacitly assume $n\geq n_{0}(r)$ . Analogously to inequalities (21) and (24), by first applying Lemma 10 and Corollary 8, and then using $(1+np)^{r-1}=\Theta_{r}(\Lambda/\mu)$ , it follows that

[TABLE]

Since $s=\log(e/p^{\gamma})\geq\gamma\log(e/p)$ , this establishes inequality (28).

It remains to verify claims (a) and (b) above, and we start with claim (a), i.e., that the assumption (17) of Lemma 9 holds. Note that $D\geq A(1+np)\geq e^{4}np$. Furthermore, in case (i) we have $D\geq A\gamma\log n$, and in case (ii) we have $np\leq n^{-\gamma}$ and $D\geq A$. So, in both cases, using $A\geq 8/\gamma$ we infer

[TABLE]

Proceeding analogously, in the cumbersome case (iii) it suffices to show $D\geq 8\log n$ . Using $\gamma\leq 1/(16r)$ , $p\geq n^{-1-\gamma}$ and (20), it is routine to see that $s\leq\log n$ and $\mu\geq n^{1/2}$ . Assuming $t\geq\mu$ , by first using (15) and then distinguishing the cases $t\geq n^{r}$ (where $M=t/n^{r-1}$ ) and $t\leq n^{r}$ (where $M=t^{1/r}$ ), it follows that

[TABLE]

Assuming $t\leq\mu$ , we note that assumption $p\leq(\log n)/n$ implies $\mu\leq n^{r}$ (hence $t\leq n^{r}$ and thus $M=t^{1/r}$ , as noted above). Hence, by first using (15) and then the assumed lower bound on $t$ from case (iii), we infer

[TABLE]

Each time $D\geq 8\log n$ follows readily by definition of $A$ , establishing claim (a), as discussed above.

Finally, we verify claim (b), i.e., that inequality (18) implies estimate (29). We start by observing that if ${\mathcal{T}^{+}}(\gamma,\beta,D,t)$ fails then a fortiori $N_{D_{0}}\geq 1$. Hence, using (18) with $x=1$ and $D_{0}=D\geq e^{3}np$, we deduce

[TABLE]

Analogously to (23), using inequality (18) and $D_{j}=2^{j}D\geq e^{3}np$ it also follows that

[TABLE]

We now use a fairly technical case distinction to verify that the two estimates (32)–(33) together imply (29). Assuming ${\overline{M}}\geq np^{1-2\gamma}$ , analogously to the proof of (22) we have $nps/{\overline{M}}\leq p^{2\gamma}s\leq p^{\gamma}=e^{1-s}$ , so that

[TABLE]

Next we assume $p\leq n^{-1/(1-\gamma)}$ , which implies $np/e\leq p^{\gamma}/e=e^{-s}$ , so that

[TABLE]

In the remaining case ${\overline{M}}<np^{1-2\gamma}$ and $p\geq n^{-1/(1-\gamma)}$ hold. Since ${\overline{M}}<n$ implies ${\overline{M}}=M$ , we infer $t\leq M^{r}=({\overline{M}})^{r}\leq n^{r}p^{r-2r\gamma}$ . So, recalling that $\Psi\leq t^{2}/\Lambda\leq t^{2}/[(np)^{r-1}\mu]$ by (15) and that $\mu\geq dn^{r+1}p^{r}$ by (20), using $D\geq np$ , $p\geq n^{-1/(1-\gamma)}$ and $\gamma\leq 1/(16r)$ we deduce that

[TABLE]

establishing $D\geq\Psi$ . It follows that

[TABLE]

which together with inequalities (32)–(35) implies the claimed estimate (29). ∎

We now deduce the upper bounds of Theorem 2 and 5 from the upper tail inequality (28).

Proof of the upper bound in (4) of Theorem 2.

Let $\gamma:=1/(9r)$ . For $t:=\varepsilon\mu\geq n^{-\alpha}\mu$ and $n\geq n_{0}(r)$ it is routine to check that $t^{2-1/r}/\mu\geq\mathbbm{1}_{\{{np\geq n^{-\gamma}}\}}\gamma(\log n)^{r}$ holds for $\alpha=\alpha(r)>0$ sufficiently small. Hence Theorem 11 applies with $t=\varepsilon\mu$ , where $\Lambda=\Theta_{r,\xi}(\sigma^{2})$ by Remark 3. Using $\Phi(\varepsilon)\geq 1$ it follows that

[TABLE]

establishing the upper bound in (4). ∎

For Theorem 5 we shall simplify the form of the exponent in (28) via the following auxiliary result, writing $a_{n}\asymp b_{n}$ instead of $a_{n}=\Theta(b_{n})$ for typographic reasons (the assumption $p\geq n^{-9}$ in (ii) is ad-hoc).

Lemma 12.

Given $\xi\in(0,1)$ , the following holds whenever $p\in(0,1-\xi]$ .

(i)

If $t\leq\mu$ , then

[TABLE]

(ii)

If $t\geq\mu$ and $t^{1-1/r}\geq(\log n)\mathbbm{1}_{\{{p<1/n}\}}$ , then $p\geq n^{-9}$ implies

[TABLE]

(iii)

If $t^{2}/\sigma^{2}\geq\min\{M,1\}$ and $\mu+t\geq 1$ , then $t=\Omega_{r,\xi}(1)$ .

Proof.

Inequality (38) and the first two estimates of equation (39) follow immediately from (15) and $\Lambda=\Theta_{r,\xi}(\sigma^{2})$ , see Remark 3. We now turn to the final inequality of equation (39). By combining (15) and $\Lambda/\mu=1+(np)^{r-1}$ with $M=\max\{t^{1/r},t/n^{r-1}\}$ and $t^{1-1/r}\geq(\log n)\mathbbm{1}_{\{{p<1/n}\}}+\mu^{1-1/r}\mathbbm{1}_{\{{p\geq 1/n}\}}$ , using $p\geq n^{-9}$ and $\mu^{1-1/r}=\Omega_{r}(n^{1/r}(np)^{r-1})$ , see (20), it follows similarly to (31) that

[TABLE]

where we exploited that calculus gives $p^{r-1}\log(e/p)=O_{r}(1)$ ; this completes the proof of claims (i)–(ii).

For claim (iii) we may of course assume $t\leq 1/2$ (otherwise there is nothing to show). Hence $t^{2}/\sigma^{2}\geq\min\{M,1\}\geq\min\{t^{1/r},1\}=t^{1/r}$ implies $t^{2-1/r}\geq\sigma^{2}=\Omega_{r,\xi}(\mu)$ by Remark 3, which in turn gives $t=\Omega_{r,\xi}(1)$ , because $t+\mu\geq 1$ and $t\leq 1/2$ together imply $\mu\geq 1-t\geq 1/2$ , completing the proof. ∎

Proof of the upper bound in (6) of Theorem 5.

Applying Theorem 11 (with $\gamma=1$), using (i)–(ii) of Lemma 12 it follows that inequality (28) holds with $\Omega_{r,\xi}(\Psi(t))$ in the exponent, where $\Psi(t)\geq\min\{1,t^{1/r}\}=\Omega_{r,\xi}(1)$ by (iii) of Lemma 12. Absorbing the $1+n^{-1}$ factor similarly to (37) then establishes the upper bound in (6). ∎

Proof of the upper bound in (7) of Theorem 5.

Since $\sigma^{2}=\Omega_{r,\xi}(\mu)$ by Remark 3, note that the assumption

[TABLE]

implies $t^{2}/\mu\geq M\log(e/p)\cdot(\log n)^{r-1}$ , so that Theorem 11 (with $\gamma=1$ ) applies. Using (40), by (iii) of Lemma 12 we also infer that $M\geq t^{1/r}=\Omega_{r,\xi}(1)$ . Absorbing the $1+n^{-1}$ factor as before, it remains to show that the exponent of inequality (28) is $\Omega_{r,\xi}(M\log(e/p))$ . For $t\leq\mu$ this follows from (38) of Lemma 12 and (40). For $t\geq\mu$ this follows from (39) of Lemma 12, since (40) and $p<n^{-1}$ imply $t^{2}/(\log n)^{2r+1}\geq\sigma^{2}M=\Omega_{r,\xi}(\mu)=\Omega_{r,\xi}(1)$ and thus $t^{1-1/r}\geq(\log n)\mathbbm{1}_{\{{p<1/n}\}}$ , as required. ∎

2.3 Straightforward extension to a certain sum of iid variables

We close this section by recording that minor (and in fact simpler) variants of our proofs also apply to the following sum of independent random variables:

[TABLE]

Indeed, in view of the structural similarities to the number of $r$ -armed stars in $G_{n,p}$ (which satisfies $X_{n,r,p}=\sum_{v\in[n]}\binom{\mathrm{d}_{v}}{r}$ , writing $\mathrm{d}_{v}$ for the degree of $v$ ), here we set $X_{x}:=\sum_{i\in[n]:Y_{i}\leq\lfloor{x}\rfloor}\binom{Y_{i}}{r}$ , and define $N_{x}$ as the number of $i\in[n]$ with $Y_{i}\geq\lceil{x}\rceil$ . Now the proofs of Lemma 6 and 10 carry over with minor changes: exploiting that there are no dependencies between the $Y_{i}$ , using a simple dyadic decomposition we here obtain

[TABLE]

For the proof of Corollary 8 it suffices to show that $X_{D}\leq Z_{C}$ holds in the present setting. Since $Y_{i}$ is a sum of $n$ independent indicators $\xi_{i,j}$ , we may write each $\binom{Y_{i}}{r}$ as a sum of $\binom{n}{r}$ dependent indicators (which each are products of some $r$ distinct independent variables $\xi_{i,j}$ ). Using the constraint $Y_{i}\leq\lfloor{D}\rfloor$ the analogous left hand side of (14) is thus bounded by $r\cdot\binom{\lfloor{D}\rfloor}{r-1}\leq 2D^{r-1}$ , which in turn implies $X_{D}\leq Z_{C}$ , as desired. Since the proof of Lemma 9 also remains valid (as inequality (19) carries over), we thus arrive at the following result.

Theorem 13 (Upper tail bounds: an extension).

The upper bounds on the upper tail ${\mathbb{P}}(X\geq(1+\varepsilon)\mu)$ from Theorem 1, 2, 5, and 11 remain valid for the random variable $X$ defined in (41).

Perhaps surprisingly, we are not aware of any standard method or inequality (for sums of iid variables) which can routinely recover the upper tail bounds from Theorem 13. Here one technical difficulty seems to be that each summand $\binom{Y_{i}}{r}$ has an upper tail that decays slower than exponentially (for $r\geq 2$ ), which presumably is closely linked to the somewhat non-standard $\log(1/p)$ term in the exponent.
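The structural identity $X_{n,r,p}=\sum_{v\in[n]}\binom{\mathrm{d}_{v}}{r}$ underlying this section, together with the formula $\mu=n\binom{n-1}{r}p^{r}$ for the expected number of $r$-armed stars, can be illustrated by the following Python sketch. It compares the degree-based count with a brute-force enumeration of (centre, leaf-set) pairs; the function names and the adjacency-dictionary representation are hypothetical, and $r\geq 2$ is assumed so that every star has a unique centre.

```python
from itertools import combinations
from math import comb


def star_count_by_degrees(adj, r):
    """Number of K_{1,r} copies via the sum of C(d_v, r) over all vertices v."""
    return sum(comb(len(nbrs), r) for nbrs in adj.values())


def star_count_brute_force(adj, r):
    """Count pairs (centre, r-set of other vertices all adjacent to the centre)."""
    count = 0
    for centre in adj:
        others = [v for v in adj if v != centre]
        for leaves in combinations(others, r):
            if all(u in adj[centre] for u in leaves):
                count += 1
    return count


def expected_star_count(n, r, p):
    """mu = n * C(n-1, r) * p^r for the binomial random graph G_{n,p}."""
    return n * comb(n - 1, r) * p ** r
```

In the iid setting of this section one would replace the degrees $\mathrm{d}_{v}$ by independent variables $Y_{i}$; the degree-based function then applies verbatim to the list of values $Y_{i}$.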

Acknowledgements. We would like to thank Svante Janson for a helpful discussion, and the CPC referee report from June 2015 (on an earlier version of this paper) for suggestions concerning the presentation.

Appendix A Lower bounds on the upper tail

In this appendix we establish fairly routine lower bounds on the upper tail ${\mathbb{P}}(X\geq(1+\varepsilon)\mu)$ from Theorem 1, 2, and 5 (omitting some straightforward details). Following [31] we obtain our lower bounds via the following three events: that many copies of $K_{1,r}$ ‘cluster’ on few edges (see Lemma 14 and 16), that most copies of $K_{1,r}$ arise disjointly (see Lemma 15 and 17), and that $G_{n,p}$ contains more edges than expected (see Lemma 18).

A.1 Basic argument for Theorem 1

For Theorem 1 we shall use two different lower bounds, and the first one is based on the idea that relatively few edges (which ‘cluster’ in an appropriate way) can create fairly many stars $K_{1,r}$ . This is formalized by the following result, which implies ${\mathbb{P}}(X_{r,n,p}\geq x)\geq{\mathbb{P}}(F\subseteq G_{n,p})=p^{|E(F)|}$ since $F\subseteq G_{n,p}$ enforces $X_{r,n,p}\geq x$ .

Lemma 14 (Clustering).

For every $r\geq 1$ there is $D=D(r)>0$ so that for all $n\geq 1$ and $0<x\leq X_{r,n,1}$ there is $F\subseteq K_{n}$ with $|E(F)|\leq D\max\{x^{1/r},x/n^{r-1},1\}$ edges such that $F$ contains at least $x$ copies of $K_{1,r}$ .

Inspired by the proofs of Theorem 1.3 and 1.5 in [17], the idea is to use a complete bipartite graph $F=K_{y,z}$ with $y=\Theta_{r}(\min\{x^{1/r},n\})$ and $z=\Theta_{r}(x/y^{r})$ , which contains $yz=\Theta_{r}(x/y^{r-1})=O_{r}(\max\{x^{1/r},x/n^{r-1}\})$ edges and at least $z\binom{y}{r}=\Theta_{r}(zy^{r})=\Omega_{r}(x)$ copies of $K_{1,r}$ (certain border cases require minor care).
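The monotonicity bound ${\mathbb{P}}(X_{r,n,p}\geq x)\geq p^{|E(F)|}$ that motivates Lemma 14 can be checked exactly on very small instances by enumerating all graphs. The following Python sketch does this by brute force over all $2^{\binom{n}{2}}$ graphs, so it is only feasible for tiny $n$; the function name is hypothetical and the example values are our own.

```python
from itertools import combinations
from math import comb


def upper_tail_exact(n, r, p, x):
    """Exact P(X >= x) for the K_{1,r} count X in G_{n,p}, by full enumeration."""
    edges = list(combinations(range(n), 2))
    total = 0.0
    for mask in range(1 << len(edges)):
        degrees = [0] * n
        present = 0
        for i, (u, v) in enumerate(edges):
            if (mask >> i) & 1:
                degrees[u] += 1
                degrees[v] += 1
                present += 1
        if sum(comb(d, r) for d in degrees) >= x:
            total += p ** present * (1 - p) ** (len(edges) - present)
    return total
```

For $n=4$, $r=2$ and $p=1/2$, a path with two edges already contains one copy of $K_{1,2}$, so the bound reads ${\mathbb{P}}(X\geq 1)\geq p^{2}=1/4$, while the exact value is $54/64$. For $x=12$ the only witness is $F=K_{4}$, and the bound $p^{6}$ holds with equality.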

Proof of Lemma 14.

Let $x_{0}:=2(4r)^{r}$ , $n_{0}:=(r+1)x_{0}$ , and $D:=n_{0}^{2}$ . If (i) $x_{0}\leq x\leq n^{r+1}/D$ and $n\geq n_{0}$ , then we let $F:=K_{y,z}$ , with $y:=\lceil{\min\{x^{1/r},n\}/4}\rceil$ and $z:=\lceil{r^{r}x/y^{r}}\rceil$ . Note that $F\subseteq K_{n}$ exists, since it is easy to check that $1<y\leq n/2$ and $1<z\leq n/2$ , say (we leave the details to the reader). Furthermore, $F$ contains at least $z\binom{y}{r}\geq z(y/r)^{r}\geq x$ many $K_{1,r}$ , and $|E(F)|=yz\leq 2r^{r}x/y^{r-1}\leq D\max\{x^{1/r},x/n^{r-1}\}$ edges.

If either (ii) $1\leq n<n_{0}$ or (iii) $x>n^{r+1}/D$ and $n\geq n_{0}$ , then we let $F:=K_{n}$ , which trivially contains $X_{r,n,1}\geq x$ copies of $K_{1,r}$ , and $|E(F)|<n^{2}<\max\{n_{0}^{2},Dx/n^{r-1}\}=D\max\{1,x/n^{r-1}\}$ edges.

Finally, if (iv) $x<x_{0}$ and $n\geq n_{0}$ , then we let $F:=K_{n_{0}}$ , which contains at least $n_{0}/(r+1)=x_{0}>x$ vertex disjoint copies of $K_{1,r}$ and $|E(F)|<n_{0}^{2}=D$ edges, completing the proof. ∎
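Case (i) of this proof can be made concrete with a short Python sketch that computes the parameters $y$ and $z$ of $F=K_{y,z}$ and counts the stars whose centre lies on the side of size $z$. The function names are hypothetical, and the floating-point $r$-th root is an implementation shortcut that we assume to be accurate enough for moderate values of $x$.

```python
import math


def clustering_parameters(x, n, r):
    """Parameters (y, z) of F = K_{y,z} from case (i) of the proof of Lemma 14."""
    y = math.ceil(min(x ** (1.0 / r), n) / 4)
    z = math.ceil(r ** r * x / y ** r)
    return y, z


def stars_centred_on_z_side(y, z, r):
    """Each of the z vertices has degree y, so it is the centre of C(y, r) stars."""
    return z * math.comb(y, r)


def edge_count(y, z):
    """Number of edges of the complete bipartite graph K_{y,z}."""
    return y * z
```

For example, with $r=2$, $n=1000$ and $x=5000$ (which satisfies $x_{0}\leq x\leq n^{r+1}/D$ and $n\geq n_{0}$) one obtains $y=18$ and $z=62$, so that $F$ has $1116$ edges and contains at least $62\binom{18}{2}=9486\geq x$ copies of $K_{1,2}$.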

The second lower bound is inspired by the fact that $X=X_{n,r,p}$ is approximately Poisson for small $p$ , in which case most $K_{1,r}$ arise disjointly. Indeed, the following standard result bounds ${\mathbb{P}}(X=m)$ from below by the probability that there are exactly $m$ vertex-disjoint copies of $K_{1,r}$ (see [7, 26, 31] for similar arguments), which for $m=(1+\varepsilon)\mu$ will imply ${\mathbb{P}}(X\geq m)\geq e^{-O_{r,\varepsilon}(m)}$ ; the precise form of (42) will be useful later on.

Lemma 15 (Disjoint approximation).

Given $r\geq 2$ there are $n_{0},b>0$ (depending only on $r$ ) such that, for all $n\geq n_{0}$ , $0<p\leq n^{-1-1/(r+1)}$ and integers $m\in\mathbb{N}$ satisfying $0\leq m\leq 99\max\{\mu,n^{1/(r+1)}\}$ , we have

[TABLE]

Proof.

Let ${\mathcal{K}}$ contain all copies of $K_{1,r}$ in $K_{n}$. Define ${\mathfrak{S}}_{m}$ as the collection of all $m$-element subsets of ${\mathcal{K}}$ in which all stars $K_{1,r}$ are vertex disjoint. Given ${\mathcal{C}}\in{\mathfrak{S}}_{m}$, define ${\mathcal{I}}_{{\mathcal{C}}}$ as the event that all stars $K_{1,r}$ of ${\mathcal{C}}$ are present, and define ${\mathcal{D}}_{{\mathcal{C}}}$ as the event that none of the stars $K_{1,r}$ in ${\mathcal{K}}\setminus{\mathcal{C}}$ are present. Note that

[TABLE]

Distinguishing the number of edges in which each star $\alpha\in{\mathcal{K}}\setminus{\mathcal{C}}$ overlaps with some star $K_{1,r}$ from the vertex-disjoint collection ${\mathcal{C}}\in{\mathfrak{S}}_{m}$, using Harris’ inequality [12] and $np=o(1)$ we routinely obtain

[TABLE]

where $mnp=O(\max\{n^{r+2}p^{r+1},n^{1+1/(r+1)}p\})=O(1)$ . Furthermore, with $(z-y)^{y}/y!\leq\binom{z}{y}\leq z^{y}/y!$ , $1-x\geq e^{-2x}$ and $X_{n,r,1}=n\binom{n-1}{r}$ in mind, basic counting (and a short calculation) gives

[TABLE]

This completes the proof of (42) since $m^{2}=O(\max\{n^{2(r+1)}p^{2r},n^{2/(r+1)}\})=O(n^{2/(r+1)})=o(n)$. ∎

Combining the above two results, we now prove the lower bound of Theorem 1.

Proof of the lower bound in (3) of Theorem 1.

We shall tacitly assume $n\geq n_{0}(r,\varepsilon)$ whenever necessary. Applying Lemma 14 with $x:=(1+\varepsilon)\mu$ , there is $F\subseteq K_{n}$ with $|E(F)|\leq O_{r,\varepsilon}(\max\{\mu^{1/r},\mu/n^{r-1}\})$ edges that contains at least $(1+\varepsilon)\mu$ copies of $K_{1,r}$ . If $\Phi=\max\{\mu^{1/r},\mu/n^{r-1}\}\log(1/p)$ , then it follows that

[TABLE]

Otherwise $\Phi=\mu$ , which by a short calculation implies $\mu\leq(\log n)^{3}$ , say (since $\mu\geq(\log n)^{3}$ implies $p=\Omega_{r}(n^{-1-1/r})\geq n^{-2}$ and thus $\max\{\mu^{1/r},\mu/n^{r-1}\}\log(1/p)=O_{r}(\max\{\mu^{1/r},\mu/n^{r-1}\}\log n)<\mu$ ). Applying Lemma 15 with $m:=\lceil{(1+\varepsilon)\mu}\rceil<n^{1/(r+1)}$ , using $\binom{z}{y}\geq(z/y)^{y}$ , $\mu=X_{n,r,1}p^{r}$ , $1-x\geq e^{-2x}$ and $m\geq(1+\varepsilon)\mu\geq 1$ it follows that

[TABLE]

establishing the lower bound in (3). ∎

A.2 Refined arguments for Theorem 2 and 5

For Theorem 2 and 5 we shall refine the previous two lower bounds, and also introduce a new third lower bound. Each time some care is needed to obtain the ‘correct’ dependence on $t=\varepsilon\mu$ in the exponent, and we start by refining the ‘clustering’ based lower bound from Lemma 14 and (43).

Lemma 16 (Refined clustering bound).

Given $r\geq 1$ and $\xi\in(0,1)$ there are $n_{0},c>0$ (depending only on $r,\xi$ ) such that, for all $n\geq n_{0}$ , $p\in(0,1-\xi]$ and $t\geq\sigma$ satisfying $1\leq\mu+t\leq X_{r,n,1}$ , we have

[TABLE]

In case of $p=o(1)$ the basic proof idea is to obtain $\mu+t$ copies of $K_{1,r}$ as follows: (i) we first use the clustering construction from Lemma 14 to plant $2t$ copies of $K_{1,r}$, and (ii) then use Harris’ inequality and a one-sided Chebyshev inequality to show that typically at least $\mu-t$ of the remaining $\tilde{X}_{n,r,1}:=X_{n,r,1}-2t$ other copies of $K_{1,r}$ are present (the crux is that the expected number of such copies is $\tilde{X}_{n,r,1}p^{r}=\mu-o(t)$, so having at least $\mu-t$ of them intuitively seems likely). For the resulting lower bound, step (i) with probability $p^{O_{r}(\max\{t^{1/r},t/n^{r-1}\})}$ thus ought to give the main contribution, making (45) plausible. For technical reasons, in the actual argument we have to plant $\min\{(\beta+1)t,\lceil{\mu+t}\rceil\}$ copies of $K_{1,r}$ for carefully chosen $\beta>0$. By mimicking the proof of Theorem 21 in [31] we then easily arrive at (45) above; we leave the details to the reader.

We next refine the ‘disjoint approximation’ based lower bound used in Lemma 15 and (44) for small $p$ . The idea is that inequality (42) intuitively relates $X=X_{n,r,p}$ to a binomial random variable with mean $\mu=X_{n,r,1}\cdot p^{r}$ , which makes the following Chernoff-type bound for the upper tail plausible.

Lemma 17 (Disjoint approximation: Chernoff-type lower bound).

Given $r\geq 2$ there are $n_{0},c,d>0$ (depending only on $r$ ) such that, for all $n\geq n_{0}$ , $0<p\leq n^{-1-1/(r+1)}$ and $t>0$ satisfying $1\leq\mu+t\leq 9\max\{\mu,n^{1/(r+1)}\}$ , we have

[TABLE]

Noting the binomial-like form of inequality (42) it is routine to check that Lemma 15 indeed implies (46) above (e.g., by summing (42) as in the proof of Theorem 22 in [31]); we leave the details to the reader.

Our third lower bound, for moderately large $p$, is based on the idea that a deviation in the number of edges should typically entail a deviation in the number of $K_{1,r}$ copies (in concrete words: if $G_{n,p}$ has substantially more than $\binom{n}{2}p$ edges, then we expect to have more $K_{1,r}$ copies than on average).

Lemma 18 (Deviation in number of edges: sub-Gaussian type lower bound).

Given $r\geq 2$ and $\xi\in(0,1)$ there are $n_{0},\beta,c>0$ (depending only on $r,\xi$ ) such that, setting $\Lambda:=\mu(1+(np)^{r-1})$ , for all $n\geq n_{0}$ , $\xi n^{-1}\leq p\leq 1-\xi$ and $\sigma\leq t\leq\beta\mu$ we have

[TABLE]

Remark 19.

By Remark 3, in inequality (47) we have $\Lambda=\Theta_{r,\xi}(\sigma^{2})$ , where $\sigma^{2}=\operatorname{Var}X$ .

Setting $\varepsilon:=t/\mu$, the basic proof idea is to (i) condition on having $|E(G_{n,p})|\geq(1+\varepsilon)\binom{n}{2}p$ edges, and (ii) then show that this conditioning converts $X\geq\mu+t=(1+\varepsilon)\mu$ into a typical event (the crux is that this conditioning drives up the expected value of $X=X_{n,r,p}$; to see this it might help to think of the uniform random graph $G_{n,m}$ with $m=(1+\varepsilon)\binom{n}{2}p$ edges). For the resulting lower bound the conditioning thus ought to give the main contribution, which by folklore results satisfies ${\mathbb{P}}(|E(G_{n,p})|\geq(1+\varepsilon)\binom{n}{2}p)=\exp\bigl(-\Theta_{\xi}(\varepsilon^{2}\binom{n}{2}p)\bigr)$. This makes inequality (47) plausible, since $\varepsilon^{2}\binom{n}{2}p=\varepsilon^{2}\cdot\Theta_{r,\xi}(\mu^{2}/\Lambda)=\Theta_{r,\xi}(t^{2}/\Lambda)$ for the considered range of $p$. A simple modification of the proof of Theorem 24 in [31] makes this idea rigorous and establishes (47) above; we leave the details to the reader (we mention in passing that a tilting argument also works here).
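The heuristic identity $\varepsilon^{2}\binom{n}{2}p=\Theta_{r,\xi}(t^{2}/\Lambda)$ can be explored numerically: since $t=\varepsilon\mu$, the ratio of both sides equals $\binom{n}{2}p\,\Lambda/\mu^{2}$, which does not depend on $\varepsilon$. The following Python sketch computes this ratio using $\mu=n\binom{n-1}{r}p^{r}$ and $\Lambda=\mu(1+(np)^{r-1})$; the function name and the sample values of $n$, $r$, $p$ are our own choices for illustration.

```python
from math import comb


def edge_to_star_exponent_ratio(n, r, p):
    """Ratio of eps^2 * C(n,2) * p to t^2 / Lambda, where t = eps * mu."""
    mu = n * comb(n - 1, r) * p ** r
    lam = mu * (1 + (n * p) ** (r - 1))
    return comb(n, 2) * p * lam / mu ** 2
```

For $n=10^{4}$ and $r=3$ the ratio stays between roughly $3$ and $6$ as $p$ ranges from $1/n$ to $1/2$, consistent with the claim that both exponents agree up to constant factors in this range of $p$.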

Stitching the above three results together, we now prove the lower bounds of Theorem 2 and 5.

Proof of the lower bound in (7) of Theorem 5.

By (iii) of Lemma 12 we infer that $M\geq t^{1/r}=\Omega_{r,\xi}(1)$ , which in turn implies $t^{2}/\sigma^{2}\geq M\log(e/p)\cdot(\log n)^{2r}\geq 1$ and thus $t\geq\sigma$ . Hence an application of Lemma 16 (see inequality (45)) establishes the lower bound in (7). ∎

Proof of the lower bound in (6) of Theorem 5.

We shall only assume $p\geq n^{-1}$ instead of $p\geq n^{-1}\log n$ . Applying Lemmas 16 and 18, and using Remark 19, it follows that there is $\beta=\beta(r,\xi)>0$ such that

[TABLE]

By a virtually identical calculation as in the proof of (39) from Lemma 12, for $t\geq\beta\mu$ it follows that $t^{2}/\sigma^{2}\geq\Omega_{r,\xi}(M\log(e/p))$ holds. After adjusting the implicit constants, it follows that we can remove the indicator in inequality (48), which in view of $\Psi(t)=\min\{t^{2}/\sigma^{2},M\log(e/p)\}$ establishes the lower bound in (6). ∎

Proof of the lower bound in (4) of Theorem 2.

Set $t:=\varepsilon\mu$ and $M:=\max\{t^{1/r},t/n^{r-1}\}$ , as usual. Using (15) we have $(\varepsilon\mu)^{2}/\sigma^{2}\geq\varphi(\varepsilon)\mu^{2}/\sigma^{2}\geq\Phi(\varepsilon)\geq 1$ by assumption, so $t\geq\sigma$ follows. In the following we shall distinguish the three cases (i) $n^{-1}\leq p\leq 1-\xi$ , (ii) $n^{-1-1/(r+1)}\leq p<n^{-1}$ , and (iii) $0<p<n^{-1-1/(1+r)}$ .

In cases (i)–(ii) note that, say, $\mu^{1-1/r}=\Omega_{r}(n^{1/(3r)})>\log n$ holds. Using (i)–(ii) of Lemma 12, it thus suffices to prove the lower bound of (4) with exponent $\Phi(\varepsilon)$ replaced by $\Psi(t)$ defined in (6). In case (i) this bound follows from the above proof (valid for $n^{-1}\leq p\leq 1-\xi$) of the lower bound in (6), and in case (ii) we shall now argue that this bound follows from inequality (45) of Lemma 16, by establishing that $t^{2}/\sigma^{2}=\Omega_{r,\xi}(M\log(e/p))$ holds. Indeed, since $p<n^{-1}$ and Remark 3 imply $\sigma^{2}=\Theta_{r}(\mu)$, after recalling $\mu^{1-1/r}=\Omega_{r}(n^{1/(3r)})$ and $t=\varepsilon\mu\geq n^{-\alpha}\mu$ it then follows for $\alpha=\alpha(r)>0$ sufficiently small (say, $\alpha<1/(6r)$) that

[TABLE]

completing the proof in cases (i)–(ii).

In the remaining case (iii) Lemmas 16 and 17 imply that, for some constant $d=d(r)\in(0,1]$ , we have

[TABLE]

We claim that for $\mu+t>9\max\left\{\mu,n^{1/(r+1)}\right\}$ we have $\varphi(t/\mu)\mu=\Omega_{r}\left(M\log(e/p)\right)$ . Indeed, noting that $\varphi(x)\geq x(\log x)/2$ for $x\geq e^{2}\approx 7.4$ (which is easy to check by calculus), it follows that

[TABLE]

Furthermore, $\log(t/\mu)/\log(e/p)=\Omega_{r}(1)$ when $\mu\leq p$ , and $\log(t/\mu)/\log(e/p)=\Omega_{r}((\log n)^{-1})$ when $\mu>p$ . In each case the claimed inequality holds, which allows omitting the indicator in (50). Since $\mu=\Theta_{r}(\mu^{2}/\sigma^{2})$ by Remark 3, now ${\mathbb{P}}(X\geq\mu+t)\geq d\cdot e^{-\Theta_{r,\xi}(\Phi(\varepsilon))}$ follows, which in view of $\Phi(\varepsilon)\geq 1$ completes the proof. ∎
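The calculus fact $\varphi(x)\geq x(\log x)/2$ for $x\geq e^{2}$ used in this proof can be derived in two lines. The following LaTeX block records one possible derivation; it is our own short argument and need not be the route the authors had in mind.

```latex
\begin{align*}
\varphi(x) &= (1+x)\log(1+x) - x \\
           &\geq x\log x - x = x\,(\log x - 1)
              && \text{since } (1+x)\log(1+x) \geq x\log x \text{ for } x \geq 1, \\
           &\geq \tfrac{1}{2}\,x\log x
              && \text{since } \log x - 1 \geq \tfrac{1}{2}\log x \iff \log x \geq 2 \iff x \geq e^{2}.
\end{align*}
```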

Bibliography

[1] B. Bhattacharya, S. Ganguly, E. Lubetzky, and Y. Zhao. Upper tails and independence polynomials in random graphs. Adv. Math. 319 (2017), 313–347.
[2] B. Bollobás. Threshold functions for small subgraphs. Math. Proc. Cambridge Philos. Soc. 90 (1981), 197–206.
[3] S. Chatterjee. The missing log in large deviations for triangle counts. Random Struct. Alg. 40 (2012), 437–451.
[4] S. Chatterjee and A. Dembo. Nonlinear large deviations. Adv. Math. 299 (2016), 396–450.
[5] S. Chatterjee and S.R.S. Varadhan. The large deviation principle for the Erdős-Rényi random graph. European J. Combin. 32 (2011), 1000–1017.
[6] B. De Marco and J. Kahn. Upper tails for triangles. Random Struct. Alg. 40 (2012), 452–459.
[7] B. De Marco and J. Kahn. Tight upper tail bounds for cliques. Random Struct. Alg. 41 (2012), 469–487.
[8] A. Dembo and N. Cook. Large deviations of subgraph counts for sparse Erdős–Rényi graphs. Preprint (2018). arXiv:1809.11148.
