On the dense Preferential Attachment Graph models and their graphon induced counterpart

Ágnes Backhausz; Dávid Kunszenti-Kovács

arXiv:1701.06760·math.CO·January 25, 2017·J. Appl. Probab.

TL;DR

This paper compares the dense Preferential Attachment Graph (PAG) model with its graphon-based W-random graph counterpart, providing bounds on their expected distance and insights into their convergence behavior.

Contribution

It introduces a coupling method to bound the expected jumble norm distance between PAG and W-random graphs, advancing understanding of their relationship.

Findings

01

Expected jumble norm distance bounded by O(log^2 n * n^{-1/3})

02

Universal lower bound established independent of coupling

03

Analysis enhances understanding of PAG convergence to graphons

Abstract

Letting $\mathcal{M}$ denote the space of finite measures on $\mathbb{N}$, and $\mu_{\lambda}\in\mathcal{M}$ denote the Poisson distribution with parameter $\lambda$, the function $W:[0,1]^{2}\to\mathcal{M}$ given by \[ W(x,y)=\mu_{c\log x\log y} \] is called the PAG graphon with density $c$. It is known that this is the limit, in the multigraph homomorphism sense, of the dense Preferential Attachment Graph (PAG) model with edge density $c$. This graphon can then in turn be used to generate the so-called W-random graphs in a natural way. The aim of this paper is to compare the dense PAG model with the W-random graph model obtained from the corresponding graphon. Motivated by the multigraph limit theory, we investigate the expected jumble norm distance of the two models in terms of the number of vertices $n$. We present a coupling for which the expectation can be bounded from above by…

Equations

W(x,y)=\mu_{c\log x\log y}

d_{\boxtimes}(G,H)=\frac{1}{n}\cdot\max_{S,T\subseteq[n]}\frac{1}{\sqrt{st}}\bigg|\sum_{i\in S,j\in T}U_{ij}-V_{ij}\bigg|,

W(x,y)=\mu_{c\log x\log y},

\mathbb{E}\big(d_{\boxtimes}\big(\mathbb{G}_{\mathrm{PAG}}(n),\mathbb{G}_{W}(n)\big)\big)\leq K(\alpha)\cdot\log^{2}n\cdot n^{\beta},

\mathbb{E}\big(d_{\boxtimes}\big(\mathbb{G}_{1}(n,\alpha),\mathbb{G}_{2}(n,\alpha)\big)\big)\leq K_{1,2}\cdot\log n\cdot n^{\alpha-2}\qquad(n=1,2,\ldots)

\mathbb{E}\big(d_{\boxtimes}\big(\mathbb{G}_{2}(n,\alpha),\mathbb{G}_{3}(n,\alpha)\big)\big)\leq K_{2,3}\cdot\log^{2}n\cdot\left(n^{1/2-\alpha/2}+n^{\alpha-2}\right)

\mathbb{E}\big(d_{\boxtimes}\big(\mathbb{G}_{3}(n,\alpha),\mathbb{G}_{4}(n,\alpha)\big)\big)\leq K_{3,4}\cdot\log n\cdot n^{\alpha-2}

\mathbb{E}\big(d_{\boxtimes}\big(\mathbb{G}_{4}(n,\alpha),\mathbb{G}_{5}(n)\big)\big)\leq K_{4,5}\cdot n^{-10}

C_{i}=\lceil\xi_{i}n^{\alpha-1}\rceil\qquad(i=1,\ldots,n).

\mathbb{P}\left(X_{1}^{*}=k_{1},\ldots,X_{n}^{*}=k_{n}\,\bigg|\,\sum_{i=1}^{n}X_{i}^{*}=s\right)=\binom{s}{k_{1}-1}\binom{s-k_{1}+1}{k_{2}-1}\cdots\binom{s-k_{1}-\dots-k_{n-2}+n-2}{k_{n-1}-1}\cdot\frac{(k_{1}-1)!\cdots(k_{n}-1)!}{n(n+1)\cdots(n+s-1)}=\frac{s!\,(n-1)!}{(n+s-1)!}=\binom{n+s-1}{n-1}^{-1}.

\mathbb{P}(C_{i}\geq k)=\mathbb{P}(\xi_{i}n^{\alpha-1}>k-1)=\exp\left(-\frac{k-1}{n^{\alpha-1}}\right)=\bigg(\exp\left(-\frac{1}{n^{\alpha-1}}\right)\bigg)^{k-1}.

\mathbb{P}(C_{1}=k_{1},\ldots,C_{n}=k_{n})=(1-p_{\alpha})^{k_{1}-1}p_{\alpha}\cdots(1-p_{\alpha})^{k_{n}-1}p_{\alpha}=p_{\alpha}^{n}(1-p_{\alpha})^{\sum_{i=1}^{n}k_{i}-n}.

\mathbb{P}\left(C_{1}=k_{1},\ldots,C_{n}=k_{n}\,\bigg|\,\sum_{i=1}^{n}C_{i}=s\right)=\binom{n+s-1}{n-1}^{-1},

R_{i}^{*}=\frac{\lceil\xi_{i}n^{\alpha-1}\rceil}{\sum_{j=1}^{n}\lceil\xi_{j}n^{\alpha-1}\rceil}=\frac{C_{i}}{\sum_{k=1}^{n}C_{k}}.

\begin{array}{lcr}Y_{ij}:=H_{ij}+\mathbb{I}(\xi_{i}\xi_{j}<R_{i}^{*}R_{j}^{*})H^{*}_{ij}&\mbox{ and }&Z_{ij}:=H_{ij}+\mathbb{I}(\xi_{i}\xi_{j}>R_{i}^{*}R_{j}^{*})H^{*}_{ij}.\end{array}

\mathbb{E}\big(d_{\boxtimes}\big(\mathbb{G}_{5}(n,\alpha),\mathbb{G}_{6}(n)\big)\big)\leq K_{5,6}\cdot(\log n)^{1/2}\cdot\big(n^{-1/2}+n^{4-3\alpha}\big)

\mathbb{E}\big(d_{\boxtimes}\big(\mathbb{G}_{6}(n),\mathbb{G}_{7}(n)\big)\big)\leq K_{6,7}\cdot n^{-3/4}

d_{\boxtimes}(G,H)=\frac{1}{n}\cdot\max_{S,T}\frac{1}{\sqrt{st}}\bigg|\sum_{i\in S,j\in T}U_{ij}-V_{ij}\bigg|\leq\frac{1}{n}\cdot\max_{1\leq i\leq n}\ \sum_{j=1}^{n}|U_{ij}-V_{ij}|.

\bigg|\sum_{i\in S,j\in T}U_{ij}-V_{ij}\bigg|\leq\sum_{i\in S,j\in T}|U_{ij}-V_{ij}|\leq\sum_{i\in S}\sigma_{i}\leq s\max_{1\leq i\leq n}\sigma_{i}.

\frac{1}{\sqrt{st}}\bigg|\sum_{i\in S,j\in T}U_{ij}-V_{ij}\bigg|\leq\frac{s\max_{1\leq i\leq n}\sigma_{i}}{\sqrt{st}}=\frac{\sqrt{s}}{\sqrt{t}}\max_{1\leq i\leq n}\sigma_{i}\leq\max_{1\leq i\leq n}\sigma_{i},

\mathbb{E}\big(d_{\boxtimes}\big(\mathbb{G}_{1}(n,\alpha),\mathbb{G}_{2}(n,\alpha)\big)\big)\leq\mathbb{E}\big(\min\big(\max_{1\leq i\leq n}X_{i}^{*},cn^{2}\big)\big)=\mathbb{E}\big(\min\big(\max_{1\leq i\leq n}C_{i},cn^{2}\big)\big).

\mathbb{P}\Big(\max_{1\leq i\leq n}C_{i}>3\log n\cdot n^{\alpha-1}+1\Big)\leq\sum_{i=1}^{n}\mathbb{P}\big(C_{i}>3\log n\cdot n^{\alpha-1}+1\big)\leq ne^{-3\log n}=\frac{1}{n^{2}}.

R_{i,t}=\frac{X_{i,t}}{t+n};\qquad R_{i}^{*}=\frac{X_{i}^{*}}{r+n}.

\mathbb{P}\Big(p>\frac{16}{n}\log n\Big)\leq n^{-8}.

\mathbb{P}\Big(\max_{n\leq t\leq cn^{2}}R_{i,t}>\frac{36}{n}\log n\Big)\leq 2cn^{-6}.

\mathbb{P}\Big(p>\frac{16}{n}\log n\Big)=\int_{16\log n/n}^{1}(n-1)(1-x)^{n-2}\,dx=\int_{0}^{1-16\log n/n}(n-1)x^{n-2}\,dx=(1-16\log n/n)^{n-1}\leq\exp(-8\log n)=n^{-8}.

\mathbb{P}(\ldots)=\mathbb{P}\left(X_{i,t}>\frac{36(t+n)}{n}\log n\,\bigg|\,p\leq\frac{16}{n}\log n\right)+n^{-8}

\leq\frac{\mathbb{E}\big((1+(e-1)p)^{t}\mid p\leq\frac{16}{n}\log n\big)}{\exp\big(\log n\cdot 36(t+n)/n\big)}+n^{-8}\leq\frac{\exp\big((e-1)t\cdot\frac{16}{n}\log n\big)}{\exp\big(\log n\cdot 36(t+n)/n\big)}+n^{-8}

\leq\exp\Big(((e-1)\cdot 16-36)\frac{t}{n}\log n\Big)+n^{-8}\leq\exp(-8\log n)+n^{-8}\leq 2n^{-8},

B_{m}=\bigg\{\frac{3600\log n}{m}<p<\frac{16\log n}{n}\bigg\}.

\mathbb{P}\left(\bigg\{|\eta-mp|\geq K_{1}\sqrt{\frac{m}{n}}\log n\bigg\}\cap B_{m}\right)=O(n^{-8}).

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Full text

On the dense Preferential Attachment Graph models and their graphon induced counterpart

Ágnes Backhausz

Eötvös Loránd University and MTA Alfréd Rényi Institute of Mathematics

Pázmány Péter sétány 1/c, H-1117, Budapest, Hungary

and

Dávid Kunszenti-Kovács

MTA Alfréd Rényi Institute of Mathematics

P.O. Box 127, H-1364 Budapest, Hungary

Abstract.

Letting $\mathcal{M}$ denote the space of finite measures on $\mathbb{N}$ , and $\mu_{\lambda}\in\mathcal{M}$ denote the Poisson distribution with parameter $\lambda$ , the function $W:[0,1]^{2}\to\mathcal{M}$ given by

\[ W(x,y)=\mu_{c\log x\log y} \]

is called the PAG graphon with density $c$ . It is known that this is the limit, in the multigraph homomorphism sense, of the dense Preferential Attachment Graph (PAG) model with edge density $c$ . This graphon can then in turn be used to generate the so-called W-random graphs in a natural way.

The aim of this paper is to compare the dense PAG model with the W-random graph model obtained from the corresponding graphon. Motivated by the multigraph limit theory, we investigate the expected jumble norm distance of the two models in terms of the number of vertices $n$. We present a coupling for which the expectation can be bounded from above by $O(\log^{2}n\cdot n^{-1/3})$, and provide a universal lower bound that is coupling independent, but with a worse exponent.

Key words and phrases:

dense graph limits, Pólya urn processes, cut norm, jumble norm

2010 Mathematics Subject Classification:

Primary: 05C80

1. Introduction

Preferential attachment graphs (PAGs) form a group of random growing graph models that have been studied for a long time [2, 5, 8]. The main motivation is modelling randomly evolving large real-world networks, like online and offline social networks, the internet, or biological networks (e.g. protein-protein interactions). The basic PAG models have been extended by various features, for example duplication steps, weighted edges, vertices with random fitness. The study of this wide family of models provided information about several phenomena in real-world networks (asymptotic degree distribution, clustering, relation of local and global properties, epidemic spread). The limiting behaviour of PAG models has also been investigated from various points of view, depending somewhat on the edge density along the graph sequences. For instance, in [3], N. Berger, C. Borgs, J. T. Chayes and A. Saberi consider a sparse version of the process, with a linear number of edges compared to the number of vertices, and prove convergence in the sense of Benjamini–Schramm to a Pólya point graph. A variation with added randomness is considered by R. Elwes in [6, 7], where the preferential attachment model is amended in such a way that the number of edges added at each stage itself is a random variable, but in expectation still preserves a linear growth. The limit here is the infinite Rado graph, or a multigraph variant of the same, depending on whether multiple edges are allowed during the process.

At the dense end of the spectrum, C. Borgs, J. Chayes, L. Lovász, V. Sós and K. Vesztergombi considered in [4] the case when the edge density along the sequence is essentially constant $c$ (i.e. the number of edges is approximately $cn^{2}/2$ ), under the convergence notion of injective graph densities. They showed that with probability 1 the graph sequence converges to the graphon $W:[0,1]^{2}\to\mathbb{R}$ given by $W(x,y)=c\ln x\ln y$ . Later, B. Ráth and L. Szakács considered in [13] convergence of a more general family of processes with respect to induced graph densities, showing that the limit object is a graphon that now takes Poisson distributions as values instead.

If instead of considering induced densities, we look for homomorphism densities, the limit object can be seen to be in some sense a combination of the two previously mentioned ones: we obtain a graphon with $W(x,y)$ being a Poisson distribution with parameter $c\ln x\ln y$ (i.e., the injective density limit is the first moment of the homomorphism density limit). Hence the corresponding graphs contain multiple edges, and the original notions for limits of simple graphs cannot be used any more. The paper [10] by K.-K., L. Lovász and B. Szegedy provides a framework for handling homomorphism densities in the context of multigraphs, and makes use of the so-called jumble-norm to measure distance between graphons.

All of the papers [4, 13, 10] also deal with $W$ -random graph sequences induced by the limit objects $W$ , and show that with probability 1, the resulting graph sequence converges to $W$ in the respective densities sense. These $W$ -random graph models are thus very similar to the classical graph sequences that gave rise to the limit $W$ , but also exhibit some significant differences.

Our goal in this paper is to compare the $c$ -dense preferential attachment graph model to its $W$ -random counterpart, showing that with probability 1 they are close (but not too close) in the jumble distance. The idea of the proof of the main result is to define a family of random graph models (see Section 3), which connects the $W$ -random graph and the PAG model, and which can be coupled (see Section 4) so that the pairwise jumble-norm distances are easier to bound. In the discussion part (Section 6), we point out some features of the $W$ -random version that can make it more useful in certain applications.

2. Terminology and main result

We shall start by defining the distance notion between multigraphs that we intend to use in this paper. It may be defined more generally for graphons (which essentially are weighted graphs with vertex set $[0,1]$ ), but that shall not be needed here, and we refer to [10] for more details.

Definition 1.

Let $G$ and $H$ be two (multi-)graphs on the same vertex set $[n]:=\left\{1,\ldots,n\right\}$ for some positive integer $n$ . Then we define their jumble norm distance as

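\[ d_{\boxtimes}(G,H)=\frac{1}{n}\cdot\max_{S,T\subseteq[n]}\frac{1}{\sqrt{st}}\bigg|\sum_{i\in S,j\in T}U_{ij}-V_{ij}\bigg|, \]

where $s=|S|$, $t=|T|$, and $U_{ij}$ and $V_{ij}$ denote the multiplicity of edge $ij$ in $G$ and $H$, respectively.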
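For very small graphs the definition can be evaluated directly by enumerating all pairs of non-empty subsets. The following is a minimal sketch under that assumption; the function name `jumble_distance` and the list-of-lists representation of the multiplicities are illustrative choices, not part of the paper.

```python
from itertools import combinations
from math import sqrt


def jumble_distance(U, V):
    """Brute-force jumble norm distance of two multigraphs on n vertices.

    U and V are symmetric n x n lists of edge multiplicities.
    All pairs of non-empty subsets (S, T) are enumerated, so this is
    only feasible for small n (roughly n <= 8).
    """
    n = len(U)
    subsets = [s for k in range(1, n + 1) for s in combinations(range(n), k)]
    best = 0.0
    for S in subsets:
        for T in subsets:
            diff = sum(U[i][j] - V[i][j] for i in S for j in T)
            best = max(best, abs(diff) / sqrt(len(S) * len(T)))
    return best / n


# A single edge on two vertices compared with the empty graph.
single_edge = [[0, 1], [1, 0]]
empty = [[0, 0], [0, 0]]
print(jumble_distance(single_edge, empty))
```

In this toy case the maximum is attained both by $S=\{1\}$, $T=\{2\}$ and by $S=T=\{1,2\}$, which illustrates how the factor $1/\sqrt{st}$ keeps small sets competitive with large ones.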

The cut norm distance $d_{\square}$ used in many other papers (see e.g. [4] for details) differs from this in the factor $\frac{1}{\sqrt{st}}$ that is omitted there. As such, our current distance notion magnifies the differences that occur on small sets, and we clearly have $d_{\boxtimes}\geq d_{\square}$ . Also the jumble norm distance can be considered as an $L^{2}$ -version of the cut norm distance, since $\sqrt{st}$ corresponds to the $L^{2}$ norm of the characteristic function of the set $S\times T$ .

Next, fix a positive parameter $c>0$ . Let $\mathcal{M}$ denote the space of finite measures on $\mathbb{N}$ , and $W:[0,1]^{2}\to\mathcal{M}$ be the function given by

\[ W(x,y)=\mu_{c\log x\log y}, \]

where $\mu_{\lambda}$ denotes the Poisson distribution with parameter $\lambda$ . We want to define the notion of $W$ -random (multi-)graphs. The essence of the two-step randomization is as follows. We consider the set $[0,1]$ as the vertex set of the infinite graph with “adjacency function” $W$ , and sample a random spanned subgraph on $n$ vertices by choosing its vertices independently uniformly from $[0,1]$ . After this first randomization, we obtain a “graph” on $n$ vertices where each “edge” is a Poisson distribution. To obtain a true multigraph, we then independently sample an edge multiplicity for each pair of vertices from the corresponding Poisson distribution. If we allow loops, this will correspond to the random graph $\mathbb{G}_{W}^{\circ}(n)$ , whereas if loops are disallowed, we obtain the random graph $\mathbb{G}_{W}(n)$ .

Definition 2.

We choose independent exponential random variables $\xi_{i}$ with parameter $1$ for every $1\leq i\leq n$ . For $i<j$ , let $Y_{ij}$ be a Poisson random variable with parameter $c\xi_{i}\xi_{j}$ . For every $i$ , let $Y_{ii}$ be a Poisson random variable with parameter $c\xi_{i}^{2}/2$ . Assume that all $Y_{ij}$ s are conditionally independent with respect to the $\xi_{i}$ s. We put $Y_{ij}$ edges between vertices $i$ and $j$ for every $1\leq i\leq j\leq n$ . This yields a random multigraph $\mathbb{G}_{\rm W}^{\circ}(n)$ .

If, compared to $\mathbb{G}_{\rm W}^{\circ}(n)$, we erase the loops, we obtain the random multigraph $\mathbb{G}_{\rm W}(n)$.

Remark.

Note that using exponential variables instead of uniform $[0,1]$-valued ones is compensated by the loss of the $\log$ in the parameter: if $x$ is uniform on $[0,1]$, then $-\log x$ is exponential with parameter $1$.

These are the random models we wish to compare to the following version of the PAG model.
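The two-step randomization of Definition 2 can be sketched as follows. This is an illustrative sampler only: the names `sample_poisson` and `w_random_graph` are hypothetical, and the Poisson sampler (counting rate-1 exponential arrivals before time $\lambda$) is one convenient standard-library choice, not something prescribed by the paper.

```python
import random


def sample_poisson(lam, rng):
    """Poisson(lam) sample: number of rate-1 exponential arrivals before time lam."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed < lam:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def w_random_graph(n, c, rng, loops=False):
    """Sample edge multiplicities as in Definition 2.

    loops=True keeps the loop counts Y_ii (the graph with loops),
    loops=False leaves the diagonal at zero (the loopless graph).
    """
    xi = [rng.expovariate(1.0) for _ in range(n)]  # exponential labels
    Y = [[0] * n for _ in range(n)]
    for i in range(n):
        if loops:
            Y[i][i] = sample_poisson(c * xi[i] ** 2 / 2, rng)
        for j in range(i + 1, n):
            Y[i][j] = Y[j][i] = sample_poisson(c * xi[i] * xi[j], rng)
    return xi, Y


xi, Y = w_random_graph(6, 0.5, random.Random(0))
```

The running time of this Poisson sampler grows linearly in the parameter, which is acceptable for illustration but not for large-scale simulation.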

Definition 3.

We assign an urn to each vertex, initially with one single ball in each of them. Then we run a Pólya urn process for $\lfloor cn^{2}\rfloor$ steps. That is, for $t=1,2,\ldots,\lfloor cn^{2}\rfloor$ , at step $t$ , we choose an urn, with probabilities proportional to the number of balls inside the urn, and put a new ball into it (each random choice is conditionally independent from the previous steps, given the actual distribution of the balls). Finally, for $k=1,2,\ldots,\big{\lfloor}\lfloor cn^{2}\rfloor/2\big{\rfloor}$ , we add an edge between the vertices where the balls at step $t=2k-1$ and at step $t=2k$ have been placed. This yields the random multigraph $\mathbb{G}_{\mathrm{PAG}}(n)$ ; multiple edges and loops may occur.
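A minimal simulation sketch of Definition 3 is given below. The function name `pag_graph` and the trick of storing one list entry per ball (so that a uniformly chosen ball selects an urn with probability proportional to its content) are implementation choices made here for illustration.

```python
import math
import random


def pag_graph(n, c, rng):
    """Simulate the urn-based PAG model of Definition 3.

    Returns the symmetric matrix of edge multiplicities (loops are
    counted once on the diagonal) and the final ball counts.
    """
    steps = math.floor(c * n * n)
    owners = list(range(n))  # one entry per ball, holding its urn index
    chosen = []
    for _ in range(steps):
        urn = rng.choice(owners)  # uniform ball = urn chosen proportionally to its size
        owners.append(urn)
        chosen.append(urn)
    U = [[0] * n for _ in range(n)]
    for k in range(steps // 2):
        # steps t = 2k-1 and t = 2k of the text, in zero-based indexing
        i, j = chosen[2 * k], chosen[2 * k + 1]
        U[i][j] += 1
        if i != j:
            U[j][i] += 1
    balls = [owners.count(i) for i in range(n)]
    return U, balls


U, balls = pag_graph(5, 2.0, random.Random(0))
```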

It was proved in [10] that with probability 1, the random graph $\mathbb{G}_{6}(n)$ (the realization of $\mathbb{G}_{\rm W}^{\circ}(n)$ called model 6 in Section 3) converges with respect to multigraph homomorphism densities to the original function $W$. As mentioned in the introduction, this is also the limit object obtained when looking at the random graphs $\mathbb{G}_{1}(n)$ (model 1 in Section 3, a realization of $\mathbb{G}_{\mathrm{PAG}}(n)$), defined as the preferential attachment graph on $n$ vertices with $\lfloor cn^{2}\rfloor$ edges.

Given that letting $n$ go to infinity, the two random sequences $\mathbb{G}_{1}(n)$ and $\mathbb{G}_{6}(n)$ tend to the same limit, it is natural to ask how close these two sequences are as a function of $n$ .

Our main result is that under an appropriate coupling, we obtain a polynomial bound on the expected distance.

Theorem 1.

There exists a coupling for which for every $1<\alpha<2$ there exists $K(\alpha)>0$ such that for every $n\geq 1$ we have

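\[ \mathbb{E}\big(d_{\boxtimes}\big(\mathbb{G}_{\mathrm{PAG}}(n),\mathbb{G}_{W}(n)\big)\big)\leq K(\alpha)\cdot\log^{2}n\cdot n^{\beta}, \]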

where $\beta:=\max_{\alpha\in(1,2)}\left\{\alpha-2,\frac{1-\alpha}{2},-1/2,4-3\alpha\right\}$ . With this bound, the optimum value for $\alpha$ is $5/3$ , yielding $\beta=-1/3$ .
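The choice $\alpha=5/3$ can be checked numerically: the exponents $\alpha-2$ and $\frac{1-\alpha}{2}$ are the binding ones, and they coincide exactly when $\alpha=5/3$. The sketch below evaluates the largest of the four exponents on a grid in $(1,2)$ using exact rational arithmetic; the grid resolution and the function name `beta` are arbitrary illustrative choices.

```python
from fractions import Fraction


def beta(alpha):
    """Largest of the four exponents appearing in Theorem 1 for a given alpha."""
    return max(alpha - 2, (1 - alpha) / 2, Fraction(-1, 2), 4 - 3 * alpha)


# Grid of rational values strictly between 1 and 2; it contains 5/3 = 1 + 200/300.
grid = [1 + Fraction(k, 300) for k in range(1, 300)]
best_alpha = min(grid, key=beta)
best_beta = beta(best_alpha)
print(best_alpha, best_beta)
```

For $\alpha<5/3$ the term $\frac{1-\alpha}{2}$ exceeds $-1/3$, and for $\alpha>5/3$ the term $\alpha-2$ does, so the minimum on the grid is attained only at $5/3$.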

In the last section, we provide a universal, coupling-independent lower bound of $O(n^{-1})$ . The exponents are far from each other, but the lower bound uses very little of the structure of the models, so there is room for improvement.

3. Random graph models

We define a family of random graph models such that the neighboring ones are easier to compare in the jumble norm, and the whole family connects the two models of Theorem 1. In the next section we will also present possible couplings for these pairs of models, which provide a coupling satisfying the conditions of the theorem. A positive number $c>0$ will be a common parameter of all of the models, and it will be considered fixed for the rest of the paper. Model 1 will be a realization of $\mathbb{G}_{\mathrm{PAG}}(n)$ , whilst models 6 and 7 will be realizations of $\mathbb{G}_{\rm W}^{\circ}(n)$ and $\mathbb{G}_{\rm W}(n)$ , respectively.

The graphs will have $n$ vertices, labeled by $1,2,\ldots,n$ . The parameter $\alpha$ will be chosen later so that the bounds are the best possible available from our approach.

Model 1

We assign an urn to each vertex, initially with one single ball in each of them. Then we run a Pólya urn process for $\lfloor cn^{2}\rfloor$ steps. That is, for $t=1,2,\ldots,\lfloor cn^{2}\rfloor$ , at step $t$ , we choose an urn, with probabilities proportional to the number of balls inside the urn, and put a new ball into it (each random choice is conditionally independent from the previous steps, given the actual distribution of the balls). Finally, for $k=1,2,\ldots,\big{\lfloor}\lfloor cn^{2}\rfloor/2\big{\rfloor}$ , we add an edge between the vertices where the balls at step $t=2k-1$ and at step $t=2k$ have been placed. We obtain a random multigraph $\mathbb{G}_{1}(n)$ this way; multiple edges and loops may occur.

Model 2

Fix $\alpha\geq 0$ . Let $r^{\prime}$ be a random variable with negative binomial distribution, with parameters $n$ and $p_{\alpha}=1-e^{-\frac{1}{n^{\alpha-1}}}$ (we mean the version of negative binomial distribution with possible values $n,n+1,\ldots$ ). Let $r=r^{\prime}-n$ ; this has values $0,1,\ldots$ (sometimes this distribution is called negative binomial). The urn process is the same as in model $1$ (independent of $r^{\prime}$ ), but we add edges between vertices chosen at step $t=2k-1$ and at step $t=2k$ only for $k\geq r/2$ (if $r>cn^{2}$ , then we get the empty graph). We obtain a random multigraph $\mathbb{G}_{2}(n,\alpha)$ .

Model 3

Let $\alpha$ and $r$ be defined as in model $2$ . For $t=1,2,\ldots,r$ , we run the Pólya urn as before. Let $R_{i}^{*}$ be the proportion of the balls in urn $i$ after $r$ steps (for $i=1,\ldots,n$ ). For $t=r+1,\ldots,\lfloor cn^{2}\rfloor$ , independently at each step, we put a new ball in an urn chosen randomly according to the distribution $(R_{i}^{*})$ . That is, the probability that the ball at step $t$ falls into urn $i$ is $R_{i}^{*}$ , for all $t=r+1,\ldots,\lfloor cn^{2}\rfloor$ . Finally, for $k\geq r/2$ , we add an edge between the vertices chosen at step $t=2k-1$ and at step $t=2k$ . (If $r>cn^{2}$ , we mean the empty graph.) We obtain $\mathbb{G}_{3}(n,\alpha)$ this way.

Model 4

Let $\alpha,r$ and $R_{i}^{*}$ be defined as in model $3$ . If $r>cn^{2}$ , take the empty graph. Otherwise, for every pair $1\leq i<j\leq n$ , we take a random variable $Z_{ij}$ with Poisson distribution of parameter $cn^{2}R_{i}^{*}R_{j}^{*}$ . For every $1\leq i\leq n$ , we take a random variable $Z_{ii}$ with Poisson distribution of parameter $cn^{2}(R_{i}^{*})^{2}/2$ . We assume that all $Z_{ij}$ s are conditionally independent of each other, given the $R_{i}^{*}$ s. Finally, we put $Z_{ij}$ edges between vertices $i$ and $j$ for every pair $1\leq i\leq j\leq n$ . We obtain $\mathbb{G}_{4}(n,\alpha)$ this way.

Model 5

Given $n$ and $\alpha$ , the model is the same as model $4$ except that $r$ is not included any more; the model is the same as the previous one in the non-empty case. We obtain $\mathbb{G}_{5}(n,\alpha)$ this way.

Model 6

We choose independent exponential random variables $\xi_{i}$ with parameter $1$ for every $1\leq i\leq n$ . For $i<j$ , let $Y_{ij}$ be a Poisson random variable with parameter $c\xi_{i}\xi_{j}$ . For every $i$ , let $Y_{ii}$ be a Poisson random variable with parameter $c\xi_{i}^{2}/2$ . Assume that all $Y_{ij}$ s are conditionally independent with respect to the $\xi_{i}$ s. We put $Y_{ij}$ edges between vertices $i$ and $j$ for every $1\leq i\leq j\leq n$ . We obtain a random multigraph $\mathbb{G}_{6}(n)$ this way.

Model 7

For every $1\leq i<j\leq n$ , let $Y_{ij}$ be defined as in model $6$ . We add $Y_{ij}$ edges between vertices $i$ and $j$ for all these pairs, but there are no loops in this case. We obtain $\mathbb{G}_{7}(n)$ this way.

4. Couplings

In order to prove Theorem 1, we need to construct a particular coupling for which the distance of $\mathbb{G}_{\rm PAG}$ and $\mathbb{G}_{\rm W}$ is smaller than the upper bound. We do this through a sequence of couplings between the consecutive pairs, with respect to the order of random graph models in the previous section. It will be easy to see that the coupling of the first one (which is a realization of $\mathbb{G}_{\rm PAG}$ ) and the last one (which is a realization of $\mathbb{G}_{\rm W}$ ) can be constructed following the same order. At each step, we can simply add a finite family of random variables to the probability space independently where necessary, and use the already existing random variables in the other cases.

Coupling of model $1$ and model $2$

These two models can be coupled easily. Take a realization of model $1$ , and delete the edges corresponding to steps $2k-1$ and $2k$ for $k<r/2$ . That is, we do not add the edges in the first $r$ steps.

Proposition 1.

For all $\alpha>1$ there exists $K_{1,2}>0$ such that

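\[ \mathbb{E}\big(d_{\boxtimes}\big(\mathbb{G}_{1}(n,\alpha),\mathbb{G}_{2}(n,\alpha)\big)\big)\leq K_{1,2}\cdot\log n\cdot n^{\alpha-2}\qquad(n=1,2,\ldots) \]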

holds in the coupling given above.

Coupling of model $2$ and model $3$

We start from a realization of model 2. Let $R_{i,t}$ be the proportion of the balls in urn $i$ after $t$ steps. Then, for $t=r+1,\ldots,\lfloor cn^{2}\rfloor$ , conditionally on the process in model 2 until $t-1$ steps, we choose a coupling of the distributions given by $(R_{i,t-1})_{i=1}^{n}$ and $(R_{i}^{*})_{i=1}^{n}$ which minimizes the probability of choosing different urns and which is conditionally independent from the couplings used in the previous steps (with respect to the evolution of the number of balls). After adding the edges, we get a realization of model 3, because the distributions are determined by $(R_{i}^{*})_{i=1}^{n}$ , and the steps are conditionally independent of each other (and there is no difference in the first $r$ steps).

Proposition 2.

For all $\alpha>1$ there exists $K_{2,3}>0$ such that for every $n\geq 1$ we have

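\[ \mathbb{E}\big(d_{\boxtimes}\big(\mathbb{G}_{2}(n,\alpha),\mathbb{G}_{3}(n,\alpha)\big)\big)\leq K_{2,3}\cdot\log^{2}n\cdot\left(n^{1/2-\alpha/2}+n^{\alpha-2}\right) \]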

in the coupling given above.

Coupling of model $3$ and model $4$

The negative binomial random variable $r$ is common to the two models, so it is chosen first. If $r>cn^{2}$, then both models give the empty graph, so we assume the contrary, and construct the coupling given $r$. Notice that in model $3$, since all steps are independent and use the same probability distribution, the edges are chosen independently, with probabilities proportional to $2R_{i}^{*}R_{j}^{*}$ for $i\neq j$ and $(R_{i}^{*})^{2}$ for loops.

We assign independent Poisson processes to each pair of vertices. For $1\leq i<j\leq n$ , the rate of the process is $2R_{i}^{*}R_{j}^{*}$ for $(i,j)$ , and for $1\leq i\leq n$ , the rate is ${R_{i}^{*}}^{2}$ for $(i,i)$ . We denote by $N_{s}^{(ij)}$ the number of events until time $s$ in the $(i,j)$ process ( $s>0$ ). The sum of these processes is also a Poisson process; let $\tau$ be the time when the total number of events reaches $\lfloor(\lfloor cn^{2}\rfloor-r)/2\rfloor+1$ . If we put $N_{\tau}^{(ij)}$ edges between $i$ and $j$ for all $1\leq i\leq j\leq n$ , then we get model $3$ , because all $\tau$ events are distributed among the pairs of vertices independently, with probabilities proportional to the rates. On the other hand, if we put $N_{cn^{2}/2}^{(ij)}$ edges between $i$ and $j$ , then we get model $4$ , as the number of edges between the pairs are independent Poisson random variables with the appropriate parameter. Hence this provides a coupling of the two models.

Proposition 3.

For all $\alpha>1$ there exists $K_{3,4}>0$ such that for every $n\geq 1$ we have

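\[ \mathbb{E}\big(d_{\boxtimes}\big(\mathbb{G}_{3}(n,\alpha),\mathbb{G}_{4}(n,\alpha)\big)\big)\leq K_{3,4}\cdot\log n\cdot n^{\alpha-2} \]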

in the coupling given above.

Coupling of model $4$ and model $5$

For $r\leq cn^{2}$ , there is no difference between the two models. Whenever $r>cn^{2}$ , the graph $G_{4}$ is the empty graph, so no coupling is needed.

Proposition 4.

For all $2>\alpha>1$ there exists $K_{4,5}>0$ such that for every $n\geq 1$ we have

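\[ \mathbb{E}\big(d_{\boxtimes}\big(\mathbb{G}_{4}(n,\alpha),\mathbb{G}_{5}(n)\big)\big)\leq K_{4,5}\cdot n^{-10} \]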

in the coupling given above.

Coupling of model $5$ and model $6$

First, we wish to couple the exponential random variables $\xi_{i}$ with the variables $R_{i}^{*}$ from the Pólya urn. The following representation of the urn process until $r$ steps and its connection to independent exponential random variables yields a natural way to do this. In addition, this lemma will be useful when comparing models $1$ and $2$ as well.

Lemma 5.

Fix $\alpha>1$ . Let $r$ be defined as in model $2$ . Let $X_{i}^{*}$ be the number of balls in urn $i$ (for $1\leq i\leq n$ ) after $r$ steps (we continue the Pólya urn process even if $r>cn^{2}$ ). Let $\xi_{1},\ldots,\xi_{n}$ be independent random variables with exponential distribution of parameter $1$ . We define

\[ C_{i}=\lceil\xi_{i}n^{\alpha-1}\rceil\qquad(i=1,\ldots,n). \]

Then $(X_{1}^{*},\ldots,X_{n}^{*})$ and $(C_{1},\ldots,C_{n})$ have the same joint distribution.

Proof.

After $r$ steps, the total number of balls is $r+n$; that is, $\sum_{i=1}^{n}X_{i}^{*}=r+n$. As is well known, by the interchangeability property of the chosen colors in the urn process, for every $s\geq n$ and $\sum_{i=1}^{n}k_{i}=s$ we have

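\[ \mathbb{P}\left(X_{1}^{*}=k_{1},\ldots,X_{n}^{*}=k_{n}\,\bigg|\,\sum_{i=1}^{n}X_{i}^{*}=s\right)=\binom{s}{k_{1}-1}\binom{s-k_{1}+1}{k_{2}-1}\cdots\binom{s-k_{1}-\dots-k_{n-2}+n-2}{k_{n-1}-1}\cdot\frac{(k_{1}-1)!\cdots(k_{n}-1)!}{n(n+1)\cdots(n+s-1)}=\frac{s!\,(n-1)!}{(n+s-1)!}=\binom{n+s-1}{n-1}^{-1}. \]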

On the other hand, for every $k\geq 0$ and $1\leq i\leq n$ , the definition of $C_{i}$ implies that

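\[ \mathbb{P}(C_{i}\geq k)=\mathbb{P}(\xi_{i}n^{\alpha-1}>k-1)=\exp\left(-\frac{k-1}{n^{\alpha-1}}\right)=\bigg(\exp\left(-\frac{1}{n^{\alpha-1}}\right)\bigg)^{k-1}. \]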

Hence $C_{i}$ has geometric distribution of parameter $p_{\alpha}=1-e^{-\frac{1}{n^{\alpha-1}}}$ (where we mean the version with possible values $1,2,\ldots$ ). The random variables $C_{i}$ s are independent, thus $\sum_{i=1}^{n}C_{i}$ has the same negative binomial distribution as $r+n$ . Hence $\sum_{i=1}^{n}X_{i}^{*}$ and $\sum_{i=1}^{n}C_{i}$ have the same distribution. In addition, the conditional distributions given the sum are also the same, because we have

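\[ \mathbb{P}(C_{1}=k_{1},\ldots,C_{n}=k_{n})=(1-p_{\alpha})^{k_{1}-1}p_{\alpha}\cdots(1-p_{\alpha})^{k_{n}-1}p_{\alpha}=p_{\alpha}^{n}(1-p_{\alpha})^{\sum_{i=1}^{n}k_{i}-n}. \]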

This depends only on the sum of the $k_{i}$ s, which implies that

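\[ \mathbb{P}\left(C_{1}=k_{1},\ldots,C_{n}=k_{n}\,\bigg|\,\sum_{i=1}^{n}C_{i}=s\right)=\binom{n+s-1}{n-1}^{-1}, \]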

just as we have seen in the previous case. ∎

Recall that the $R_{i}^{*}$ -s corresponded to the ratio of the colors in the urn after $r$ steps, and therefore the Pólya urn model can be coupled to the family of random variables $(\xi_{i})$ in such a way that

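\[ R_{i}^{*}=\frac{\lceil\xi_{i}n^{\alpha-1}\rceil}{\sum_{j=1}^{n}\lceil\xi_{j}n^{\alpha-1}\rceil}=\frac{C_{i}}{\sum_{k=1}^{n}C_{k}}. \]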
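The representation of Lemma 5 gives a direct way to generate the ball counts and proportions from exponential variables, without running the urn. The sketch below does this and also compares the empirical tail of $C_i$ with the geometric tail $\exp\big(-(k-1)/n^{\alpha-1}\big)$; the function names, the sample sizes and the guard against the probability-zero event $\xi_i=0$ are assumptions made for illustration.

```python
import math
import random


def urn_proportions(n, alpha, rng):
    """Generate (C_i), the proportions R_i^* = C_i / sum(C) and r = sum(C) - n."""
    scale = n ** (alpha - 1)
    # max(1, ...) only guards the probability-zero event xi == 0
    C = [max(1, math.ceil(rng.expovariate(1.0) * scale)) for _ in range(n)]
    total = sum(C)
    return C, [c_i / total for c_i in C], total - n


def tail_check(n, alpha, k, samples, rng):
    """Empirical P(C_i >= k) next to the exact value exp(-(k-1)/n^(alpha-1))."""
    scale = n ** (alpha - 1)
    hits = sum(math.ceil(rng.expovariate(1.0) * scale) >= k for _ in range(samples))
    return hits / samples, math.exp(-(k - 1) / scale)


C, R_star, r = urn_proportions(10, 1.5, random.Random(1))
empirical, exact = tail_check(10, 1.5, 3, 20000, random.Random(2))
print(empirical, exact)
```

Here `r` plays the role of the negative binomial variable of model 2, since $\sum_{i}C_{i}$ has the same distribution as $r+n$.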

Next we couple the Poisson random variables $Y_{ij}$ and $Z_{ij}$ for each pair $1\leq i\leq j\leq n$ . We exploit the fact that the sum of two independent Poisson distributions is again a Poisson distribution whose parameter is the sum of the original parameters. Let $\mathcal{F}$ be the $\sigma$ -algebra generated by the families $(\xi_{i})$ and $(R_{i}^{*})$ . Conditioned on $\mathcal{F}$ , the coupling is done so that for each pair $1\leq i<j\leq n$ , we generate independent Poisson random variables $H_{ij}$ and $H^{*}_{ij}$ of parameter $\mu_{ij}:=cn^{2}\min\{\xi_{i}\xi_{j},R_{i}^{*}R_{j}^{*}\}$ and $\mu_{ij}^{*}:=cn^{2}\left|\xi_{i}\xi_{j}-R_{i}^{*}R_{j}^{*}\right|$ respectively, and set

[TABLE]

For the variables $Y_{ii},Z_{ii}$ , the coupling is done similarly, with all parameters halved.
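The splitting idea behind this coupling can be sketched in a few lines: a common Poisson part with the smaller parameter is shared by both variables, and an independent Poisson part with the absolute difference of the parameters is added to the one with the larger parameter. This is only an illustration under stated assumptions: `sample_poisson` and `couple_poisson` are hypothetical names, the sampler is Knuth's elementary method (suitable for small parameters only), and the sketch does not decide which of $Y_{ij}$, $Z_{ij}$ plays which role.

```python
import math
import random


def sample_poisson(lam: float, rng: random.Random) -> int:
    """Knuth's multiplication method; adequate only for moderate lam."""
    if lam <= 0:
        return 0
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def couple_poisson(lam_a: float, lam_b: float, rng: random.Random):
    """Return (A, B, extra) with A ~ Pois(lam_a), B ~ Pois(lam_b), |A - B| = extra.

    common ~ Pois(min(lam_a, lam_b)) is shared by A and B;
    extra ~ Pois(|lam_a - lam_b|) is added to the side with the larger parameter.
    """
    common = sample_poisson(min(lam_a, lam_b), rng)
    extra = sample_poisson(abs(lam_a - lam_b), rng)
    if lam_a >= lam_b:
        return common + extra, common, extra
    return common, common + extra, extra
```

In this construction the difference of the two coupled variables is exactly the extra part, which is why only the variables $H^{*}_{ij}$ need to be controlled later.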

Proposition 6.

For all $\alpha>1$ there exists $K_{5,6}>0$ such that for every $n\geq 1$ we have

[TABLE]

in the coupling given above.

Coupling of model $6$ and model $7$

Generate $G_{6}$ , then delete the loops. This yields the natural coupling between $G_{6}$ and $G_{7}$ .

Proposition 7.

There exists $K_{6,7}>0$ such that for every $n\geq 1$ we have

[TABLE]

in the coupling given above.

We also conclude that this sequence of couplings can be realized in a single probability space, if we start with an appropriate family of independent random variables. Thus we constructed a coupling of $\mathbb{G}_{\rm PAG}$ and $\mathbb{G}_{\rm W}$ .

5. Proofs

Proof of Theorem 1

The result follows from the triangle inequality and Propositions 1–4, 6 and 7. $\square$

We shall therefore now turn our attention to proving the bounds connecting each pair of models. Since the jumble norm distance is not always easy to work with, we shall make use of the following lemma.

Lemma 8.

Let $G$ and $H$ be two (undirected) multigraphs on the vertex set $\{1,2,\ldots,n\}$ . Let $U_{ij}$ be the number of edges between $i$ and $j$ in $G$ , and $V_{ij}$ the same quantity in $H$ . Then the following holds:

\[ d_{\boxtimes}(G,H)\leq\frac{1}{n}\cdot\max_{1\leq i\leq n}\sum_{j=1}^{n}|U_{ij}-V_{ij}|. \]

Proof.

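Let $\sigma_{i}=\sum_{j=1}^{n}|U_{ij}-V_{ij}|$. Notice that if $|S|=s$, $|T|=t$, and $s\leq t$, then

\[ \bigg|\sum_{i\in S,j\in T}U_{ij}-V_{ij}\bigg|\leq\sum_{i\in S}\sigma_{i}\leq s\cdot\max_{1\leq i\leq n}\sigma_{i}. \]

Hence

\[ \frac{1}{\sqrt{st}}\bigg|\sum_{i\in S,j\in T}U_{ij}-V_{ij}\bigg|\leq\sqrt{\frac{s}{t}}\cdot\max_{1\leq i\leq n}\sigma_{i}\leq\max_{1\leq i\leq n}\sum_{j=1}^{n}|U_{ij}-V_{ij}|, \]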

as we assumed that $s\leq t$ . In the reverse case $s\geq t$ , we get the same with the bound $\max_{1\leq j\leq n}\sum_{i=1}^{n}{|U_{ij}-V_{ij}|}$ . Since $U_{ij}=U_{ji}$ and $V_{ij}=V_{ji}$ , this is equal to the previous maximum. This finishes the proof. ∎
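For small graphs the bound can be checked by brute force. The sketch below assumes the definition of the jumble norm distance $d_{\boxtimes}(G,H)=\frac{1}{n}\max_{S,T\subseteq[n]}\frac{1}{\sqrt{st}}\big|\sum_{i\in S,j\in T}U_{ij}-V_{ij}\big|$ over nonempty $S,T$, and reads Lemma 8 as the row-sum bound $d_{\boxtimes}(G,H)\leq\frac{1}{n}\max_{i}\sum_{j}|U_{ij}-V_{ij}|$ that the proof arrives at. The function names are hypothetical, and the enumeration is exponential in $n$, so it is only meant for tiny examples.

```python
import math
from itertools import combinations


def jumble_distance(U, V):
    """Brute-force jumble norm distance of two multigraphs given as n x n matrices."""
    n = len(U)
    D = [[U[i][j] - V[i][j] for j in range(n)] for i in range(n)]
    subsets = [s for size in range(1, n + 1) for s in combinations(range(n), size)]
    best = 0.0
    for S in subsets:
        # Column sums of the difference matrix restricted to the rows in S.
        col = [sum(D[i][j] for i in S) for j in range(n)]
        for T in subsets:
            total = abs(sum(col[j] for j in T))
            best = max(best, total / math.sqrt(len(S) * len(T)))
    return best / n


def row_sum_bound(U, V):
    """The upper bound (1/n) * max_i sum_j |U_ij - V_ij|."""
    n = len(U)
    return max(sum(abs(U[i][j] - V[i][j]) for j in range(n)) for i in range(n)) / n
```

For a single edge on two vertices compared with the empty graph, both quantities equal $1/2$, so the bound cannot be improved in general.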

5.1. Models $1$ and $2$

Proof of Proposition 1

Let $U_{ij}$ be the number of edges between $i$ and $j$ in model $1$ , and $V_{ij}$ the number of edges between $i$ and $j$ in model $2$ . By the definition of the coupling, $U_{ij}$ can never be smaller than $V_{ij}$ . If $r<cn^{2}$ , then $U_{ij}-V_{ij}$ is the number of edges added to model $1$ during the first $r$ steps. Therefore $\sum_{j=1}^{n}|U_{ij}-V_{ij}|$ is at most the number of steps in which urn $i$ was chosen during the first $r$ steps, which is $X_{i}^{*}-1$ (cf. Lemma 5). Even if $r\geq cn^{2}$ , the sum $\sum_{j=1}^{n}|U_{ij}-V_{ij}|$ cannot be larger than $cn^{2}/2$ , since there are no more edges in model $1$ . By Lemma 8 and Lemma 5, we obtain

[TABLE]

Equation (1) implies

[TABLE]

Hence the expectation of the minimum is at most $3\log n\cdot n^{\alpha-1}$ plus some constant depending only on $c$ . This finishes the proof. $\square$

5.2. Models $2$ and $3$

The idea of the proof of Proposition 2 is to find the expected value of the maximum when all global random variables (like $r$ ) are close to their mean, and then use large deviation theorems to show that this is the case with high probability. Throughout this proof, the constant factor in the $O(\cdot)$ notation may depend only on $c$ .

First we fix $1\leq i\leq n$ . Let $X_{i,t}$ be the number of balls in urn $i$ after $t$ steps. Recall that $X_{i}^{*}$ denotes the number of balls in urn $i$ after $r$ steps. We define the proportions similarly (recall that the initial configuration consists of one ball at each urn):

[TABLE]

We will use an application of de Finetti’s theorem to the urn process $X_{t}$ (see e.g. Theorem 2.2. in [12]). The joint distribution of the urns chosen randomly can be represented as follows. Let $p$ be a random variable with distribution $\mathrm{Beta}(1,n-1)$ (as there is a single ball in urn $i$ at the beginning and $n-1$ balls in the other urns). Then, conditionally on $p$ , generate independent Bernoulli random variables taking value $1$ with probability $p$ . This has the same distribution as the indicators of the steps when a new ball is placed to urn $i$ . This representation has an immediate consequence on the maximum of the proportion.
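The representation can be verified exactly for short sequences of draws. The sketch below compares the probability that the urn process produces a fixed sequence with $a$ draws into urn $i$ and $b$ draws into the other urns against the mixture $\mathbb{E}\big(p^{a}(1-p)^{b}\big)$ for $p\sim\mathrm{Beta}(1,n-1)$, using exact rational arithmetic. It assumes $n\geq 2$ and one initial ball per urn; the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def urn_sequence_prob(n: int, a: int, b: int) -> Fraction:
    """Probability of one fixed draw sequence with a hits of urn i and b misses.

    By exchangeability the order of the draws does not matter: the numerators
    are 1, 2, ..., a for the hits and n-1, n, ..., n-2+b for the misses, and the
    denominators are n, n+1, ..., n+a+b-1.
    """
    num = 1
    for j in range(a):
        num *= 1 + j
    for j in range(b):
        num *= n - 1 + j
    den = 1
    for j in range(a + b):
        den *= n + j
    return Fraction(num, den)


def beta_mixture_prob(n: int, a: int, b: int) -> Fraction:
    """E[p^a (1-p)^b] for p ~ Beta(1, n-1), i.e. B(1+a, n-1+b) / B(1, n-1)."""
    return Fraction(
        factorial(a) * factorial(n - 2 + b) * factorial(n - 1),
        factorial(n + a + b - 1) * factorial(n - 2),
    )
```

The two functions return the same rational number for all admissible arguments, which is the content of the de Finetti representation for a single urn.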

Lemma 9.

(a)

Let $p$ be a random variable with distribution $\mathrm{Beta}(1,n-1)$ with $n\geq 1$ . Then we have

[TABLE]

(b)

For every $1\leq i\leq n$ we have

[TABLE]

Proof.

$(a)$ By using that $n-1\geq n/2$ , we have

[TABLE]

$(b)$ Using exponential Markov’s inequality and part $(a)$ , we have

[TABLE]

where we assumed that $t\geq n$ . This immediately implies $(b)$ . ∎

We will use the following lemma, which is based on a large deviation argument.

Lemma 10.

Fix integers $m\geq n\geq 2$ . Let $p$ be a random variable with distribution $\mathrm{Beta}(1,n-1)$ . Let $\eta$ be a random variable whose conditional distribution with respect to $p$ is binomial with parameters $m$ and $p$ . We define

[TABLE]

Then there exists $K_{1}>0$ such that

[TABLE]

Proof.

We will compare the difference $|\eta-mp|$ to the variance of the binomial distribution, given $p$ . We start with

[TABLE]

We will choose $K=6$ but keep writing $K$ for clarity. Since $B_{m}$ is measurable with respect to $p$ , the first term is equal to

[TABLE]

where $\mathbb{I}_{B_{m}}$ denotes the indicator function of the event $B_{m}$ .

We define $k=mp-K\sqrt{mp(1-p)\log n}$ and $k^{\prime}=mp+K\sqrt{mp(1-p)\log n}$ ; then the first event in (5) is $\{\eta/m<k/m\}\cup\{\eta/m>k^{\prime}/m\}$ . It is clear that $k/m<p$ and $k^{\prime}/m>p$ ; hence we can apply large deviation arguments. Furthermore, we have $k/m>0$ on the event $B_{m}$ , as the following calculation shows.

[TABLE]

We also need $k^{\prime}/m<1$ . That is, we have to check whether the following holds:

[TABLE]

Since we have $p<16\log n/n$ on $B_{m}$ and we assumed $m\geq n$ , this holds for large enough $n$ (recall that $K=6$ does not depend on any of the parameters).

Hence we can apply the relative entropy version of the Chernoff bound for binomial distributions, conditionally with respect to $p$ . We obtain

[TABLE]

where $D(a\|p)=a\log\frac{a}{p}+(1-a)\log\frac{1-a}{1-p}$ . We need the following quantities for the calculations.

[TABLE]

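It is easy to check that $x>-0.1$ implies $\log(1+x)\geq x-2x^{2}/3$: the difference $\log(1+x)-x+2x^{2}/3$ vanishes at $x=0$, and its derivative $\frac{x(1+4x)}{3(1+x)}$ has the same sign as $x$ for $x>-1/4$. On the event $B_{m}$ we have $100K^{2}\cdot\frac{1-p}{mp}\log n<1$, and hence $K\sqrt{\frac{1-p}{mp}\log n}<0.1$. Therefore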

[TABLE]

Similarly, we have

[TABLE]

Substituting this into the Chernoff bound, we obtain that for $q_{1}$ defined by equation (5) we have

[TABLE]

for $n$ large enough. As for the first term:

[TABLE]

for $n$ large enough. Hence the first term is $O(n^{-8})$ , as we have chosen $K=6$ . In the exponent of the second term, since $pm>100K^{2}\log n$ holds on $B_{m}$ , we get

[TABLE]

Putting this together, we conclude that $q_{1}=O(n^{-8})$ , which is a bound for the first term of (4). The second term of (4) can be bounded as follows.

[TABLE]

by equation (2), if $K_{1}^{2}\geq 16K^{2}=576$ . This finishes the proof. ∎
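The relative entropy Chernoff bound used in this proof can be compared with the exact binomial tail for concrete parameters. The sketch below is an illustration only: it fixes $p$ (so it covers the conditional step, not the averaging over $p$), the function names are hypothetical, and the parameters in the usage are arbitrary choices satisfying $0<k/m$ and $k^{\prime}/m<1$.

```python
import math


def relative_entropy(a: float, p: float) -> float:
    """D(a||p) = a log(a/p) + (1-a) log((1-a)/(1-p)) for 0 < a, p < 1."""
    return a * math.log(a / p) + (1 - a) * math.log((1 - a) / (1 - p))


def binomial_pmf(m: int, p: float, j: int) -> float:
    """P(Bin(m, p) = j), computed in log space to avoid overflow."""
    log_pmf = (math.lgamma(m + 1) - math.lgamma(j + 1) - math.lgamma(m - j + 1)
               + j * math.log(p) + (m - j) * math.log(1 - p))
    return math.exp(log_pmf)


def deviation(m: int, p: float, n: int, K: float = 6.0) -> float:
    """The window half-width K * sqrt(m p (1-p) log n) from the proof."""
    return K * math.sqrt(m * p * (1 - p) * math.log(n))


def two_sided_tail(m: int, p: float, dev: float) -> float:
    """Exact P(|eta - m p| > dev) for eta ~ Bin(m, p)."""
    return sum(binomial_pmf(m, p, j) for j in range(m + 1) if abs(j - m * p) > dev)


def chernoff_two_sided(m: int, p: float, dev: float) -> float:
    """exp(-m D(k/m||p)) + exp(-m D(k'/m||p)) with k = mp - dev, k' = mp + dev."""
    k_low, k_high = m * p - dev, m * p + dev
    if not (0 < k_low and k_high < m):
        raise ValueError("need 0 < k/m and k'/m < 1")
    return (math.exp(-m * relative_entropy(k_low / m, p))
            + math.exp(-m * relative_entropy(k_high / m, p)))
```

For instance, with $m=400$, $p=0.2$ and $n=10$, calling `two_sided_tail` and `chernoff_two_sided` with `deviation(400, 0.2, 10)` gives an exact tail below the Chernoff bound, as the theory predicts.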

Now we compare the differences of the proportions after $r$ steps and the further steps. This will give the order of the distance in the coupling. We define

[TABLE]

Proposition 11.

Assuming $\alpha>1$, there exist $K_{2},K_{3},K_{4},K_{5}>0$ such that for every fixed $1\leq i\leq n$ the following hold.

(a)

[TABLE]

(b)

[TABLE]

(c)

[TABLE]

(d)

We define

[TABLE]

Then for some $K_{5}>0$ we have

[TABLE]

(e)

For $K_{5}>0$ defined in $(d)$ , we have

[TABLE]

Proof.

We will assume that $r<cn^{2}$ ; otherwise the sums become empty, and $\Delta_{i}=0$ .

$(a)$ We will use the representation based on de Finetti’s theorem together with the following decomposition.

[TABLE]

According to the representation, we know that $X_{i,t}-X_{i}^{*}$ is a binomial random variable with parameters $m=t-r$ and $p$ , given $p$ and $r$ . We will use Lemma 10 for this conditional distribution. Notice that $B\cap\{t\geq r+n^{\alpha}\}\subseteq B_{m}$ , and $m\geq n$ in this case. Therefore for $K_{1}$ defined in Lemma 10 we have

[TABLE]

It follows that

[TABLE]

Similarly, $X_{i}^{*}-1$ is a binomial random variable with parameters $m=r$ and $p$ , given $p$ and $r$ . Again, we have that $B\cap\{t\geq r+n^{\alpha}\}\subseteq B_{m}$ . Thus Lemma 10 can be applied. We get that there exists $K_{1}^{\prime}>0$ such that

[TABLE]

This implies

[TABLE]

In addition, using that $r>n^{\alpha}/10$ holds on the event $B$ , we can write

[TABLE]

Now we reformulate the third term.

[TABLE]

By equation (2) we obtain

[TABLE]

Putting this together with equations (6) and (7), we obtain that there exists $K_{2}^{\prime}>0$ such that

[TABLE]

Since $\alpha>1$ and $t>r+n^{\alpha}$ , for $n$ large enough, the middle term is the largest one, and we conclude that for some $K_{2}>0$

[TABLE]

This finishes the proof of $(a)$ .

$(b)$ It follows from part $(a)$ that

[TABLE]

On $B$ , we have $r>n^{\alpha}/10>n$ , as $\alpha>1$ , for large enough $n$ . By equation (3) we get that

[TABLE]

The two equations together imply the statement.

$(c)$ Similarly to the proof of Lemma 9, for every $t\geq n^{\alpha}/10$ we have

[TABLE]

Therefore, writing

[TABLE]

we have

[TABLE]

because on the event $\{r>n^{\alpha}/10\}$ we have $t>n^{\alpha}/10$ in all terms (and the inequality is valid for $R_{i}^{*}=R_{i,r}$ as well).

For $K_{4}$ large enough (which may depend only on $c$ ), the condition

[TABLE]

implies that either the event in part $(b)$ , or the event in inequality (8), or $\{p>16\log n/n\}$ holds, according to the value of $p$ . Notice that for $\alpha>1$ we have $2-\alpha<3/2-\alpha/2$ , hence for large enough $n$ we can get rid of the maximum. Thus, combining these inequalities with part $(a)$ of Lemma 9, we get the statement of $(c)$ .

$(d)$ For the first term of $\Delta_{i}$ , we know this statement with constant $K_{4}$ from part $(c)$ . We may assume that $n$ is so large that $n^{\alpha}/10\geq n$ holds. Then we can apply Lemma 9 to get

[TABLE]

On the other hand, if $\max_{r\leq t\leq\lfloor cn^{2}\rfloor}R_{i,t}\leq\frac{16\log n}{n}$ holds and the second term of $\Delta_{i}$ is greater than the bound in $(d)$ , then

[TABLE]

holds. By choosing $K_{5}=16K_{4}$ , this implies that for some $1\leq k\leq n$ we have

[TABLE]

Putting this together with part $(c)$ , this finishes the proof of $(d)$ (notice that $K_{4}$ does not depend on $i$ ).

$(e)$ To see that $(d)$ implies $(e)$ , we only have to check that

[TABLE]

Recall that the random variable $r^{\prime}=r+n$ has negative binomial distribution with parameters $n$ and $p_{\alpha}=1-\mathrm{exp}(-n^{-\alpha+1})$ . For $n$ large enough, the inequality $\mathbb{P}(r\leq n^{\alpha}/10)\leq\mathbb{P}(r^{\prime}\leq n^{\alpha}/5)$ holds and we also have

[TABLE]

Notice that $r^{\prime}$ can be expressed as the independent sum of $n$ geometric random variables supported on $\mathbb{N}^{+}$ with mean $m=1/p_{\alpha}$ . Thus, we compare $r^{\prime}/n$ to $n^{\alpha-1}/5$ , which is less than the mean of the geometric random variables. Hence we can apply Cramér’s theorem for $b=n^{\alpha-1}/5$ . We obtain that

[TABLE]

where $M(\vartheta)$ is the moment generating function of these geometric random variables, and $\vartheta$ minimizes the expression in the exponent. That is, we have

[TABLE]

This yields

[TABLE]

It follows from inequality (10) that for $n$ large enough we have

[TABLE]

Since we assumed that $\alpha>1$ , this implies inequality (9). ∎
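The lower tail of $r^{\prime}$ in part $(e)$ can also be examined numerically. The sketch below does not reproduce the computation with Cramér's theorem from the proof; it swaps in an equivalent elementary route, using that a sum of $n$ geometric variables is at most $s$ exactly when $s$ Bernoulli trials contain at least $n$ successes, and then applies the relative entropy Chernoff bound for the binomial distribution. The function names and the parameter values in the usage note are assumptions of this illustration.

```python
import math


def kl_bernoulli(q: float, p: float) -> float:
    """Relative entropy D(q||p) of Bernoulli laws, with 0 log 0 = 0."""
    def term(x: float, y: float) -> float:
        return 0.0 if x == 0 else x * math.log(x / y)
    return term(q, p) + term(1 - q, 1 - p)


def neg_binomial_lower_tail(n: int, p: float, s: int) -> float:
    """P(G_1 + ... + G_n <= s) for i.i.d. geometric G_k on {1, 2, ...}.

    Equals P(Bin(s, p) >= n): at least n successes among the first s trials.
    """
    if s < n:
        return 0.0
    return sum(math.comb(s, j) * p ** j * (1 - p) ** (s - j) for j in range(n, s + 1))


def chernoff_lower_tail_bound(n: int, p: float, s: int) -> float:
    """exp(-s D(n/s || p)); valid when n/s >= p, i.e. s is below the mean n/p."""
    return math.exp(-s * kl_bernoulli(n / s, p))
```

With $n=64$, $\alpha=5/3$ and $s=204=\lfloor n^{\alpha}/5\rfloor$ we have $p_{\alpha}=1-e^{-1/16}$, and the bound is already far below any fixed negative power of $n$ that is relevant here.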

Proof of Proposition 2. If $r>cn^{2}$, then both models give the empty graph and the distance is $0$; we will ignore this case. For $t$ odd, let $\mathbb{I}_{i,t}$ be the indicator of the following event: either vertex $i$ gets different edges at step $(t,t+1)$ in the coupling of model 2 and model 3, or it gets an edge in exactly one of the models. For $t$ even, let $\mathbb{I}_{i,t}=0$. We will be interested in $Z_{i}=\sum_{t=r+1}^{\lfloor cn^{2}\rfloor}\mathbb{I}_{i,t}$. In addition, we define

[TABLE]

Whenever $\mathbb{I}_{i,t}$ takes value $1$ , we either choose vertex $i$ in exactly one of the models at step $t$ or $t+1$ , or we choose vertex $i$ in both models, but it gets different pairs in the two models. Thus, by the definition of the coupling, we have that

[TABLE]

A slight modification of Proposition 11 implies that for some $K_{6}>0$ we have

[TABLE]

To see this, note that the sum of the first two terms for odd $t$ gives the first term of $\Delta_{i}$ defined in part $(d)$ of Proposition 11. The third term here corresponds to the second term of $\Delta_{i}$ with the even values of $t$ omitted. Finally, for the fourth term it is easy to see that the proof of Proposition 11 is valid if $t+1$ is replaced by $t-1$.

Let $D$ be the event in equation (11), and let $k_{n}=K_{6}\log^{2}n\cdot\big(n^{3/2-\alpha/2}+n^{\alpha-1}\big)$. Using that $D\in\mathcal{G}$ and that, given $\mathcal{G}$, the indicators $\mathbb{I}_{i,t}$ are conditionally independent by the definition of the coupling, we obtain

[TABLE]

Putting this together with equation (11), we get that

[TABLE]

This immediately implies that

[TABLE]

The sum of the indicators is at most $cn^{2}$ . We conclude that

[TABLE]

Since the definition of model 2 and model 3 is the same during the first $r-1$ steps, and we included all possible differences into the indicators, $\sum_{t=r-1}^{\lfloor cn^{2}\rfloor}\mathbb{I}_{i,t}\leq Z_{i}+1$ is an upper bound for $\sum_{j=1}^{n}|U_{ij}-V_{ij}|$ , where $U_{ij}$ is the number of edges between $i$ and $j$ in model $2$ , and $V_{ij}$ is the corresponding quantity in model $3$ (at the end of the whole process). By using Lemma 8 we get the statement of Proposition 2. $\square$

5.3. Models $3$ and $4$

Proof of Proposition 3

Let $U_{ij}$ be the number of edges between $i$ and $j$ in model $3$, and $V_{ij}$ be the number of edges between them in model $4$. By using the notations introduced for the coupling of the two models, we have $U_{ij}-V_{ij}=N_{\tau}^{(ij)}-N_{cn^{2}}^{(ij)}$. If $\tau\geq cn^{2}$, then all the differences are nonnegative, and all of them are nonpositive if $\tau<cn^{2}$. Thus

[TABLE]

We will use the fact that by cumulating the independent Poisson processes assigned to the pairs of vertices we get a Poisson process with rate $2\sum_{i<j}R_{i}^{*}R_{j}^{*}+\sum_{i}(R_{i}^{*})^{2}=\big(\sum_{i}R_{i}^{*}\big)^{2}=1$. In addition, the types $(ij)$ of the events are independent of the moments when they occur. Let $N_{s}$ be the total number of events until time $s$; i.e. $N_{s}=\sum_{i\leq j}N_{s}^{(ij)}$, which has Poisson distribution with parameter $s$. Since there are $\lfloor cn^{2}\rfloor-r$ events in the cumulated process until $\tau$, there are $|\lfloor cn^{2}\rfloor-r-N_{cn^{2}}|$ events between $\tau$ and $cn^{2}$. On the other hand, independently of each other, all these events increase $\big|\sum_{j=1}^{n}N_{\tau}^{(ij)}-N_{cn^{2}}^{(ij)}\big|$ by $1$ with probability $p_{i}^{\prime}=R_{i}^{*}(R_{i}^{*}+2\sum_{j\neq i}R_{j}^{*})\leq 2R_{i}^{*}$. We conclude that the quantity in equation (12) has binomial distribution with parameters $|\lfloor cn^{2}\rfloor-r-N_{cn^{2}}|$ and $p_{i}^{\prime}\leq 2R_{i}^{*}$ conditionally with respect to $N_{cn^{2}}$ and $(R_{j}^{*})_{j=1}^{n}$. Let $F_{i}$ be the following event:

[TABLE]

By using the moment generating function of the binomial distribution, we obtain

[TABLE]

It follows from part $(b)$ of Lemma 9 and equation (9) that $\mathbb{P}(R_{i}^{*}>36\log n/n)=O(n^{-6})$ . Similarly to the proof of equation (9) in part $(e)$ of Proposition 11, it can be shown that $\mathbb{P}(r\geq 2n^{\alpha})=O(n^{-5})$ ; one can use Cramér’s large deviation theorem and the fact that the expectation of $r$ is smaller than $n^{\alpha}$ . Finally, recall that $N_{cn^{2}}$ has Poisson distribution with parameter $cn^{2}$ . We can think of it as the independent sum of $n^{2}$ Poisson random variables with parameter $c$ , and apply Cramér’s theorem. That is,

[TABLE]

where $M$ is the moment generating function of $\mathrm{Poisson}(c)$ , and we can choose $\vartheta$ to minimize the expression on the right hand side. By using $\log M(\vartheta)=c(e^{\vartheta}-1)$ and $\vartheta=\log(1+\log n/n)$ , it follows that this probability is also $O(n^{-6})$ . The same argument works for $\mathbb{P}(N_{cn^{2}}-cn^{2}<-n\log n)$ . On the other hand, $\alpha>1$ , hence $n^{\alpha}>n\log n$ for large $n$ .

Putting this together, we obtain that $\mathbb{P}(\overline{F_{i}})=O(n^{-6})$ , and

[TABLE]

Since the total sum cannot be larger than $cn^{2}$ , we get Proposition 3 similarly to the arguments in the previous section. $\square$

5.4. Models $4$ and $5$

Proof of Proposition 4

The expected value $\mathbb{E}\big{(}d_{\boxtimes}\big{(}\mathbb{G}_{4}(n,\alpha),\mathbb{G}_{5}(n)\big{)}\big{)}$ can be split according to the value of $r$ as follows.

[TABLE]

The second term is zero by the coupling, whilst the first is

[TABLE]

To bound this, note that we always have

[TABLE]

But we have by the definition of the variables $Z_{ij}$

[TABLE]

whence

[TABLE]

Since $r^{\prime}$ is, as noted before, the sum of $n$ independent geometric distributions of parameter $p_{\alpha}=1-e^{-\frac{1}{n^{\alpha-1}}}$ supported on $\mathbb{N}^{+}$ , we have

[TABLE]

Provided $\alpha<2$, this yields $cn^{2}e^{-cn^{2-\alpha}}=O(n^{-10})$. $\square$

5.5. Models $5$ and $6$

To be able to bound the jumble distance, we have to deal with each of the random variables $H_{ij}^{*}$ . Recall that $\mathcal{F}$ denoted the $\sigma$ -algebra generated by the $\xi_{i}$ and $R_{i}^{*}$ , $1\leq i\leq n$ . By our coupling we may write for each $1\leq i\leq j\leq n$

[TABLE]

Lemma 12.

Provided $\alpha\geq 1/2$ , we have for all non-negative integers $b\in\mathbb{N}_{0}$

[TABLE]

where $k^{(b)}$ denotes the $b^{th}$ factorial moment for any $k\in\mathbb{N}_{0}$ , i.e. $k^{(b)}=k(k-1)\ldots(k-b+1)$ .

Proof.

It is known that for any $b\in\mathbb{N}^{+}$ we have $\mathbb{E}\left(\mathrm{Pois}(\lambda)^{(b)}\right)=\lambda^{b}$ . Suppose now that $b\geq 1$ . By the law of total expectation, we have

[TABLE]

where

[TABLE]

and we made use of the power mean inequality in the form $(a_{1}+a_{2})^{b}\leq 2^{b-1}(a_{1}^{b}+a_{2}^{b})$ . Note that we may consider $F_{1}$ as the error that stems from the randomization in the denominator, whilst $F_{2}$ captures the error that comes from the rounding $\xi_{i}n^{\alpha-1}\to C_{i}$ .

Let us first bound $F_{1}$ . It is known that for the i.i.d. exponential variables $\xi_{i}$ , their sum $\sum\xi_{k}$ is independent from the ratios $\xi_{i}/\sum\xi_{k}$ . Hence

[TABLE]

Also, we have $\frac{\xi_{i}}{\sum_{k}\xi_{k}}\sim\mathrm{Beta}(1,n-1)$ . The first term can thus be bounded by

[TABLE]

We use that, given $n$ i.i.d. random variables with expectation $0$ and an integer $\nu\geq 2$, the $\nu^{th}$ moment of their sum is bounded by $K^{2}n^{\nu/2}$, with $K$ depending only on the distribution (see e.g. [1, 9]). In addition, $\sum_{k}\xi_{k}\sim\mathrm{Gamma}(n,1)$. The second term can therefore be bounded by

[TABLE]

Thus we obtain

[TABLE]

For a fixed $b$ , this means

[TABLE]

Let us now turn to the term $F_{2}=\mathbb{E}\left(\left|\frac{C_{i}C_{j}}{(\sum_{k}C_{k})^{2}}-\frac{\xi_{i}\xi_{j}}{(\sum_{k}\xi_{k})^{2}}\right|^{b}\right)$. The first idea is to get rid of the absolute value by observing that if we have random variables $v_{1},v_{2},v_{3}$ such that $v_{1}\geq v_{2}\geq v_{3}$ and $v_{1}\geq 0\geq v_{3}$, then for any $b\in\mathbb{N}^{+}$ we have

[TABLE]

The role of $v_{2}$ shall be played by $\frac{C_{i}C_{j}}{(\sum_{k}C_{k})^{2}}-\frac{\xi_{i}\xi_{j}}{(\sum_{k}\xi_{k})^{2}}$.

Using the fact that by the rounding, $n^{\alpha-1}\xi_{k}\leq C_{k}\leq n^{\alpha-1}\xi_{k}+1$ for each $1\leq k\leq n$ , we have

[TABLE]

and so we can have

[TABLE]

play the role of $v_{1}$ . Applying first the power mean inequality, and using that the reciprocal of the sum $\sum\xi_{k}$ has inverse gamma distribution, whilst the ratio is a $\mathrm{Beta}(1,n-1)$ distribution independent of it, for $n$ large enough, we obtain

[TABLE]

Again by the rounding, we have the lower bound

[TABLE]

Here it is clear that the last expression is negative, so we continue without the minus sign.

[TABLE]

So the role of $-v_{3}$ will be played by

[TABLE]

We use that the sum is independent of the proportions, use inequality (5.5), the Cauchy–Schwarz inequality and the moments of the Gamma distribution:

[TABLE]

Hence $(2c)^{b}n^{2b}F_{2}\leq\frac{K^{\prime\prime}_{b}}{n^{(\alpha-1)b}}$ , and summing up we obtain

[TABLE]
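The identity $\mathbb{E}\big(\mathrm{Pois}(\lambda)^{(b)}\big)=\lambda^{b}$, on which the proof of Lemma 12 rests, is easy to confirm numerically. The following is a minimal sketch with hypothetical function names; the expectation is approximated by a truncated series, and the truncation level `terms` is an assumption that is adequate only for moderate $\lambda$.

```python
import math


def falling_factorial(k: int, b: int) -> int:
    """k^(b) = k (k-1) ... (k-b+1), with the empty product equal to 1."""
    result = 1
    for j in range(b):
        result *= k - j
    return result


def poisson_factorial_moment(lam: float, b: int, terms: int = 200) -> float:
    """Truncated series for E[X^(b)], X ~ Pois(lam)."""
    total = 0.0
    pmf = math.exp(-lam)  # P(X = 0)
    for k in range(terms):
        total += falling_factorial(k, b) * pmf
        pmf *= lam / (k + 1)  # P(X = k + 1) from P(X = k)
    return total
```

For example, `poisson_factorial_moment(2.5, 3)` agrees with $2.5^{3}$ up to floating point error.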

Proof of Proposition 6

Recall that in the coupling of model $5$ and $6$ , the absolute value of the difference of the number of edges between $i$ and $j$ is $H_{ij}^{*}$ . By Lemma 12 with $b=1$ , for some $K_{1}>0$ , for every fixed $i$ we have

[TABLE]

Let now $\varrho_{ij}:=H_{ij}^{*}\wedge 3$ , and $\sigma_{i}:=\sum_{j=1}^{n}\varrho_{ij}$ . Clearly we have

[TABLE]

For fixed $i$, conditionally on $\mathcal{F}=\sigma\{\xi_{j},R_{j}^{*};1\leq j\leq n\}$, the random variables $\varrho_{ij}$ ($1\leq j\leq n$) are independent. Since they fall between $0$ and $3$, by the Hoeffding inequality we have

[TABLE]

for any $s\geq 0$ . Using the same constant $K_{1}$ as above, and choosing $s:=9\sqrt{n\log n}$ , we have by the bound on $m$ that

[TABLE]

A trivial bound then yields

[TABLE]

Since $\sigma_{i}\leq 3n$ always holds, we obtain

[TABLE]

It is clear that $H_{ij}^{*}\leq\varrho_{ij}+(H_{ij}^{*})^{(3)}$ , since whenever $H^{*}_{ij}>3$ , its $3$ rd factorial moment is positive, and strictly larger than $H_{ij}^{*}$ itself. Therefore

[TABLE]

From the above, together with inequality (13) :

[TABLE]

where the last inequality follows from a weighted AM-GM.

Finally, Lemma 8 concludes the proof. $\square$
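The truncation step of this proof relies on the elementary inequality $k\leq(k\wedge 3)+k^{(3)}$ for non-negative integers $k$, which splits each $H_{ij}^{*}$ into a bounded part (handled by the Hoeffding inequality) and a part controlled by the third factorial moment. A short check, with a hypothetical function name:

```python
def truncation_gap(k: int) -> int:
    """(k ∧ 3) + k(k-1)(k-2) - k; non-negative for every integer k >= 0."""
    return min(k, 3) + k * (k - 1) * (k - 2) - k
```

The gap is zero for $k\leq 2$ and grows cubically afterwards, so the bound loses nothing on the typical small values and is only generous on the rare large ones.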

5.6. Models $6$ and $7$

We have that $\mathbb{G}_{6}$ and $\mathbb{G}_{7}$ coincide everywhere but the main diagonal, and it is then easy to see that

[TABLE]

Proof of Proposition 7

Recall that $Y_{ii}$ has Poisson distribution with parameter $c\xi_{i}^{2}$ , where $\xi_{i}$ has $\exp(1)$ distribution. Assume first that $\zeta>0$ is fixed, and $X\sim\mathrm{Pois}(\zeta)$ . Then

[TABLE]

We will use the factorial moments of the Poisson distribution again. For every fixed $i$ and integers $y>b>0$ for some $K(b)>0$ we have

[TABLE]

because the exponential distribution has finite moments.

For an arbitrary function $f:\mathbb{N}^{+}\to\mathbb{N}^{+}$ we may apply the above inequality to obtain

[TABLE]

Let now $N\in\mathbb{N}^{+}$ be fixed, set $f(n):=n^{1/5}$ and $b:=4$ . For $n$ large enough (such that $n-3\geq n/2$ ) and $f(n)^{(4)}\geq f(n)^{4}/16$ , this yields

[TABLE]

Lemma 8 concludes the proof. $\square$

6. Discussion

Our main theorem shows that the classical dense preferential attachment graph model yields random graphs that are close to the random graph model obtained through the PAG-graphon, the limit object in the multigraph homomorphism sense of the random sequence $\mathbb{G}_{\mathrm{PAG}}$ . They are not indistinguishable though (we provide a lower bound on their distance below), and they each have their own advantages for applications.

The random graphs $\mathbb{G}_{\mathrm{PAG}}$ have the advantage that the number of edges is deterministic, but contrary to the sparse PAG models, one cannot easily generate a growing family of graphs $\mathbb{G}_{\mathrm{PAG}}(n)$. For the graphon induced $\mathbb{G}_{W}^{\circ}$, the number of edges is random, though still asymptotically concentrated around the expected value. Also, the way it is generated does not carry the preferential attachment flavour. This may be an advantage from the simulation point of view: the random variables in the model can be generated simultaneously, without the $cn^{2}$ steps that have to be performed one after the other in the PAG model.

However, it is possible to couple the elements of the sequence $\mathbb{G}_{W}^{\circ}(n)$ (or $\mathbb{G}_{W}(n)$ ) so that we obtain a growing sequence (and still keep the convergence with probability 1). Indeed, passing from $n$ to $n+1$ only means that we have to generate the random variable $\xi_{n+1}$ , independently of the previous $\xi_{i}$ -s, and then generate the appropriate Poisson random variables $Y_{j(n+1)}$ for $1\leq j\leq n+1$ . This coupling shows that adding an extra vertex and extending $\mathbb{G}_{W}^{\circ}(n)$ to $\mathbb{G}_{W}^{\circ}(n+1)$ can be performed easily. It seems that this does not hold for the $\mathbb{G}_{\rm PAG}$ model.
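The growing construction described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the edge multiplicity between $i<j$ is taken to be Poisson with parameter $c\xi_{i}\xi_{j}$, the loop multiplicity at $i$ is taken to be Poisson with parameter $c\xi_{i}^{2}/2$ (the halved parameter is an assumption based on the conditional law of $\mathcal{E}^{\circ}$ in this section), the names `add_vertex` and `grow` are hypothetical, and the Poisson sampler is Knuth's elementary method, which is only suitable for moderate parameters.

```python
import math
import random


def sample_poisson(lam: float, rng: random.Random) -> int:
    """Knuth's multiplication method; adequate only for moderate lam."""
    if lam <= 0:
        return 0
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def add_vertex(xi: list, edges: dict, c: float, rng: random.Random) -> None:
    """Extend the graph on len(xi) vertices by one new vertex, in place."""
    new = len(xi)
    xi_new = rng.expovariate(1.0)  # xi_{n+1}, independent of the earlier ones
    xi.append(xi_new)
    for j in range(new):
        # Multiplicity of the edge between the old vertex j and the new vertex.
        edges[(j, new)] = sample_poisson(c * xi[j] * xi_new, rng)
    # Loop at the new vertex; the halved parameter is an assumption of this sketch.
    edges[(new, new)] = sample_poisson(c * xi_new ** 2 / 2, rng)


def grow(n: int, c: float, seed: int = 0):
    """Build the coupled sequence up to n vertices; returns (xi, edges)."""
    rng = random.Random(seed)
    xi, edges = [], {}
    for _ in range(n):
        add_vertex(xi, edges, c, rng)
    return xi, edges
```

Dropping the keys of the form `(i, i)` from `edges` gives the loopless version. Since earlier entries are never modified, the graph on $n$ vertices is an induced subgraph of the graph on $n+1$ vertices in this coupling.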

Unfortunately, we do not have a lower bound for the jumble norm distance of $\mathbb{G}_{\mathrm{PAG}}(n)$ and $\mathbb{G}_{W}(n)$ that matches the upper bound given in Theorem 1. Recall that there we obtained $O(n^{-1/3}\log^{2}n)$ as an upper bound for a particular coupling. On the other hand, there is a universal lower bound of order $n^{-1}$, which holds for every coupling, and for both of the random graphs $\mathbb{G}_{W}(n)$ and $\mathbb{G}_{W}^{\circ}(n)$. The exponents are quite far from each other, but the arguments used for the lower bound use very little of the structure of the graphs. We present a short argument giving this lower bound for both $\mathbb{G}_{W}^{\circ}(n)$ and $\mathbb{G}_{W}(n)$.

If we take $S=T=\{1,2,\ldots,n\}$ in Definition 1, then we obtain a lower bound for the jumble norm distance of $\mathbb{G}_{\rm PAG}$ and $\mathbb{G}_{\rm W}$ by understanding the difference of the number of edges. The main point is that the distribution of this quantity does not depend on the coupling. In $\mathbb{G}_{\rm PAG}(n)$ , the number of edges is deterministic and it is equal to $\lfloor\lfloor cn^{2}\rfloor/2\rfloor$ . We denote by $\mathcal{E}$ the number of edges in the $\mathbb{G}_{\rm W}(n)$ graph model. Let $\mathcal{G}$ be the $\sigma$ -algebra generated by $\xi_{1},\ldots,\xi_{n}$ (recall that the latter random variables are independent and have exponential distribution with parameter $1$ ). Then, conditionally with respect to $\mathcal{G}$ , the random variable $\mathcal{E}$ has Poisson distribution with parameter $c\sum_{1\leq i<j\leq n}\xi_{i}\xi_{j}$ . Hence $\mathbb{E}(\mathcal{E})=cn(n-1)/2$ by the law of total expectation.
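For completeness, the expectation above can be written out as follows; the only inputs are the conditional Poisson law of $\mathcal{E}$ and $\mathbb{E}(\xi_{i})=1$, and the final comparison uses the elementary estimate $\lfloor\lfloor x\rfloor/2\rfloor\geq x/2-1$.

```latex
\[
\mathbb{E}(\mathcal{E})
=\mathbb{E}\big(\mathbb{E}(\mathcal{E}\mid\mathcal{G})\big)
=c\sum_{1\leq i<j\leq n}\mathbb{E}(\xi_{i})\,\mathbb{E}(\xi_{j})
=c\binom{n}{2}
=\frac{cn(n-1)}{2},
\]
\[
\Big\lfloor\frac{\lfloor cn^{2}\rfloor}{2}\Big\rfloor-\mathbb{E}(\mathcal{E})
\geq\frac{cn^{2}}{2}-1-\frac{cn(n-1)}{2}
=\frac{cn}{2}-1 .
\]
```

So the deterministic edge count of $\mathbb{G}_{\rm PAG}(n)$ exceeds the expected edge count of $\mathbb{G}_{\rm W}(n)$ by a term that is linear in $n$.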

In any coupling of these two models, by $S=T=\{1,2,\ldots,n\}$ we have

[TABLE]

Notice that

[TABLE]

for an appropriate positive number $c_{0}$ . This holds for every coupling; therefore the exponent in Theorem 1 cannot be smaller than $-1$ .

The previous argument relies on the fact that the expected number of edges is different in the two models, due to the lack of loops in the $\mathbb{G}_{W}$ model. For the $\mathbb{G}_{\mathrm{PAG}}$ and the $\mathbb{G}_{W}^{\circ}$ models, although the expected numbers of edges are equal to each other, one can prove that the jumble norm distance is still at least $\frac{1}{e^{2}}\sqrt{\frac{c}{2}}\cdot\frac{1}{n}$ for every coupling. The key point is to use the formula for the central absolute moment of the Poisson distribution and see that it is at least a constant times the square root of the parameter.

To see this, we have to consider the random variable $\mathcal{E}^{\circ}$, which is the number of edges in $\mathbb{G}_{W}^{\circ}$. It has Poisson distribution with parameter $c\sum_{1\leq i<j\leq n}\xi_{i}\xi_{j}+\frac{c}{2}\sum_{i=1}^{n}\xi_{i}^{2}$ conditionally with respect to $\mathcal{G}$ (recall Definition 2). For the sake of simplicity, let $\eta$ be a Poisson($\lambda$) distributed random variable, and $m>0$. First notice that

[TABLE]

On the other hand, by using the formula for the central absolute moment of the Poisson distribution and the well-known upper bound version of Stirling’s formula, we have

[TABLE]

Putting this together, we get

[TABLE]

Now we apply this for the conditional distribution of $\mathcal{E}^{\circ}$ with $m=\lfloor\lfloor cn^{2}\rfloor/2\rfloor$ . We obtain

[TABLE]

Therefore, since $m$ is the number of edges in the PAG model, we conclude that for every coupling of $\mathbb{G}_{\rm PAG}$ and $\mathbb{G}_{W}^{\circ}$ , we have

[TABLE]

Remark.

In this paper we considered the jumble distance between the two random models for the dense PAG graph, as that is the more natural distance notion for multigraphs generated by unbounded graphons (in this particular case, this corresponds to the unboundedness of the parameters of the Poisson distributions). However, as each finite multigraph generated is bounded per se, one may wonder if it is possible to say anything about the cut distance between, e.g., $\mathbb{G}_{\mathrm{PAG}}$ and $\mathbb{G}^{\circ}_{W}$ .

We recall that the cut distance of two graphs on the same set of $n$ vertices is defined as

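\[ d_{\square}(G,H)=\frac{1}{n^{2}}\cdot\max_{S,T\subseteq[n]}\bigg|\sum_{i\in S,j\in T}U_{ij}-V_{ij}\bigg|. \]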

It is easily seen that $d_{\square}\leq d_{\boxtimes}$ , hence the upper bounds given for the jumble distance apply a fortiori to the cut distance as well. On the other hand, the methods used in this paper do not yield stronger bounds for the cut norm distance.
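The comparison of the two distances is a one-line computation. It is written out below under the assumption that the cut distance carries the usual normalization $d_{\square}(G,H)=\frac{1}{n^{2}}\max_{S,T\subseteq[n]}\big|\sum_{i\in S,j\in T}U_{ij}-V_{ij}\big|$, with $s=|S|$ and $t=|T|$ as before.

```latex
\[
\frac{1}{n^{2}}\bigg|\sum_{i\in S,j\in T}U_{ij}-V_{ij}\bigg|
=\frac{\sqrt{st}}{n}\cdot\frac{1}{n}\cdot\frac{1}{\sqrt{st}}\bigg|\sum_{i\in S,j\in T}U_{ij}-V_{ij}\bigg|
\leq\frac{\sqrt{st}}{n}\cdot d_{\boxtimes}(G,H)
\leq d_{\boxtimes}(G,H),
\]
```

since $s,t\leq n$. Taking the maximum over $S$ and $T$ on the left-hand side gives $d_{\square}\leq d_{\boxtimes}$.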

Acknowledgements

The first author was supported by the Hungarian National Research, Development and Innovation Office, NKFIH grant $\mathrm{n}^{\circ}$ K108615 and by the MTA Rényi Institute Lendület Limits of Structures Research Group. The second author has received funding from the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007-2013) / ERC grant agreement $\mathrm{n}^{\circ}$ 617747, and from the MTA Rényi Institute Lendület Limits of Structures Research Group.

Bibliography

[1] B. von Bahr, On the convergence of moments in the central limit theorem, Ann. Math. Statist. 36 (1965), 808–818.
[2] A.-L. Barabási and R. Albert, Emergence of scaling in random networks, Science 286 (1999), no. 5439, 509–512.
[3] N. Berger, C. Borgs, J. T. Chayes, and A. Saberi, Asymptotic behavior and distributional limits of preferential attachment graphs, Ann. Prob. 42 (2014), 1–40.
[4] C. Borgs, J. Chayes, L. Lovász, V. Sós, and K. Vesztergombi, Limits of randomly grown graph sequences, Eur. J. Combin. 32 (2011), no. 7, 985–999.
[5] R. Durrett, Random graph dynamics, Cambridge Series in Statistical and Probabilistic Mathematics, Cambridge Univ. Press, Cambridge, 2007.
[6] R. Elwes, Preferential attachment processes approaching the Rado multigraph, arXiv preprint arXiv:1502.05618 (2015).
[7] R. Elwes, A linear preferential attachment process approaching the Rado graph, arXiv preprint arXiv:1603.08806 (2016).
[8] A. Frieze and M. Karoński, Introduction to random graphs, Cambridge University Press, Cambridge, 2015.
