Growth of Common Friends in a Preferential Attachment Model

Bikramjit Das; Souvik Ghosh

arXiv:1908.04510·math.PR·August 14, 2019


TL;DR

This paper analyzes how the number of common friends grows in a preferential attachment model, revealing different growth regimes and providing estimates relevant for social network analysis.

Contribution

It derives the growth rate of common friends in a linear preferential attachment model and identifies phase transitions in their limiting behavior.

Findings

01 Growth rate of common friends varies with model parameters

02 Identifies power-law, logarithmic, and static growth regimes

03 Provides estimates for common friends in social networks

Abstract

The number of common friends (or connections) in a graph is a commonly used measure of proximity between two nodes. Such measures are used in link prediction algorithms and recommendation systems in large online social networks. We obtain the rate of growth of the number of common friends in a linear preferential attachment model. We apply our result to develop an estimate for the number of common friends. We also observe a phase transition in the limiting behavior of the number of common friends: depending on the range of the parameters of the model, the growth is power-law, logarithmic, or static with the size of the graph.

Equations

$$p_{i,n+1}:=\mathbb{P}\left[v_{i}\leftrightarrow v_{n+1}\mid\text{PA}_{n}^{\delta,C}\right]=\frac{D_{i}(n)+\delta}{(2C+\delta)n}$$

$$D_{i}(n+1)=D_{i}(n)+\Delta_{i}(n+1)$$

$$N_{ij}(n):=\#\text{ friends common to }i\text{ and }j$$

$$N_{ij}(n+1)=N_{ij}(n)+\boldsymbol{1}_{B_{ij}(n+1)}$$

$$\lim_{n\to\infty}\frac{D_{i}(n)}{n^{\gamma}}=D_{i}(\infty)\quad\text{a.s.}$$

$$\text{(1)}\quad\lim_{n\to\infty}N_{ij}(n)=N_{ij}(\infty)\quad\text{a.s.}$$

$$\text{(2)}\quad\lim_{n\to\infty}\frac{N_{ij}(n)}{\log n}=\frac{C(C-1)}{(2C+\delta)^{2}}D_{i}(\infty)D_{j}(\infty)\quad\text{a.s.}$$

$$\text{(3)}\quad\lim_{n\to\infty}\frac{N_{ij}(n)}{n^{2\gamma-1}/(2\gamma-1)}=\frac{C(C-1)}{(2C+\delta)^{2}}D_{i}(\infty)D_{j}(\infty)\quad\text{a.s.}$$

$$\text{(1)}\quad\lim_{n\to\infty}\frac{N_{ij}(n)}{N_{ij}(\lfloor n/k\rfloor)}=1\quad\text{a.s.}$$

$$\text{(2)}\quad\lim_{n\to\infty}\frac{N_{ij}(n)}{N_{ij}(\lfloor n/k\rfloor)}=1\quad\text{a.s.}$$

$$\text{(3)}\quad\lim_{n\to\infty}\frac{N_{ij}(n)}{N_{ij}(\lfloor n/k\rfloor)\,k^{2\gamma-1}}=1\quad\text{a.s.}$$

$$\hat{N}^{k}_{ij}(n)=\begin{cases}N_{ij}(\lfloor n/k\rfloor)&\text{if }\delta\geq 0,\\ N_{ij}(\lfloor n/k\rfloor)\,k^{2\gamma-1}&\text{if }\delta<0.\end{cases}$$

$$X_{i}(n),\qquad Y_{ij}(n)$$

$$\lim_{n\to\infty}\frac{X_{i}(n)}{n^{\gamma}}=D_{i}(\infty)\quad\text{a.s.}$$

$$\sup_{n\geq i}\frac{\mathbb{E}[X_{i}(n)^{k}]}{n^{k\gamma}}<\infty$$

$$\mathbb{E}\left[X_{i}(n+1)\mid\text{PA}_{n}^{\delta,C}\right]=(D_{i}(n)+\delta)+C\cdot\frac{D_{i}(n)+\delta}{(2C+\delta)n}=X_{i}(n)\left(1+\frac{\gamma}{n}\right)$$

$$\begin{aligned}
\mathbb{E}[X_{i}(n+1)]&=\mathbb{E}[X_{i}(i)]\left(\frac{i+\gamma}{i}\right)\left(\frac{i+1+\gamma}{i+1}\right)\cdots\left(\frac{n+\gamma}{n}\right)\\
&=(C+\delta)\frac{\Gamma(i)}{\Gamma(i+\gamma)}\frac{\Gamma(n+1+\gamma)}{\Gamma(n+1)}=(C+\delta)\frac{\Gamma(i)}{\Gamma(i+\gamma)}(n+1)^{\gamma}\left(1+O\left(\frac{1}{n}\right)\right)
\end{aligned}$$

$$\frac{\Gamma(n+a)}{\Gamma(n)}=n^{a}\left(1+O\left(\frac{1}{n}\right)\right)$$

$$\frac{\mathbb{E}[X_{i}(n)]}{n^{\gamma}}=(C+\delta)\frac{\Gamma(i)}{\Gamma(i+\gamma)}\left(1+O\left(\frac{1}{n}\right)\right)<\infty$$

$$\sup_{n\geq i}\frac{\mathbb{E}[X_{i}(n)^{j}]}{n^{j\gamma}}\leq C_{j}<\infty$$

$$\begin{aligned}
\mathbb{E}[X_{i}(n+1)^{k}]
&=\mathbb{E}\left[\mathbb{E}\left[[X_{i}(n)+\Delta_{i}(n+1)]^{k}\Big|\text{PA}_{n}^{\delta,C}\right]\right]\\
&=\mathbb{E}\left[\mathbb{E}\left[X_{i}(n)^{k}+kX_{i}(n)^{k-1}\Delta_{i}(n+1)+\binom{k}{2}X_{i}(n)^{k-2}\Delta_{i}(n+1)^{2}+\cdots\Big|\text{PA}_{n}^{\delta,C}\right]\right]\\
&=\mathbb{E}\left[X_{i}(n)^{k}+kX_{i}(n)^{k-1}Cp+\binom{k}{2}X_{i}(n)^{k-2}\left(Cp(1-p)+(Cp)^{2}\right)+\cdots\right]\\
&=\mathbb{E}\left[X_{i}(n)^{k}+X_{i}(n)^{k}\frac{k\gamma}{n}+X_{i}(n)^{k}\frac{k(k-1)(C-1)}{2C}\left(\frac{\gamma}{n}\right)^{2}+X_{i}(n)^{k-1}\frac{k(k-1)}{2}\frac{\gamma}{n}+\cdots\right]\\
&=\mathbb{E}[X_{i}(n)^{k}]\left(1+\frac{k\gamma}{n}+O\left(\frac{1}{n^{2}}\right)\right)+\sum_{r=1}^{k-1}\mathbb{E}[X_{i}(n)^{k-r}]\left(\alpha_{k-r}\frac{\gamma}{n}+O\left(\frac{1}{n^{2}}\right)\right)\\
&\leq\mathbb{E}[X_{i}(n)^{k}]\left(1+\frac{k\gamma}{n}+O\left(\frac{1}{n^{2}}\right)\right)+\sum_{r=1}^{k-1}C_{k-r}n^{(k-r)\gamma}\left(\alpha_{k-r}\frac{\gamma}{n}+O\left(\frac{1}{n^{2}}\right)\right)\quad\text{(by the induction hypothesis)}\\
&\leq\mathbb{E}[X_{i}(n)^{k}]\left(1+\frac{k\gamma}{n}+O\left(\frac{1}{n^{2}}\right)\right)+C^{*}n^{(k-1)\gamma-1}\\
&=:a_{n}\mathbb{E}[X_{i}(n)^{k}]+b_{n}
\end{aligned}$$

$$\mathbb{E}[X_{i}(n+1)^{k}]$$

$$\prod_{r=\ell+1}^{n}a_{r}=\frac{n^{k\gamma}}{\ell^{k\gamma}}\left(1+O\left(\frac{1}{\ell}\right)\right)\quad(n>\ell\to\infty)$$

$$\mathbb{E}[X_{i}(n+1)^{k}]\leq A_{1}n^{k\gamma}\left(1+O\left(\frac{1}{n}\right)\right)+C^{*}n^{k\gamma}\sum_{\ell=i}^{n}\frac{1}{\ell^{1+\gamma}}\left(1+O\left(\frac{1}{\ell}\right)\right)$$

$$\frac{\mathbb{E}[X_{i}(n+1)^{k}]}{(n+1)^{\gamma k}}$$


Full text

Growth of Common Friends in a Preferential Attachment Model

Bikramjit Das (Singapore University of Technology and Design) and Souvik Ghosh (LinkedIn)

Singapore University of Technology and Design, 20 Dover Drive, Singapore 138682

LinkedIn Corporation, 700 E. Middlefield Road, Mountain View, CA 94043, USA


AMS subject classifications: 60F15, 60G42, 90B15, 91D30.

Keywords: heavy-tail, limit theorem, link prediction, preferential attachment, social network.

The authors gratefully acknowledge support from MOE Tier 2 grant MOE2017-T2-2-161.

1 Introduction

Networking platforms like LinkedIn, Facebook, Instagram and Twitter form a big part of our culture. These networks have facilitated an increasing number of personal as well as professional interactions. The platforms strive to grow the network (graph) both in terms of the number of users (nodes) and the number of friendships or connections (edges), since a more densely connected user network typically results in a more engaged user base. The platforms often use recommendation systems like People You May Know (LinkedIn, Facebook) or Who to Follow (Twitter) (Gupta et al., 2013), which recommend that individual users connect with other users on the platform. Such recommendation systems look for signals that indicate that two individuals might know each other. For example, a common friend between two users is a signal that they may know each other, and if two users have many friends in common then there is a high chance that they know each other. A generalization of this problem is link prediction in a network, which is well studied in the literature (Liben-Nowell and Kleinberg, 2007).

In this paper we establish the rate of growth of the number of common friends for a fixed pair of nodes in a linear preferential attachment model, a commonly used generative graph model. The preferential attachment model, made popular by Barabási and Albert (1999), is a very well-studied class of graph models. Studies have covered the behavior of the degree sequence (Bollobás et al., 2001; Samorodnitsky et al., 2016; Resnick and Samorodnitsky, 2015), the maximal degrees in a graph, second-order degree sequences (the size of the network of friends of friends) (van der Hofstad, 2017, Section 8), generalizations to sublinear preferential attachment (Dereich and Mörters, 2009) and the limiting structure of networks (Elwes, 2016). Other generalizations and extensions of these scale-free models have been studied in Cooper and Frieze (2003) and Bollobás et al. (2003). See van der Hofstad (2017) for an overview and precise definitions of the models. To the best of our knowledge this is the first theoretical study of the number of common friends in a graph.

Two important observations follow from our result:

• There is a phase transition in the asymptotic behavior of common friends. Depending on parameter values of the preferential attachment model, the number of common friends can exhibit a power-law or logarithmic growth or be static with the growth of the graph.

• A corollary of our result is that we can use sampling techniques to estimate the common friends in a large network. This is helpful because computing the number of common friends for every pair of nodes in a graph is computationally expensive, especially for large networks with hundreds of millions of nodes and hundreds of billions of edges.

This paper is organized as follows. In Section 2 we describe the linear preferential attachment model we work with and state the main result. In Section 3 we show some simulated results providing intuition for our results. We provide the proof of the main result and some required supplementary results in Section 4. We conclude by indicating directions for future work in Section 5.

2 Growth of Common Friends: Main Result

The model paradigm we work with is a version of the well-known undirected linear preferential attachment graph. The idea is that at every time instance when a new node comes to the network it creates $C$ independent edges and attaches to the previous nodes following a preferential attachment rule. The process is described as follows:

At any time $n\geq 1$ , the graph sequence is denoted $\text{PA}_{n}^{\delta,C}$ where $C\in\mathbb{N}^{*}=\{1,2,\ldots\}$ and $\delta>-C$ . Initially, the graph $\text{PA}_{1}^{\delta,C}$ has one node $v_{1}$ with $C$ self-loops. Then $\text{PA}_{n}^{\delta,C}$ evolves to $\text{PA}_{n+1}^{\delta,C}$ thus: at the $(n+1)^{\texttt{th}}$ stage, a new node named $v_{n+1}$ is added along with $C$ edges each of which has $v_{n+1}$ as one of its vertices, and the other vertex is selected from $V_{n}:=\{v_{1},v_{2},\ldots,v_{n}\}$ with probability proportional to the degree of the vertex (shifted by a parameter $\delta$ ) in $\text{PA}_{n}^{\delta,C}$ . For $1\leq i\leq n$ :

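$$p_{i,n+1}:=\mathbb{P}\left[v_{i}\leftrightarrow v_{n+1}\mid\text{PA}_{n}^{\delta,C}\right]=\frac{D_{i}(n)+\delta}{(2C+\delta)n}.$$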

Here $D_{i}(n):=$ degree of $v_{i}$ in $\text{PA}_{n}^{\delta,C}$ . The evolution of $D_{i}(n)$ occurs as:

$$D_{i}(n+1)=D_{i}(n)+\Delta_{i}(n+1),$$

where $\Delta_{i}(n+1):=$ the number of stubs of $v_{n+1}$ (out of $C$ ) which attaches to $v_{i}$ . Moreover at any stage $n$ , for $1\leq i<j\leq n$ , call

$$N_{ij}(n):=\#\text{ friends common to }i\text{ and }j.$$

We ignore multi-edges when counting $N_{ij}(n)$ in the graph $\text{PA}_{n}^{\delta,C}$ , that is, $v_{k}$ counts as one common friend (vertex) between $v_{i}$ and $v_{j}$ in $\text{PA}_{n}^{\delta,C}$ for $1\leq i\neq j\neq k\leq n$ , if $v_{i}\leftrightarrow v_{k}$ and $v_{k}\leftrightarrow v_{j}$ regardless of their multiplicity. Our goal is to understand the behavior of $N_{ij}(n)$ for $1\leq i<j\leq n$ as $n$ becomes large. Observe that in our model the growth of $N_{ij}(n)$ occurs via the recurrence relation

$$N_{ij}(n+1)=N_{ij}(n)+\boldsymbol{1}_{B_{ij}(n+1)},\tag{2.3}$$

where $B_{ij}(n+1)$ is the event $\{v_{i}\leftrightarrow v_{n+1}\leftrightarrow v_{j}\}$ . Also, note that the possible range of parameters is $C\geq 2$ and $\delta>-C$ .
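
The growth mechanism described above can be simulated directly. The following is a minimal Python sketch, not the authors' code: the function name `simulate_common_friends`, the use of Python's `random` module, and the simple quadratic-time weight rebuild are illustrative assumptions. It draws all $C$ targets of $v_{n+1}$ independently with weights $D_{i}(n)+\delta$ computed in $\text{PA}_{n}^{\delta,C}$, and it ignores self-loops and multi-edges when counting common friends.

```python
import random


def simulate_common_friends(n, C, delta, i, j, seed=None):
    """Grow PA_n^{delta,C} and return (N_ij(n), degree list).

    Illustrative sketch: nodes are labelled 1..n, degree[0] is unused padding.
    """
    if C < 1 or delta <= -C or not (1 <= i < j <= n):
        raise ValueError("need C >= 1, delta > -C and 1 <= i < j <= n")
    rng = random.Random(seed)
    degree = [0, 2 * C]            # v_1 starts with C self-loops (degree 2C)
    neighbours = [set(), set()]    # self-loops are not recorded as friends
    for new in range(2, n + 1):
        old_nodes = range(1, new)
        # Attachment weights use the degrees of the graph *before* v_new arrives.
        weights = [degree[v] + delta for v in old_nodes]
        targets = rng.choices(old_nodes, weights=weights, k=C)
        degree.append(C)
        neighbours.append(set())
        for t in targets:
            degree[t] += 1
            neighbours[t].add(new)     # sets collapse multi-edges
            neighbours[new].add(t)
    return len(neighbours[i] & neighbours[j]), degree


if __name__ == "__main__":
    count, _ = simulate_common_friends(500, 2, -1.5, 10, 20, seed=0)
    print("N_ij(500) =", count)
```

Because the weights are rebuilt at every step, this sketch is only meant for small graphs such as those in the simulation study; a production implementation would maintain the weights incrementally.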

The power-law growth of the degree of a specific node in a linear preferential attachment model is well known (Bollobás et al., 2001; van der Hofstad, 2017, Section 8).

Proposition 2.1.

For any fixed node $v_{i}$ , we have

$$\lim_{n\to\infty}\frac{D_{i}(n)}{n^{\gamma}}=D_{i}(\infty)\quad\text{a.s.},$$

where $\gamma=\frac{C}{2C+\delta}$ , and $D_{i}(\infty)$ is a non-negative random variable with $\mathbb{E}[D_{i}(\infty)]=(C+\delta)\frac{\Gamma(i)}{\Gamma(i+\gamma)}$ .

Proposition 2.1 can be derived using arguments from (van der Hofstad, 2017, Proposition 8.2) or using similar arguments as in the proof of the Proposition 4.4 provided in Section 4.
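
As a numerical illustration of the quantities in Proposition 2.1, the sketch below evaluates $\gamma$ and the stated value of $\mathbb{E}[D_{i}(\infty)]$. The function names are hypothetical, and `math.lgamma` is used only to avoid overflow in the Gamma ratio for large $i$.

```python
import math


def gamma_exponent(C, delta):
    """Growth exponent gamma = C / (2C + delta) of a fixed node's degree."""
    if delta <= -C:
        raise ValueError("need delta > -C")
    return C / (2 * C + delta)


def expected_degree_limit(i, C, delta):
    """(C + delta) * Gamma(i) / Gamma(i + gamma), as stated in Proposition 2.1."""
    g = gamma_exponent(C, delta)
    # Work on the log scale so that large i does not overflow.
    return (C + delta) * math.exp(math.lgamma(i) - math.lgamma(i + g))


if __name__ == "__main__":
    for node in (1, 10, 100):
        print(node, expected_degree_limit(node, C=2, delta=-1.5))
```

Older nodes (small $i$) have a larger expected limit, which is consistent with the intuition that early nodes accumulate more edges.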

Our main contribution is the following theorem which provides the growth rate of number of common friends of two nodes in such a model. The proof is given in Section 4.

Theorem 2.2.

Under the linear preferential attachment model, $(\text{PA}_{n}^{\delta,C})_{n\geq 1}$ , with $C\geq 2$ , for any two fixed nodes $v_{i},v_{j}$ , we have

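(1) if $\delta>0$,

$$\lim_{n\to\infty}N_{ij}(n)=N_{ij}(\infty)\quad\text{a.s.};$$

(2) if $\delta=0$,

$$\lim_{n\to\infty}\frac{N_{ij}(n)}{\log n}=\frac{C(C-1)}{(2C+\delta)^{2}}D_{i}(\infty)D_{j}(\infty)\quad\text{a.s.};$$

(3) if $\delta<0$,

$$\lim_{n\to\infty}\frac{N_{ij}(n)}{n^{2\gamma-1}/(2\gamma-1)}=\frac{C(C-1)}{(2C+\delta)^{2}}D_{i}(\infty)D_{j}(\infty)\quad\text{a.s.},$$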

where $\gamma=\frac{C}{2C+\delta}$ , $\gamma_{1}=(1-\frac{1}{\sqrt{C}})\gamma$ , $\gamma_{2}=(1+\frac{1}{\sqrt{C}})\gamma$ . Furthermore, $\mathbb{E}[N_{ij}(\infty)]<\infty$ and $Y_{ij}(\infty)=D_{i}(\infty)D_{j}(\infty)$ with $\mathbb{E}[Y_{ij}(\infty)]=(C+\delta)^{2}\frac{\Gamma(i)\Gamma(j)\Gamma(j+\gamma)}{\Gamma(i+\gamma)\Gamma(j+\gamma_{1})\Gamma(j+\gamma_{2})}$ ; $D_{i}(\infty)$ is the limit of the scaled degree sequence of node $v_{i}$ as defined in Proposition 2.1.

Remark 2.3.

An interesting observation is that the growth rate of the number of common friends falls into different regimes depending on the parameter $\delta$.

1. When $\delta>0$, we are in a mildly preferential attachment regime. In this regime, nodes with low degree also get a sufficient number of new friends. As $\delta$ increases, more nodes have a similar chance of being selected. Although the degree of a fixed node grows as a power law, the number of common friends between two fixed nodes has a finite expectation even in the limit.

2. For $\delta\leq 0$, and especially for $\delta$ close to $-C$, new nodes prefer to befriend nodes with a high degree. In this case the number of common friends tends to grow with the number of nodes: as a power law for $\delta<0$ and at a logarithmic rate for $\delta=0$.
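
The regimes in this remark can be summarised in a few lines of code. This is a small sketch with a hypothetical helper name, `growth_regime`; it only encodes the case distinction of Theorem 2.2 and the exponent $2\gamma-1$ of the power-law case.

```python
def growth_regime(C, delta):
    """Return (regime name, power-law exponent or None) for N_ij(n)."""
    if C < 2 or delta <= -C:
        raise ValueError("need C >= 2 and delta > -C")
    gamma = C / (2 * C + delta)
    if delta > 0:
        return "static", None          # N_ij(n) converges to a finite limit
    if delta == 0:
        return "logarithmic", None     # N_ij(n) grows like log n
    return "power-law", 2 * gamma - 1  # N_ij(n) grows like n^(2*gamma - 1)


if __name__ == "__main__":
    for d in (1.5, 0, -1.5):
        print(d, growth_regime(2, d))
```

For the parameters of the simulation study, $C=2$ and $\delta=-1.5$ give $\gamma=0.8$ and hence growth of order $n^{0.6}$.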

Corollary 2.4.

Under the preferential attachment model, $(\text{PA}_{n}^{\delta,C})_{n\geq 1}$ , with $C\geq 2$ , for any two fixed nodes $v_{i},v_{j}$ and $k>1$ , we have

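(1) if $\delta>0$,

$$\lim_{n\to\infty}\frac{N_{ij}(n)}{N_{ij}(\lfloor n/k\rfloor)}=1\quad\text{a.s.};$$

(2) if $\delta=0$,

$$\lim_{n\to\infty}\frac{N_{ij}(n)}{N_{ij}(\lfloor n/k\rfloor)}=1\quad\text{a.s.};$$

(3) if $\delta<0$,

$$\lim_{n\to\infty}\frac{N_{ij}(n)}{N_{ij}(\lfloor n/k\rfloor)\,k^{2\gamma-1}}=1\quad\text{a.s.}$$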

Proof.

The result is an easy application of the almost sure convergences of $N_{ij}(n)$ observed in Theorem 2.2. ∎
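
For instance, part (3) can be sketched as follows. This is an illustrative derivation rather than the authors' written argument, and it assumes additionally that the limit in Theorem 2.2(3) is almost surely positive so that the ratio is well defined for large $n$. Writing $m=\lfloor n/k\rfloor$:

```latex
\frac{N_{ij}(n)}{N_{ij}(m)\,k^{2\gamma-1}}
  = \frac{N_{ij}(n)/n^{2\gamma-1}}{N_{ij}(m)/m^{2\gamma-1}}
    \left(\frac{n}{k\,m}\right)^{2\gamma-1}
  \longrightarrow 1 \quad \text{a.s. as } n\to\infty,
```

since the numerator and the denominator of the first factor converge almost surely to the same limit by Theorem 2.2(3), and $n/(k\lfloor n/k\rfloor)\to 1$.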

The above corollary states that we can consistently estimate the number of common friends for a given pair of nodes using an earlier state of the graph, i.e., for any $k>1$

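$$\hat{N}^{k}_{ij}(n)=\begin{cases}N_{ij}(\lfloor n/k\rfloor)&\text{if }\delta\geq 0,\\ N_{ij}(\lfloor n/k\rfloor)\,k^{2\gamma-1}&\text{if }\delta<0.\end{cases}$$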

For a large value of $k$, the graph $\text{PA}_{\lfloor n/k\rfloor}^{\delta,C}$ can be significantly smaller than $\text{PA}_{n}^{\delta,C}$, so computing the number of common friends on it is significantly cheaper.
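
The estimator can be written as a short function. The sketch below assumes that the count $N_{ij}(\lfloor n/k\rfloor)$ has already been computed on the smaller graph and that $C$ and $\delta$ are known; the function and argument names are hypothetical.

```python
def estimate_common_friends(count_early, k, C, delta):
    """Estimate N_ij(n) from count_early = N_ij(floor(n / k)).

    For delta >= 0 the early count is used as is; for delta < 0 it is
    rescaled by k^(2*gamma - 1) with gamma = C / (2C + delta).
    """
    if k <= 1:
        raise ValueError("need k > 1")
    if delta <= -C:
        raise ValueError("need delta > -C")
    if delta >= 0:
        return float(count_early)
    gamma = C / (2 * C + delta)
    return count_early * k ** (2 * gamma - 1)


if __name__ == "__main__":
    # 10 common friends seen in the graph at a quarter of its final size.
    print(estimate_common_friends(10, k=4, C=2, delta=1.5))
    print(estimate_common_friends(10, k=4, C=2, delta=-1.5))
```

In practice $\delta$ would have to be estimated from data, which this sketch does not attempt.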

3 Simulation Study

We illustrate the key idea behind Theorem 2.2 in the following simulated examples. Figure 1 shows instances of graphs simulated from the preferential attachment model with 20 nodes and $C=2$; the left one has $\delta=-1.5$ and the right one has $\delta=1.5$. When $\delta=-1.5$ we observe that the graph grows quite preferentially. New nodes tend to connect with the same few nodes and hence the number of common friends of those nodes keeps growing fast. When $\delta=1.5$, the graph is more distributed. New nodes tend to connect with different nodes and hence the number of common friends does not grow as much.

To understand the asymptotic behavior of the number of common friends, we simulate larger graphs and replicate the exercise multiple times. The left plot in Figure 2 is the histogram of the number of common friends of two fixed nodes ($v_{10}$ and $v_{20}$) in $\text{PA}^{-1.5,2}_{500}$ over 2500 replications; it exhibits the heavy-tailed phenomenon. The right plot of Figure 2 shows trajectories of the number of common friends for 5 arbitrary simulations as the size of the network grows.

Figure 3 shows the behavior of common friends when $\delta=1.5$ . As expected, we see that common friends do not grow that fast in this case.

We also check the validity of Corollary 2.4 using simulations. Figure 4 provides simulation results for the estimator $\hat{N}^{k}_{ij}(n)$ for a $\text{PA}^{1.5,2}_{n}$ model. The first row shows the histogram of 500 simulations of $N_{ij}(n)/\hat{N}^{k}_{ij}(n)$ for $k=2$ and $n=1000,2000,4000$ . The second row shows the same for $k=4$ . We see the concentration of $N_{ij}(n)/\hat{N}^{k}_{ij}(n)$ near 1 as $n$ increases.

4 Proof of Main Result

In this section, we prove Theorem 2.2 by observing the asymptotic behavior of $D_{i}(n)$ , $(D_{i}(n),D_{j}(n))$ jointly and functions thereof. Recall that our preferential attachment graph sequence is $(\text{PA}_{n}^{\delta,C})_{n\geq 1}$ . We assume that $C\geq 2$ and $\delta>-C$ for all the results in this section. For convenience’s sake we use the following notations:

$$X_{i}(n):=D_{i}(n)+\delta,\qquad Y_{ij}(n):=X_{i}(n)X_{j}(n).$$

From Proposition 2.1 we get

$$\lim_{n\to\infty}\frac{X_{i}(n)}{n^{\gamma}}=D_{i}(\infty)\quad\text{a.s.},$$

where $\gamma=\frac{C}{2C+\delta}$ .

In the next few steps we apply the Martingale Convergence Theorem to show almost sure convergence of the appropriately scaled sequence of random variables $\{Y_{ij}(n)\}$. We also prove uniform integrability of the scaled sequences $\{X_{i}(n)\}$ and $\{Y_{ij}(n)\}$ so that we additionally have convergence in $\mathcal{L}_{1}$ and can compute the expectation of the limit.

Lemma 4.1.

For any $k\geq 1$ and for a fixed $i$ , we have

$$\sup_{n\geq i}\frac{\mathbb{E}[X_{i}(n)^{k}]}{n^{k\gamma}}<\infty.$$

Proof.

We prove the result by induction on $k$ . We have $X_{i}(n+1)=X_{i}(n)+\Delta_{i}(n+1)$ and $\Delta_{i}(n+1)|\text{PA}_{n}^{\delta,C}\sim\text{Binomial}(C,p_{i,n+1})$ . Hence for $k=1$ with $1\leq i\leq n$ we get

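$$\mathbb{E}\left[X_{i}(n+1)\mid\text{PA}_{n}^{\delta,C}\right]=(D_{i}(n)+\delta)+C\cdot\frac{D_{i}(n)+\delta}{(2C+\delta)n}=X_{i}(n)\left(1+\frac{\gamma}{n}\right).$$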

We also have $\mathbb{E}[X_{i}(i)]=D_{i}(i)+\delta=C+\delta.$ Now

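$$\begin{aligned}
\mathbb{E}[X_{i}(n+1)]&=\mathbb{E}[X_{i}(i)]\left(\frac{i+\gamma}{i}\right)\left(\frac{i+1+\gamma}{i+1}\right)\cdots\left(\frac{n+\gamma}{n}\right)\\
&=(C+\delta)\frac{\Gamma(i)}{\Gamma(i+\gamma)}\frac{\Gamma(n+1+\gamma)}{\Gamma(n+1)}=(C+\delta)\frac{\Gamma(i)}{\Gamma(i+\gamma)}(n+1)^{\gamma}\left(1+O\left(\frac{1}{n}\right)\right)
\end{aligned}$$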

using Stirling’s formula (Abramowitz and Stegun, 2012) given by

$$\frac{\Gamma(n+a)}{\Gamma(n)}=n^{a}\left(1+O\left(\frac{1}{n}\right)\right).$$

Hence for any $n\geq i$ ,

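$$\frac{\mathbb{E}[X_{i}(n)]}{n^{\gamma}}=(C+\delta)\frac{\Gamma(i)}{\Gamma(i+\gamma)}\left(1+O\left(\frac{1}{n}\right)\right)<\infty.$$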

Thus the result holds for $k=1$ . By induction hypothesis, let the result be true for $j=1,\ldots,k-1$ and we have constants $C_{j}$ such that

$$\sup_{n\geq i}\frac{\mathbb{E}[X_{i}(n)^{j}]}{n^{j\gamma}}\leq C_{j}<\infty.$$

Denoting $p=p_{i,n+1}=\frac{X_{i}(n)}{(2C+\delta)n}$ we get

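$$\begin{aligned}
\mathbb{E}[X_{i}(n+1)^{k}]
&=\mathbb{E}\left[\mathbb{E}\left[[X_{i}(n)+\Delta_{i}(n+1)]^{k}\Big|\text{PA}_{n}^{\delta,C}\right]\right]\\
&=\mathbb{E}\left[\mathbb{E}\left[X_{i}(n)^{k}+kX_{i}(n)^{k-1}\Delta_{i}(n+1)+\binom{k}{2}X_{i}(n)^{k-2}\Delta_{i}(n+1)^{2}+\cdots\Big|\text{PA}_{n}^{\delta,C}\right]\right]\\
&=\mathbb{E}\left[X_{i}(n)^{k}+kX_{i}(n)^{k-1}Cp+\binom{k}{2}X_{i}(n)^{k-2}\left(Cp(1-p)+(Cp)^{2}\right)+\cdots\right]\\
&=\mathbb{E}\left[X_{i}(n)^{k}+X_{i}(n)^{k}\frac{k\gamma}{n}+X_{i}(n)^{k}\frac{k(k-1)(C-1)}{2C}\left(\frac{\gamma}{n}\right)^{2}+X_{i}(n)^{k-1}\frac{k(k-1)}{2}\frac{\gamma}{n}+\cdots\right]\\
&=\mathbb{E}[X_{i}(n)^{k}]\left(1+\frac{k\gamma}{n}+O\left(\frac{1}{n^{2}}\right)\right)+\sum_{r=1}^{k-1}\mathbb{E}[X_{i}(n)^{k-r}]\left(\alpha_{k-r}\frac{\gamma}{n}+O\left(\frac{1}{n^{2}}\right)\right)\\
&\leq\mathbb{E}[X_{i}(n)^{k}]\left(1+\frac{k\gamma}{n}+O\left(\frac{1}{n^{2}}\right)\right)+\sum_{r=1}^{k-1}C_{k-r}n^{(k-r)\gamma}\left(\alpha_{k-r}\frac{\gamma}{n}+O\left(\frac{1}{n^{2}}\right)\right)\quad\text{(by the induction hypothesis)}\\
&\leq\mathbb{E}[X_{i}(n)^{k}]\left(1+\frac{k\gamma}{n}+O\left(\frac{1}{n^{2}}\right)\right)+C^{*}n^{(k-1)\gamma-1}\\
&=:a_{n}\mathbb{E}[X_{i}(n)^{k}]+b_{n},
\end{aligned}\tag{4.4}$$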

where the $\alpha_{i}$'s and $C^{*}$ (appropriately chosen) are constants, and we denote $a_{n}=1+\frac{k\gamma}{n}+O\left(\frac{1}{n^{2}}\right)$ , $b_{n}=C^{*}n^{(k-1)\gamma-1}$ . Now using (4.4) recursively we get

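$$\mathbb{E}[X_{i}(n+1)^{k}]\leq\mathbb{E}[X_{i}(i)^{k}]\prod_{r=i}^{n}a_{r}+\sum_{\ell=i}^{n}b_{\ell}\prod_{r=\ell+1}^{n}a_{r}.$$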

Using Stirling’s formula we have for any $\ell<n$ ,

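$$\prod_{r=\ell+1}^{n}a_{r}=\frac{n^{k\gamma}}{\ell^{k\gamma}}\left(1+O\left(\frac{1}{\ell}\right)\right)\quad(n>\ell\to\infty).$$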

Therefore,

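$$\mathbb{E}[X_{i}(n+1)^{k}]\leq A_{1}n^{k\gamma}\left(1+O\left(\frac{1}{n}\right)\right)+C^{*}n^{k\gamma}\sum_{\ell=i}^{n}\frac{1}{\ell^{1+\gamma}}\left(1+O\left(\frac{1}{\ell}\right)\right),$$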

where $A_{1}={(C+\delta)^{k}}/(a_{1}\ldots a_{i-1})$ . Hence dividing both sides by $(n+1)^{k\gamma}$ we get

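$$\frac{\mathbb{E}[X_{i}(n+1)^{k}]}{(n+1)^{\gamma k}}\leq A_{1}\left(1+O\left(\frac{1}{n}\right)\right)+C^{*}\sum_{\ell=i}^{n}\frac{1}{\ell^{1+\gamma}}\left(1+O\left(\frac{1}{\ell}\right)\right)\leq C_{k}<\infty.$$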

Hence the result holds for $j=k$ . ∎
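
The $k=1$ step of this proof can be checked numerically. The sketch below, with hypothetical function names, compares the telescoping product obtained by iterating $\mathbb{E}[X_{i}(m+1)]=\mathbb{E}[X_{i}(m)](1+\gamma/m)$ from $\mathbb{E}[X_{i}(i)]=C+\delta$ with the closed form in terms of Gamma functions.

```python
import math


def mean_by_product(i, n, C, delta):
    """E[X_i(n+1)] from iterating E[X_i(m+1)] = E[X_i(m)] * (1 + gamma/m), m = i..n."""
    gamma = C / (2 * C + delta)
    value = C + delta              # E[X_i(i)] = C + delta
    for m in range(i, n + 1):
        value *= 1 + gamma / m
    return value


def mean_by_gamma(i, n, C, delta):
    """(C+delta) * Gamma(i)/Gamma(i+gamma) * Gamma(n+1+gamma)/Gamma(n+1)."""
    gamma = C / (2 * C + delta)
    log_ratio = (math.lgamma(i) - math.lgamma(i + gamma)
                 + math.lgamma(n + 1 + gamma) - math.lgamma(n + 1))
    return (C + delta) * math.exp(log_ratio)


if __name__ == "__main__":
    a = mean_by_product(3, 500, 2, -1.0)
    b = mean_by_gamma(3, 500, 2, -1.0)
    print(a, b, a / 501 ** (2 / 3))   # the last value is the n^gamma-scaled mean
```

The two evaluations agree up to floating-point error, and dividing by $(n+1)^{\gamma}$ gives a value close to $(C+\delta)\Gamma(i)/\Gamma(i+\gamma)$, in line with the $O(1/n)$ correction above.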

Lemma 4.2.

For any $k\geq 1$ , we have for a fixed $i<j$ ,

$$\sup_{n\geq j}\frac{\mathbb{E}[Y_{ij}(n)^{k}]}{n^{2k\gamma}}<\infty.$$

Proof.

This follows from Lemma 4.1 and the Cauchy-Schwarz inequality. ∎

Remark 4.3.

Since both sequences $\left\{{X_{i}(n)}/{n^{\gamma}}\right\}_{n\geq i}$ and $\left\{{Y_{ij}(n)}/{n^{2\gamma}}\right\}_{n\geq j}$ are $\mathcal{L}^{k}$ bounded for some $k>1$ by Lemmas 4.1 and 4.2, they are also uniformly integrable; see (Durrett, 2019, Theorem 4.6.2).

The next Proposition 4.4 describes the asymptotic behavior of product of the degrees of two nodes, which as expected also has a power-law growth.

Proposition 4.4.

For any $i<j$ we have

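$$\lim_{n\to\infty}\frac{Y_{ij}(n)}{n^{2\gamma}}=Y_{ij}(\infty)=D_{i}(\infty)D_{j}(\infty)\quad\text{a.s.},$$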

where $\mathbb{E}[Y_{ij}(\infty)]=(C+\delta)^{2}\frac{\Gamma(i)\Gamma(j)\Gamma(j+\gamma)}{\Gamma(i+\gamma)\Gamma(j+\gamma_{1})\Gamma(j+\gamma_{2})}=:C_{ij}$ with $\gamma=\frac{C}{2C+\delta}$ , $\gamma_{1}=(1-\frac{1}{\sqrt{C}})\gamma$ , $\gamma_{2}=(1+\frac{1}{\sqrt{C}})\gamma$ . Here $D_{i}(\infty),D_{j}(\infty)$ are as defined in Proposition 2.1.

Proof.

Note that

[TABLE]

and writing $p_{i}=p_{i,n+1},p_{j}=p_{j,n+1}$ we have

[TABLE]

where $\gamma=\frac{C}{2C+\delta},\gamma_{1}=(1-\frac{1}{\sqrt{C}})\gamma,\gamma_{2}=(1+\frac{1}{\sqrt{C}})\gamma$ . Moreover, for $i<j$ we have,

[TABLE]

Therefore,

[TABLE]

Define

[TABLE]

using Stirling’s formula. Hence by Lemma 4.2, $\{W_{ij}(n)\}_{n\geq j}$ is uniformly integrable. Moreover $W_{ij}(n)\geq 0$, $\mathbb{E}W_{ij}(n)=1$, and $\mathbb{E}[W_{ij}(n+1)|W_{ij}(n)]=W_{ij}(n)$ for $1\leq i\neq j\leq n$ . Hence by Doob’s Martingale Convergence Theorem (Durrett, 2019, Theorem 4.2.11 and Theorem 4.6.4)

[TABLE]

where $W_{ij}(\infty):=\limsup_{n}W_{ij}(n)$ and $\mathbb{E}W_{ij}(\infty)<\infty$ . Hence we have

[TABLE]

both almost surely and in $\mathcal{L}^{1}$ with $\mathbb{E}(Y_{ij}(\infty))=C_{ij}$ . From Proposition 2.1 and (4.1) we can check that, as $n\to\infty$ , ${Y_{ij}(n)}/{n^{2\gamma}}\to D_{i}(\infty)D_{j}(\infty)$ a.s.; hence $Y_{ij}(\infty)=D_{i}(\infty)D_{j}(\infty)$ a.s. ∎

Lemma 4.5.

For any $i<j$ , we have

[TABLE]

where $Y_{ij}(\infty)$ is as defined in Proposition 4.4.

Proof.

Let all our random variables be defined on the probability space $(\Omega,\mathcal{A},\mathbb{P})$ . From Proposition 4.4, for $\omega\in\Omega$ ,

[TABLE]

holds with probability 1. Fix such an $\omega\in\Omega$ . Then given any small $\epsilon>0$ , there exists $n_{0}\in\mathbb{N}^{*}$ such that for any $n\geq n_{0}$ ,

[TABLE]

(1) If $\delta=0$ , we have $\gamma=\frac{1}{2}$ . Also for $n\to\infty$ , we have $\displaystyle{\sum_{k=1}^{n}\frac{1}{k^{2-2\gamma}}=\sum_{k=1}^{n}\frac{1}{k}\sim\log(n)}$ . Hence

[TABLE]

Check that as $n\to\infty$ , ${\rm{I}}_{n}\to 0,{\rm{III}}_{n}\to 0$ . Since $\gamma=\frac{1}{2}$ using (4.7) we get

[TABLE]

Therefore we have,

[TABLE]

By Proposition 4.4, (4.6) holds almost surely implying (4.8) holds almost surely and hence Lemma 4.5(1) holds.

(2) For $\delta<0$ , we get $\frac{1}{2}<\gamma=\frac{C}{2C+\delta}<1$ which means $0<2-2\gamma<1$ . Note that for any $0<\alpha<1$ , we have

[TABLE]

Now we can prove Lemma 4.5(2) in the same manner as we proved (1). ∎
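
The two normalisations used in this proof, $\log n$ for $\delta=0$ and $n^{2\gamma-1}/(2\gamma-1)$ for $\delta<0$, come from the growth of the deterministic sums $\sum_{k\leq n}k^{-(2-2\gamma)}$. The following sketch, with hypothetical function names, checks that growth numerically; it concerns only the deterministic sums and not the random quantities in the lemma.

```python
import math


def partial_sum(n, alpha):
    """Sum of k^(-alpha) for k = 1..n."""
    return sum(k ** -alpha for k in range(1, n + 1))


def normalized_sum(n, C, delta):
    """sum_{k<=n} k^(-(2-2*gamma)) divided by its leading-order growth."""
    if delta > 0:
        raise ValueError("the sum converges for delta > 0; no normalisation needed")
    gamma = C / (2 * C + delta)
    total = partial_sum(n, 2 - 2 * gamma)
    if delta == 0:
        return total / math.log(n)                       # harmonic sum ~ log n
    return total / (n ** (2 * gamma - 1) / (2 * gamma - 1))


if __name__ == "__main__":
    for size in (10**3, 10**4, 10**5):
        print(size, normalized_sum(size, 2, 0), normalized_sum(size, 2, -1.5))
```

Both ratios approach 1 as $n$ grows, with the logarithmic case converging noticeably more slowly.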

Lemma 4.6.

For any $i<j$ , we have

[TABLE]

where $Y_{ij}(\infty)$ is as defined in Proposition 4.4.

Proof.

Define

[TABLE]

Note that for $0<\gamma<1$ ,

[TABLE]

which holds a.s. using Propositions 2.1 and 4.4. Now we can proceed to prove the statements using the same arguments as in Lemma 4.5 by replacing $Y_{ij}$ with $Y_{ij}^{*}$ . ∎

With the aid of all the results above we are in a position to prove Theorem 2.2.

Proof of Theorem 2.2.

Using (2.3) recursively we have

[TABLE]

For any $1\leq i<j\leq n$ with $C\geq 2$ ,

[TABLE]

We can check that,

[TABLE]

holding with equality for $C=2$ and

[TABLE]

Proof of part (1). First we prove the case when $\delta>0$ . Clearly as $n\to\infty$ , $N_{ij}(n+1)\uparrow N_{ij}(\infty)$ where

[TABLE]

We want to show that $N_{ij}(\infty)<\infty$ a.s. Taking expectations in (4.10) and using (4.5) we get

[TABLE]

Applying this argument recursively we get

[TABLE]

where $\tilde{C}=\frac{C(C-1)C_{ij}}{(2C+\delta)^{2}}$ . Therefore for any $n$ we have

[TABLE]

since $2-(\gamma_{1}+\gamma_{2})=2-2\gamma=1+\frac{\delta}{2C+\delta}>1$ for $\delta>0$ . Since the right hand side in (4.12) does not depend on $n$ , $\sum_{k=j+1}^{\infty}\mathbb{P}({B_{ij}(k)})<\infty$ . By the Borel-Cantelli Lemma (Durrett, 2019, Theorem 2.3.1) this implies

[TABLE]

and hence $\sum_{k=j+1}^{\infty}\boldsymbol{1}_{B_{ij}(k)}<\infty$ a.s., and since $N_{{ij}}(j)\leq\max(j-2,C)$ we have

[TABLE]

Proof of part (2). Here we address the case where $\delta=0$ . Define

[TABLE]

Using the conditional Borel-Cantelli Lemma (Durrett, 2019, Theorem 4.4.5) we have

[TABLE]

Note that, using (4.10) and (4.11), we have for $i<j<n$ ,

[TABLE]

Using the above recursively we obtain

[TABLE]

Now, from Lemmas 4.5(1) and 4.6(1) we have

[TABLE]

Therefore we get

[TABLE]

Hence from (4.13) and (4.14) we have

[TABLE]

Proof of part (3). The case where $\delta<0$ can be shown using the same technique as for $\delta=0$ by using Lemmas 4.5(2) and 4.6(2) in place of Lemmas 4.5(1) and 4.6(1). ∎

5 Conclusion

In this paper we establish the rate of growth of the number of common friends for two fixed nodes in a linear preferential attachment model. The growth is shown to be static, logarithmic or power-law depending on whether the parameter satisfies $\delta>0$, $\delta=0$ or $\delta<0$, respectively. We use this result to prove consistency of an estimator of the number of common friends that is less expensive to compute. Such results will be applicable both to link prediction problems for large dynamic networks and to detection methods for a preferential attachment model.

This is the first step in showing a more general result regarding the growth behavior for common friends of any randomly chosen pair of nodes and obtaining uniform convergence bounds for estimators of common friends. Further properties of such models and estimation issues are under current investigation.

6 Acknowledgement

The authors are very grateful to the referee for insightful comments and also for providing us with precise ideas to fill gaps in parts of the proof of Theorem 2.2.

Bibliography

1. Abramowitz, M. and Stegun, I. A. (2012). Handbook of Mathematical Functions: with Formulas, Graphs, and Mathematical Tables. Courier Corporation.
2. Barabási, A. and Albert, R. (1999). Emergence of scaling in random networks. Science 286 509–512.
3. Bollobás, B., Riordan, O., Spencer, J. and Tusnády, G. (2001). The degree sequence of a scale-free random graph process. Random Structures Algorithms 18 279–290.
4. Bollobás, B., Borgs, C., Chayes, J. and Riordan, O. (2003). Directed scale-free graphs. In Proceedings of the Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms (Baltimore, 2003) 132–139.
5. Cooper, C. and Frieze, A. (2003). A general model of web graphs. Random Structures & Algorithms 22 311–335.
6. Dereich, S. and Mörters, P. (2009). Random networks with sublinear preferential attachment: Degree evolutions. Electronic Journal of Probability 43 1222–1267.
7. Durrett, R. T. (2019). Probability: Theory and Examples, fifth ed. Cambridge Series in Statistical and Probabilistic Mathematics 49. Cambridge University Press, Cambridge.
8. Elwes, R. (2016). A Linear Preferential Attachment Process Approaching the Rado Graph. http://arxiv.org/abs/1603.08806v2.
