Modularity of complex networks models
Liudmila Ostroumova Prokhorenkova, Pawel Pralat, Andrei Raigorodskii

TL;DR
This paper examines the concept of modularity in complex networks, comparing spatial and non-spatial models, providing theoretical insights, and discussing implications for community detection and model selection.
Contribution
It offers theoretical results for classical and preferential attachment models and contrasts them with spatial models, enhancing understanding of modularity in different network types.
Findings
Classical random d-regular graphs have low modularity.
Spatial preferential attachment models naturally produce high modularity.
Results aid in statistical testing and model selection for network clustering.
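Table 1: Upper bounds for the modularity of random $d$-regular graphs that hold a.a.s.

| $d$ | Upper bound (3) on $q^*$ | Upper bound on $q^*_\delta$ | Upper bound of Theorem 2 |
|---|---|---|---|
| 3 | 0.9386 | 0.8771 | 0.8038 |
| 4 | 0.8900 | 0.7800 | 0.6834 |
| 5 | 0.8539 | 0.7078 | 0.6024 |
| 6 | 0.8261 | 0.6521 | 0.5435 |
| 7 | 0.8038 | 0.6076 | 0.4984 |
| 8 | 0.7855 | 0.5710 | 0.4624 |
| 9 | 0.7702 | 0.5403 | 0.4330 |
| 10 | 0.7570 | 0.5140 | 0.4083 |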
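Table 2: Lower bounds for the modularity of the preferential attachment model.

| $m$ | 7 | 8 | 9 | 10 | 100 | 1000 |
|---|---|---|---|---|---|---|
| Lower bound of Theorem 10 | 0.142 | 0.125 | 0.111 | 0.100 | 0.0100 | 0.0010 |
| Lower bound of Theorem 12 | 0.156 | 0.136 | 0.136 | 0.123 | 0.0397 | 0.0126 |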
Liudmila Ostroumova Prokhorenkova (1, 2)
Paweł Prałat (3, 4)
Andrei Raigorodskii (1, 2, 5, 6)
(1) Moscow Institute of Physics and Technology, Moscow, Russia
(2) Yandex, Moscow, Russia
(3) Ryerson University, Toronto, ON, Canada
(4) The Fields Institute for Research in Mathematical Sciences, Toronto, ON, Canada
(5) Moscow State University, Moscow, Russia
(6) Buryat State University, Ulan-Ude, Buryat Republic, Russia
Abstract
Modularity is designed to measure the strength of division of a network into clusters (known also as communities). Networks with high modularity have dense connections between the vertices within clusters but sparse connections between vertices of different clusters. As a result, modularity is often used in optimization methods for detecting community structure in networks, and so it is an important graph parameter from a practical point of view. Unfortunately, many existing non-spatial models of complex networks do not generate graphs with high modularity; on the other hand, spatial models naturally create clusters. We investigate this phenomenon by considering a few examples from both sub-classes. We prove precise theoretical results for the classical model of random $d$-regular graphs as well as the preferential attachment model, and contrast these results with the ones for the spatial preferential attachment (SPA) model, a model for complex networks in which vertices are embedded in a metric space, and each vertex has a sphere of influence whose size increases if the vertex gains an in-link, and otherwise decreases with time. The results obtained in this paper can be used for developing statistical tests for model selection and for measuring the statistical significance of clusters observed in complex networks.
1 Introduction
Many social, biological, and information systems can be represented by networks, whose vertices are items and links are relations between these items [2, 7, 9, 16]. That is why the evolution of complex networks attracted a lot of attention in recent years and there has been a great deal of interest in modelling of these networks [12, 20, 42]. The hyperlinked structure of the Web, citation patterns, friendship relationships, infectious disease spread are seemingly disparate linked data sets which have fundamentally very similar natures. Indeed, it turns out that many real-world networks have some typical properties: heavy tailed degree distribution, small diameter, high clustering coefficient, and others [39, 41, 47]. Such properties are well-studied both in real-world networks and in many theoretical models.
Another important property of complex networks is their community structure, that is, the organization of vertices in clusters, with many edges joining vertices of the same cluster and comparatively few edges joining vertices of different clusters [24, 28]. In social networks communities may represent groups by interest, in citation networks they correspond to related papers, in the Web communities are formed by pages on related topics, etc. Being able to identify communities in a network could help us to exploit this network more effectively. For example, clusters in citation graphs may help to find similar scientific papers, discovering users with similar interests is important for targeted advertisement, clustering can also be used for network compression and visualization.
The key ingredient for many clustering algorithms is modularity, which is at the same time a global criterion to define communities, a quality function of community detection algorithms, and a way to measure the presence of community structure in a network. Modularity was introduced by Newman and Girvan [43] and it is based on the comparison between the actual density of edges inside a community and the density one would expect to have if the vertices of the graph were attached at random, regardless of community structure.
Unfortunately, modularity is not a well-studied parameter for the existing random graph models, at least from a rigorous, theoretical point of view. We are only aware of results for binomial random graphs and random $d$-regular graphs (see Section 2.3 for more details). In this paper, we continue investigating random $d$-regular graphs and obtain new upper bounds for their modularity. Then we move to the preferential attachment model, introduced by Barabási and Albert [8], which is probably the most well-studied model of complex networks. For this model no results on modularity are known and we obtain both lower and upper bounds. In fact, one of the lower bounds we present holds for all graphs with average degree $d$ and sublinear maximum degree.
As expected, the models discussed above, as well as many others, have a common weakness of low modularity. One family of models which overcomes this deficiency is the family of spatial (or geometric) models, wherein the vertices are embedded in a metric space such that similar vertices are closer to each other than dissimilar ones. The underlying geometry of spatial models naturally leads to the emergence of clusters. We prove this statement rigorously for one example of a geometric model, the Spatial Preferential Attachment model introduced in [1].
This paper is a journal version of [44] and is structured as follows. In the next section, we formally define modularity, discuss several random graph models and present known results on modularity in these models. In Sections 3, 5 and 6 we analyze modularity in random $d$-regular graphs, preferential attachment and SPA models, respectively. In Section 4 we discuss lower bounds for modularity of forests and constant average degree graphs. Section 7 concludes the paper and outlines the directions for future research.
2 Preliminaries
2.1 Modularity
The definition of modularity was first introduced by Newman and Girvan in [43]. Since then, many popular and applied algorithms used to find clusters in large data-sets are based on finding partitions with high modularity [18, 34, 40]. The modularity function favours partitions in which a large proportion of the edges fall entirely within the parts and biases against having too few or too unequally sized parts. Formally, for a graph $G = (V, E)$ and a given partition $\mathcal{A} = \{A_1, \ldots, A_k\}$ of the vertex set $V$, let
$$q(\mathcal{A}) = \sum_{i=1}^{k} \left( \frac{e(A_i)}{|E|} - \frac{\left( \sum_{v \in A_i} \deg(v) \right)^2}{4|E|^2} \right), \qquad (1)$$
where $e(A_i)$ is the number of edges in the graph induced by the set $A_i$. The first term, $\sum_{i=1}^{k} e(A_i)/|E|$, is called the edge contribution, whereas the second one, $\sum_{i=1}^{k} \left( \sum_{v \in A_i} \deg(v) \right)^2 / (4|E|^2)$, is called the degree tax. It is easy to see that $q(\mathcal{A})$ is always smaller than one. Also, if $\mathcal{A} = \{V\}$, then $q(\mathcal{A}) = 0$.
The modularity $q^*(G)$ is defined as the maximum of $q(\mathcal{A})$ over all possible partitions $\mathcal{A}$ of $V$; that is,
$$q^*(G) = \max_{\mathcal{A}} q(\mathcal{A}).$$
In order to maximize $q(\mathcal{A})$ one wants to find a partition with large edge contribution subject to small degree tax. If $q^*(G)$ approaches 1 (which is the maximum), we observe a strong community structure; conversely, if $q^*(G)$ is close to zero, we are given a graph with no community structure.
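As a concrete illustration of the two terms, the following minimal sketch evaluates the edge contribution and the degree tax of a given partition, assuming the standard Newman-Girvan form of the definition (fraction of edges inside parts, minus the sum of squared degree fractions). The function name `modularity` and the toy graph are illustrative choices, not taken from the paper.

```python
from collections import defaultdict

def modularity(edges, partition):
    """Return q(A) = edge contribution - degree tax for a partition.

    edges: list of (u, v) pairs of an undirected (multi)graph;
           a loop (v, v) adds 2 to the degree of v.
    partition: list of disjoint vertex sets covering all vertices.
    """
    m = len(edges)
    part_of = {v: i for i, part in enumerate(partition) for v in part}
    deg = defaultdict(int)
    inside = [0] * len(partition)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
        if part_of[u] == part_of[v]:
            inside[part_of[u]] += 1
    edge_contribution = sum(inside) / m
    degree_tax = sum((sum(deg[v] for v in part) / (2 * m)) ** 2
                     for part in partition)
    return edge_contribution - degree_tax

# Toy graph: two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
q_two_parts = modularity(edges, [{0, 1, 2}, {3, 4, 5}])     # 6/7 - 1/2
q_one_part = modularity(edges, [{0, 1, 2, 3, 4, 5}])        # trivial partition
```

The trivial one-part partition always scores zero, while splitting the toy graph along its bridge trades a small loss in edge contribution for a halved degree tax.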
Modularity is known to have some weaknesses, as discussed in [24]. For example, [25] shows that this measure fails to detect communities if their sizes are too small. Despite this, modularity remains the most popular measure used by many well-known clustering algorithms [18, 34, 40].
2.2 Random graph models
Random $d$-regular graphs.
We consider the probability space of random $d$-regular graphs with uniform probability distribution. This space is denoted $\mathcal{G}_{n,d}$, and asymptotics are for $n \to \infty$ with $d \ge 2$ fixed, and $n$ even if $d$ is odd.
We say that an event in a probability space holds asymptotically almost surely (or a.a.s.) if the probability that it holds tends to $1$ as $n$ goes to infinity. Since we aim for results that hold a.a.s., we will always assume that $n$ is large enough.
Preferential Attachment.
The Preferential Attachment (PA) model [8] was an early stochastic model of complex networks. We will use the following precise definition of the model, as considered by Bollobás and Riordan in [13] as well as Bollobás, Riordan, Spencer, and Tusnády [14].
Let $G_1^0$ be the null graph with no vertices (or let $G_1^1$ be the graph with one vertex, $v_1$, and one loop). The random graph process $(G_1^t)_{t \ge 0}$ is defined inductively as follows. Given $G_1^{t-1}$, we form $G_1^t$ by adding a vertex $v_t$ together with a single edge between $v_t$ and $v_i$, where $i$ is selected randomly with the following probability distribution:
$$\mathbb{P}(i = s) = \begin{cases} \deg(v_s, t-1)/(2t-1) & \text{if } 1 \le s \le t-1, \\ 1/(2t-1) & \text{if } s = t, \end{cases}$$
where $\deg(v_s, t-1)$ denotes the degree of $v_s$ in $G_1^{t-1}$ (loops are counted twice). In other words, at the $t$-th step of the process we send an edge $e$ from $v_t$ to a random vertex $v_i$, where the probability that a vertex is chosen is proportional to its current degree, counting $e$ as already contributing one to the degree of $v_t$.
For $m > 1$, the process $(G_m^t)_{t \ge 0}$ is defined similarly with the only difference that $m$ edges are added to $G_m^{t-1}$ to form $G_m^t$ (one at a time), counting previous edges as already contributing to the degree distribution. Equivalently, one can define the process $(G_m^t)_{t \ge 0}$ by considering the process $(G_1^t)_{t \ge 0}$ on a sequence $v'_1, v'_2, \ldots$ of vertices; the graph $G_m^t$ is formed from $G_1^{tm}$ by identifying vertices $v'_1, v'_2, \ldots, v'_m$ to form $v_1$, identifying vertices $v'_{m+1}, v'_{m+2}, \ldots, v'_{2m}$ to form $v_2$, and so on. Note that in this model $G_m^t$ is in general a multigraph, possibly with multiple edges between two vertices (if $m \ge 2$) and self-loops.
It was shown in [14] that for any $m \ge 1$ a.a.s. the degree distribution of $G_m^n$ follows a power law: the number of vertices with degree at least $k$ falls off as $(1+o(1)) c k^{-2} n$ for some explicit constant $c = c(m)$ and large $k \le n^{1/15}$. Also, in the case $m = 1$, each vertex sends an edge either to itself or to an earlier vertex, so $G_1^n$ is a forest with each component containing a single looped vertex. The expected number of components is then $\sum_{t=1}^{n} 1/(2t-1) \sim (1/2) \log n$ and, since events are independent, we derive that a.a.s. there are $(1/2 + o(1)) \log n$ components in $G_1^n$ by Chernoff's bound. In contrast, for the case $m \ge 2$ it is known that a.a.s. $G_m^n$ is connected and its diameter is $(1+o(1)) \log n / \log \log n$ [13].
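The inductive definition translates almost directly into a simulation. The sketch below assumes the Bollobás-Riordan convention described above: at step $t$ the new edge already counts one towards the degree of the new vertex, and the graph with $m$ edges per vertex is obtained by merging consecutive blocks of $m$ vertices of the one-edge process. The function names `pa_tree_targets` and `pa_graph` are hypothetical helpers introduced for illustration.

```python
import random

def pa_tree_targets(t_max, rng):
    """Simulate the one-edge process; targets[t-1] is the index chosen by vertex t."""
    # `endpoints` stores one entry per unit of degree, so a uniform draw from it
    # is a degree-proportional draw.
    endpoints = []
    targets = []
    for t in range(1, t_max + 1):
        endpoints.append(t)        # the new edge already adds 1 to deg(v_t)
        i = rng.choice(endpoints)  # v_s with prob deg/(2t-1), v_t itself with prob 1/(2t-1)
        endpoints.append(i)
        targets.append(i)
    return targets

def pa_graph(n, m, seed=0):
    """Edge list of the m-edge process on vertices 1..n (loops and multi-edges allowed)."""
    rng = random.Random(seed)
    targets = pa_tree_targets(n * m, rng)

    def merge(j):
        # vertices (k-1)m+1, ..., km of the one-edge process are identified into vertex k
        return (j - 1) // m + 1

    return [(merge(j), merge(i)) for j, i in enumerate(targets, start=1)]

edges = pa_graph(n=50, m=3, seed=1)
loops_in_tree = sum(1 for a, b in pa_graph(n=10_000, m=1, seed=1) if a == b)
```

For $m = 1$ the number of loops equals the number of components of the forest, so `loops_in_tree` gives a quick empirical look at the logarithmic growth mentioned above.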
Spatial Preferential Attachment.
The Spatial Preferential Attachment (SPA) model [1], designed as a model for the World Wide Web, combines geometry and preferential attachment, as its name suggests. Setting the SPA model apart is the incorporation of ‘spheres of influence’ to accomplish preferential attachment: the greater the degree of a vertex, the larger its sphere of influence, and hence the higher the likelihood of the vertex gaining more neighbours.
We now give a precise description of the SPA model. Let $S = [0,1]^m$ be the unit hypercube in $\mathbb{R}^m$, equipped with the torus metric derived from any of the $L_p$ norms. This means that for any two points $x$ and $y$ in $S$,
$$d(x, y) = \min \left\{ \|x - y + u\|_p : u \in \{-1, 0, 1\}^m \right\}.$$
The torus metric thus ‘wraps around’ the boundaries of the unit square; this metric was chosen to eliminate boundary effects. The parameters of the model consist of the link probability $p \in [0,1]$, and two positive constants $A_1$ and $A_2$, which, in order to avoid the resulting graph becoming too dense, must be chosen so that $pA_1 < 1$. The SPA model generates stochastic sequences of directed graphs $(G_t : t \ge 0)$, where $G_t = (V_t, E_t)$, and $V_t \subseteq S$. Let $\deg^-(v, t)$ be the in-degree of the vertex $v$ in $G_t$, and $\deg^+(v, t)$ its out-degree. We define the sphere of influence $S(v, t)$ of the vertex $v$ at time $t \ge 1$ to be the ball centered at $v$ with volume $|S(v, t)|$ defined as follows:
$$|S(v, t)| = \min \left\{ \frac{A_1 \deg^-(v, t) + A_2}{t},\ 1 \right\}.$$
The process begins at $t = 0$, with $G_0$ being the null graph. Time step $t$, $t \ge 1$, is defined to be the transition between $G_{t-1}$ and $G_t$. At the beginning of each time step $t$, a new vertex $v_t$ is chosen uniformly at random from $S$, and added to $V_{t-1}$ to create $V_t$. Next, independently, for each vertex $u \in V_{t-1}$ such that $v_t \in S(u, t-1)$, a directed link $(v_t, u)$ is created with probability $p$. Thus, the probability that a link $(v_t, u)$ is added in time-step $t$ equals $p\,|S(u, t-1)|$.
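A small simulation can make the time-step rule concrete. The sketch below is written under several assumptions that are not fixed by the text: it uses the $L_\infty$ torus metric (so a ball of a given volume is a cube), takes the sphere-of-influence volume at time $t$ to be $\min\{(A_1 \deg^- + A_2)/t,\ 1\}$, and picks arbitrary parameter values. The names `torus_dist_inf` and `spa` are illustrative.

```python
import random

def torus_dist_inf(x, y):
    """L_infinity distance on the unit hypercube with wrap-around."""
    return max(min(abs(a - b), 1 - abs(a - b)) for a, b in zip(x, y))

def spa(n, p=0.6, A1=1.0, A2=1.0, dim=2, seed=0):
    """Run n time steps; return (positions, directed edges (new, old))."""
    assert p * A1 < 1, "density condition on the parameters"
    rng = random.Random(seed)
    pos, indeg, edges = [], [], []
    for t in range(1, n + 1):
        v = tuple(rng.random() for _ in range(dim))
        new = len(pos)
        # Existing vertices are tested against their spheres at time t-1.
        # The loop is empty when t = 1, so (t - 1) is never zero below.
        for u in range(new):
            volume = min((A1 * indeg[u] + A2) / (t - 1), 1.0)
            radius = volume ** (1 / dim) / 2  # cube of side volume^(1/dim)
            if torus_dist_inf(v, pos[u]) <= radius and rng.random() < p:
                edges.append((new, u))
                indeg[u] += 1
        pos.append(v)
        indeg.append(0)
    return pos, edges

positions, links = spa(200, seed=1)
```

Because a vertex that gains in-links keeps a larger sphere, early well-linked vertices continue to attract nearby newcomers, which is the mechanism behind the spatial clusters.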
The SPA model produces scale-free networks, which exhibit many of the characteristics of real-life networks (see [1, 19]). In [31], it was shown that the SPA model gave the best fit, in terms of graph structure, for a series of social networks derived from Facebook. In [32], some properties of common neighbors were used to explore the underlying geometry of the SPA model and quantify vertex similarity based on distance in the space. However, the distribution of vertices in space was assumed to be uniform [32] and so in [33] non-uniform distributions were investigated which is clearly a more realistic setting.
2.3 Previous results on modularity
In this section we discuss known bounds for modularity in different random graph models.
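The isoperimetric number (known also as edge expansion) of a graph $G = (V, E)$ on $n$ vertices is defined as
$$i(G) = \min_{S \subseteq V,\ 0 < |S| \le n/2} \frac{e(S, V \setminus S)}{|S|},$$
where $e(S, V \setminus S)$ is the number of edges between the sets $S$ and $V \setminus S$. The following result was shown by McDiarmid and Skerman in [35]. Let $G$ be any $d$-regular graph on $n$ vertices. Then, the following useful upper bound on the modularity is almost immediate:
$$q^*(G) \le 1 - \frac{i(G)}{d}. \qquad (3)$$
Turning to random $d$-regular graphs, Bollobás in [11] showed that a.a.s. $i(\mathcal{G}_{n,d}) \ge (1 - \eta) d / 2$, where $0 < \eta < 1$ is such that $2^{4/d} < (1-\eta)^{1-\eta} (1+\eta)^{1+\eta}$ and so a.a.s.
$$q^*(\mathcal{G}_{n,d}) \le \frac{1 + \eta}{2}.$$
As a result, we get the first non-trivial upper bounds for $q^*(\mathcal{G}_{n,d})$ presented in Table 1 that hold a.a.s.
In [35], the bound (3) was slightly improved when the maximum size of parts in our partition is restricted. Formally, given $\delta > 0$, for a graph $G$ with $n$ vertices, they define $q^*_\delta(G)$ to be the maximum modularity of all partitions for $G$ such that each part has size at most $\delta n$. They show that for any $\varepsilon > 0$ there exists $\delta > 0$ such that any $d$-regular graph $G$ with at least $\delta^{-1}$ vertices satisfies
$$q^*_\delta(G) \le 1 - \frac{2\, i(G)}{d} + \varepsilon.$$
Again, using the result of Bollobás we get that there exists $\delta > 0$ such that
$$\eta + \varepsilon$$
serves as an upper bound that holds a.a.s. for $q^*_\delta(\mathcal{G}_{n,d})$; again, see Table 1 for numerical values for small values of $d$. It is straightforward to see that $\eta \to 0$ as $d \to \infty$ (see, for example, [11]) and so, in particular, $q^*_\delta(\mathcal{G}_{n,d})$ can be made arbitrarily small by taking $d$ large enough (and $\delta$ small enough). However, let us note that these upper bounds for $q^*_\delta(\mathcal{G}_{n,d})$, while useful, cannot be directly translated into any bound for $q^*(\mathcal{G}_{n,d})$.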
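The first two numerical columns of Table 1 can be recomputed with a few lines of code. The sketch below assumes that the constant $\eta$ in the result of Bollobás is the threshold of the condition $2^{4/d} < (1-\eta)^{1-\eta}(1+\eta)^{1+\eta}$, that the unrestricted bound is $(1+\eta)/2$, and that the restricted-part bound is $\eta$ (up to an arbitrarily small $\varepsilon$). These formulas are a reading of the surrounding text, and the function name `bollobas_eta` is hypothetical.

```python
import math

def bollobas_eta(d, tol=1e-12):
    """Threshold eta in (0, 1) where (1-eta)^(1-eta) (1+eta)^(1+eta) reaches 2^(4/d)."""
    if d < 3:
        raise ValueError("the threshold exists only for d >= 3")
    target = (4 / d) * math.log(2)

    def g(eta):
        # logarithm of (1-eta)^(1-eta) (1+eta)^(1+eta); increasing on (0, 1)
        return (1 - eta) * math.log(1 - eta) + (1 + eta) * math.log(1 + eta)

    lo, hi = 0.0, 1.0 - 1e-15
    while hi - lo > tol:          # plain bisection on the increasing function g
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return hi

rows = [(d, (1 + bollobas_eta(d)) / 2, bollobas_eta(d)) for d in range(3, 11)]
for d, unrestricted, restricted in rows:
    print(f"d={d:2d}  (1+eta)/2={unrestricted:.4f}  eta={restricted:.4f}")
```

Under these assumptions the computed values agree with the tabulated ones up to the last printed digit, where the table appears to round upwards.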
Investigating random -regular graphs continues in [36], a very recent paper. In fact, the numerical upper bound presented in Section 3.3, as well as the result in Theorem 4, are obtained independently there. Moreover, [36] investigates the class of graphs whose product of treewidth and maximum degree is much less than the number of edges. Their result shows, for example, that random planar graphs typically have modularity close to 1, which is another indication that clusters naturally emerge where geometry is included. Also, a particular case of their theorem shows that trees with maximum degree have asymptotic modularity one.
3 Random $d$-regular graphs
3.1 Pairing model
Instead of working directly in the uniform probability space of random regular graphs on $n$ vertices $\mathcal{G}_{n,d}$, we use the pairing model (also known as the configuration model) of random regular graphs, first introduced by Bollobás [10], which is described next. Suppose that $dn$ is even, as in the case of random regular graphs, and consider $dn$ points partitioned into $n$ labelled buckets $v_1, v_2, \ldots, v_n$ of $d$ points each. A pairing of these points is a perfect matching into $dn/2$ pairs. Given a pairing $P$, we may construct a multigraph $G(P)$, with loops allowed, as follows: the vertices are the buckets $v_1, v_2, \ldots, v_n$, and a pair $\{x, y\}$ in $P$ corresponds to an edge $v_i v_j$ in $G(P)$ if $x$ and $y$ are contained in the buckets $v_i$ and $v_j$, respectively. It is an easy fact that the probability of a random pairing corresponding to a given simple graph $G$ is independent of the graph, hence the restriction of the probability space of random pairings to simple graphs is precisely $\mathcal{G}_{n,d}$. Moreover, it is well known that a random pairing generates a simple graph with probability asymptotic to $e^{(1-d^2)/4}$ depending on $d$, so that any event holding a.a.s. over the probability space of random pairings also holds a.a.s. over the corresponding space $\mathcal{G}_{n,d}$. For this reason, asymptotic results over random pairings suffice for our purposes. For more information on this model, see, for example, the survey of Wormald [48].
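The pairing model is also a practical sampler. The following sketch draws a uniform random pairing by shuffling the points and matching consecutive ones, and then rejects pairings with loops or multiple edges; since every simple graph corresponds to the same number of pairings, the accepted graph is uniform over simple regular graphs. Function names are illustrative, and rejection sampling is only sensible for small degrees because the acceptance probability decays quickly with the degree.

```python
import random

def random_pairing_multigraph(n, d, rng):
    """Edge list of the multigraph obtained from a uniform random pairing."""
    assert (n * d) % 2 == 0, "the total number of points must be even"
    # Each point is recorded by the label of its bucket (vertex).
    points = [v for v in range(n) for _ in range(d)]
    rng.shuffle(points)
    # Consecutive points of the shuffled sequence form the pairs.
    return [(points[i], points[i + 1]) for i in range(0, n * d, 2)]

def is_simple(edges):
    """True if the edge list has no loops and no repeated edges."""
    seen = set()
    for u, v in edges:
        key = (min(u, v), max(u, v))
        if u == v or key in seen:
            return False
        seen.add(key)
    return True

def random_regular_graph(n, d, seed=0):
    """Rejection sampling: repeat until the pairing yields a simple graph."""
    rng = random.Random(seed)
    while True:
        edges = random_pairing_multigraph(n, d, rng)
        if is_simple(edges):
            return edges

cubic = random_regular_graph(20, 3, seed=1)
```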
3.2 Lower bound
For completeness, let us briefly discuss the following known lower bound for the modularity of . It is known that a.a.s. for any , is Hamiltonian. As pointed out in [35], one can use this fact to partition the graph such that it breaks the cycle into paths of length at most . For this particular partition the edge contribution is and the degree tax is . It follows then that a.a.s.
[TABLE]
(Our more general lower bound that holds for graphs with average degree implies the same—see Theorem 6 for more.) Whereas this trivial lower bound could be sharp for it is definitely not the case for large . As pointed out in [36], there exists a universal constant such that a.a.s. .
3.3 Numerical upper bound
The following straightforward lemma is useful for obtaining upper bounds for modularity of random $d$-regular graphs.
Lemma 1
Consider any -regular graph on vertices . If no subset of of size induces edges with , then .
*Proof. * For a given partition of the vertex set , let and ; that is, set has vertices and induces edges. Then, taking into account the fact that for any we have , we can rewrite (1) as
[TABLE]
As it is simply a weighted average, would imply that there exists some set of size that induces edges, and . So, the proof of the lemma is finished.
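The weighted-average step can be checked on a small example. The sketch below uses one convenient parametrisation, which is an assumption and may differ from the notation of the lemma: part $A_i$ contains a fraction $u_i$ of the vertices and a fraction $\alpha_i$ of its degree sum stays inside the part, so that for a $d$-regular graph the modularity of the partition equals $\sum_i u_i(\alpha_i - u_i)$. Both forms are evaluated exactly on the 3-dimensional cube graph.

```python
from fractions import Fraction

def modularity_direct(n, d, edges, partition):
    """Edge contribution minus degree tax for a d-regular graph on n vertices."""
    m = Fraction(n * d, 2)
    q = Fraction(0)
    for part in partition:
        inside = sum(1 for a, b in edges if a in part and b in part)
        q += inside / m - Fraction(len(part) * d, n * d) ** 2
    return q

def modularity_weighted(n, d, edges, partition):
    """The same quantity written as a weighted average of (alpha_i - u_i)."""
    q = Fraction(0)
    for part in partition:
        u_i = Fraction(len(part), n)
        inside = sum(1 for a, b in edges if a in part and b in part)
        alpha_i = Fraction(2 * inside, d * len(part))
        q += u_i * (alpha_i - u_i)
    return q

# Cube graph: vertices 0..7, edges join labels that differ in exactly one bit.
cube = [(a, a ^ (1 << b)) for a in range(8) for b in range(3) if a < a ^ (1 << b)]
faces = [{0, 1, 2, 3}, {4, 5, 6, 7}]
q1 = modularity_direct(8, 3, cube, faces)
q2 = modularity_weighted(8, 3, cube, faces)
```

Since the weights $u_i$ sum to one, a partition with modularity at least some value forces at least one part with $\alpha_i - u_i$ at least that value, which is the pigeonhole step used in the proof.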
To formulate the main theorem of this section, we need the following notation. For a given , let
[TABLE]
It will be clear once we establish the connection between function and random -regular graphs, but it is straightforward to see that for any we have (more precisely, its limit value) and for some . Indeed, for example note that . Also, it is easy to see that is continuous on .
Finally, let be largest value of such that ; in particular, for any .
Theorem 2
Let and be an arbitrarily small constant. Then a.a.s.
[TABLE]
where
[TABLE]
As usual, see Table 1 for numerical values for small values of .
*Proof. * We prove below that the following property holds a.a.s. for . No set of size (for any ) induces a graph with edges, where and is defined as above. Then Theorem 2 follows directly from Lemma 1.
Consider for some and let be an arbitrarily small constant. Our goal is to show that the expected number of sets such that and with is . (For simplicity, we do not round numbers that are supposed to be integers either up or down; this is justified since these rounding errors are negligible to the asymptotic calculations we will make.) This, together with the first moment principle, implies that a.a.s. no such set exists for any and (as there are possible sizes of and possible values of that we need to consider).
Let and be any functions of such that and . Let be the expected number of sets such that and . Using the pairing model, it is clear that
[TABLE]
where is the number of pairings of vertices, that is,
[TABLE]
(Each time we deal with pairings, is assumed to be an even number.) After simplification we get
[TABLE]
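For small parameters the first-moment count in the pairing model can be evaluated exactly, which is a useful sanity check before passing to asymptotics. The sketch below is one standard way to count pairings in which a fixed set of $s$ vertices induces exactly $k$ edges: choose the points matched inside the set, match them, match the remaining points of the set to outside points, and match what is left outside. The names `pairings` and `expected_sets` and the parameters `s`, `k` are illustrative and need not coincide with the notation of the proof.

```python
from fractions import Fraction
from math import comb, factorial

def pairings(i):
    """Number of perfect matchings of i labelled points (i even)."""
    return factorial(i) // (factorial(i // 2) * 2 ** (i // 2))

def expected_sets(n, d, s, k):
    """Expected number of s-vertex sets inducing exactly k edges in a random pairing."""
    inside, outside = s * d, (n - s) * d
    cross = inside - 2 * k      # points of the set matched to points outside it
    rest = outside - cross      # outside points matched among themselves
    if cross < 0 or rest < 0 or rest % 2:
        return Fraction(0)
    ways = (comb(inside, 2 * k) * pairings(2 * k)      # internal pairs
            * comb(outside, cross) * factorial(cross)  # pairs leaving the set
            * pairings(rest))                          # pairs fully outside
    return comb(n, s) * Fraction(ways, pairings(n * d))

n, d, s = 8, 3, 3
by_k = [expected_sets(n, d, s, k) for k in range(s * d // 2 + 1)]
```

Every set induces exactly one number of edges, so summing the expectations over `k` must return the total number of sets of that size, which the test below verifies.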
Using Stirling’s formula () and focusing on the exponential part we obtain
[TABLE]
where is defined in (5). It follows immediately from the definition of that is bounded away from zero for any pairs of integers and under consideration, and so for any pair we get and the proof is finished.
3.4 Explicit but weaker upper bound
Theorem 2 provides an upper bound that can be easily numerically computed for a given . Next, we present a slightly weaker but an explicit bound that can be obtained using the expansion properties of random -regular graphs that follow from their eigenvalues. In particular, it will imply that a.a.s. and so as .
The adjacency matrix $A = A(G)$ of a given $d$-regular graph $G$ with $n$ vertices, is an $n \times n$ real and symmetric matrix. Thus, the matrix $A$ has $n$ real eigenvalues which we denote by $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$. It is known that certain properties of a $d$-regular graph are reflected in its spectrum but, since we focus on expansion properties, we are particularly interested in the following quantity: $\lambda = \lambda(G) = \max(|\lambda_2|, |\lambda_n|)$. In words, $\lambda$ is the largest absolute value of an eigenvalue other than $\lambda_1 = d$ (for more details, see the general survey [29] about expanders, or [6], Chapter 9).
The value of $\lambda$ for random $d$-regular graphs has been studied extensively. A major result due to Friedman [26] is the following:
Lemma 3 ([26])
For every fixed $\varepsilon > 0$ and for $G \in \mathcal{G}_{n,d}$, a.a.s.
$$\lambda(G) \le 2\sqrt{d-1} + \varepsilon.$$
We prove the following theorem.
Theorem 4
Let . Then, for any -regular graph we have
[TABLE]
In particular, for random -regular graphs a.a.s.
[TABLE]
*Proof. * The second part of the theorem follows from Lemma 3, as for a random -regular graphs a.a.s. for sufficiently small . Let us now show that .
The number of edges between sets and is expected to be close to the expected number of edges between and in a random graph of edge density , namely . A small (or large spectral gap) implies that this deviation is small. Namely, for our purpose here we will use the following lower estimate for
[TABLE]
for all . This is proved in [5], see also [6]. Using this inequality we get immediately that for any of size we have
[TABLE]
So, a.a.s., in no set of size induces a graph with more than edges, where . Now the desired upper bound follows from Lemma 1.
We also tried several other ideas in an attempt to obtain a better upper bound. Unfortunately, they did not lead to improvements, so we defer the discussion of these ideas to the Appendix.
4 Lower bounds in terms of average degree
In this section, we obtain some general lower bounds for modularity. In particular, the obtained bounds are useful for graphs with bounded average degree. In Section 5, we apply these results to obtain a lower bound for the modularity of preferential attachment model (see Theorem 10).
Let us start with the analysis of trees. It was proven in [38] that trees with maximum degree have asymptotic modularity 1. We generalize this result in two ways: first, we relax the condition on maximum degree; second, we allow our graphs to be disconnected, that is, we consider forests instead of trees. We prove the following theorem.
Theorem 5
Let be a sequence of forests, where has non-isolated vertices and the maximum degree . Then the following lower bound holds
[TABLE]
This theorem implies that if the maximum degree , then . Note that it is also known that the asymptotic modularity of trees with maximum degree is strictly less than 1 [38]. Hence, the assumption cannot be eliminated.
We further generalize the above theorem to all connected graphs and prove the following result.
Theorem 6
Let be a sequence graphs, where is a connected graph on vertices with the maximum degree and the average degree . Then
[TABLE]
The theorem implies that if for some constant and , then . Note that for Theorem 6 looks similar to Theorem 5. However, there are two important differences: Theorem 6 is not restricted to forests, but requires graphs to be connected.
Before we prove both theorems let us introduce some notation and the main lemma which we will use.
Definition 7
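Let $G$ be a graph and let $A$ be any subset of its vertex set $V(G)$. We define $\mathrm{vol}(A) = \sum_{v \in A} \deg(v)$, where $\deg(v)$ is the degree of a vertex $v$ in $G$. We also use the notation $\mathrm{vol}(H) = \mathrm{vol}(V(H))$, where $H$ is a subgraph of $G$.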
Lemma 8
For every connected graph with maximum degree and every there exists a partition of the vertex set into connected parts such that . for all .
*Proof. * For a graph let us consider its spanning tree and decompose it, by removing some edges, into subtrees such that for each . The way we do this decomposition is in a sense similar to the algorithm greedy-decompose≤h from [38]. Namely, we first redefine a notion of a centroid edge of a subtree of the initial tree .
Definition 9
The removal of any edge $e$ from a tree $T$ splits $T$ into two parts $T_1(e)$ and $T_2(e)$. A centroid edge of $T$ is an edge $e$ chosen to maximize $\min\{\mathrm{vol}(T_1(e)), \mathrm{vol}(T_2(e))\}$.
Our algorithm is the following: as long as our forest contains a tree with , it finds a centroid edge of and removes it. After this decomposition, we obtain trees and we set for .
Obviously, for each we have . Let us show that we also have . Consider any step of our decomposition procedure. We take a tree with , remove its centroid edge , and obtain two trees and . Without loss of generality we may assume that . Let , . Let be the vertex incident with and belonging to . For every edge incident with , for the part of not containing we have (otherwise is not a centroid edge). As has degree at most , we have (at most for each of the parts plus the degree of itself). So, . This proves that and completes the proof of the lemma.
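The decomposition procedure can be sketched as follows. The code assumes that the volume of a part is its degree sum in the original graph, that a tree is split while its volume is at least a threshold `h` (a hypothetical parameter name), and that `h` exceeds the maximum degree; the exact constants of Lemma 8 are not recoverable from the text, so they are not hard-coded. A direct reading of the centroid-edge argument suggests that every part produced by a split has volume at least $h/\Delta - 1$, which the accompanying check uses.

```python
from collections import deque

def greedy_decompose(adj, h):
    """Split a spanning tree of a connected graph into parts of volume < h.

    adj: dict mapping each vertex to the list of its neighbours.
    Returns a list of vertex sets, each connected in the spanning tree.
    """
    deg = {v: len(nb) for v, nb in adj.items()}
    root = next(iter(adj))
    tree = {v: set() for v in adj}          # BFS spanning tree
    seen, queue = {root}, deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                tree[u].add(w)
                tree[w].add(u)
                queue.append(w)

    def component(start):
        """Vertices of the current tree containing `start`, parents before children."""
        parent, order = {start: None}, [start]
        for u in order:                     # the list grows while we scan it
            for w in tree[u]:
                if w not in parent:
                    parent[w] = u
                    order.append(w)
        return order, parent

    parts, todo = [], [root]
    while todo:
        order, parent = component(todo.pop())
        total = sum(deg[v] for v in order)
        if total < h or len(order) == 1:
            parts.append(set(order))
            continue
        sub = {v: deg[v] for v in order}    # volume of the subtree below each vertex
        for v in reversed(order[1:]):
            sub[parent[v]] += sub[v]
        # Centroid edge: the edge (v, parent[v]) maximising the smaller side.
        best = max(order[1:], key=lambda v: min(sub[v], total - sub[v]))
        tree[best].discard(parent[best])
        tree[parent[best]].discard(best)
        todo += [best, parent[best]]
    return parts

def grid(k):
    """Adjacency lists of the k x k grid graph (used only as a test input)."""
    adj = {(i, j): [] for i in range(k) for j in range(k)}
    for (i, j) in adj:
        for di, dj in ((1, 0), (0, 1)):
            if (i + di, j + dj) in adj:
                adj[(i, j)].append((i + di, j + dj))
                adj[(i + di, j + dj)].append((i, j))
    return adj

G = grid(5)
parts = greedy_decompose(G, h=24)
volumes = [sum(len(G[v]) for v in part) for part in parts]
```

Recomputing each component from scratch keeps the sketch short; an implementation aimed at large graphs would maintain subtree volumes incrementally.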
Now, we are ready to prove Theorem 6 and Theorem 5.
*Proof. * (Proof of Theorem 6.) Let us take and partition into according to Lemma 8. To obtain the desired lower bound, we estimate for . We first deal with the edge contribution. As stated in Lemma 8, we have for all . Also, . Therefore, . The number of intracluster edges in the spanning tree is , and clearly this is the lower bound for . Finally,
[TABLE]
It remains to estimate the degree tax. Recall that a for all and . Therefore,
[TABLE]
and so the proof is finished.
*Proof. * (Proof of Theorem 5.) This proof is similar to the previous one. Let us fix . The idea is to partition into such that for each : and a subgraph induced by is a tree. Our forest may already contain trees with . Let us denote the corresponding vertex sets by . We decompose the remaining trees according to Lemma 8 (applied to each tree separately) into .
Now we have the partition of the vertex set . In order to estimate we first consider the edge contribution. According to Lemma 8, for . Therefore, it is easy to show that for each intercluster edge we can find at least intracluster edges. Hence,
[TABLE]
It remains to estimate the degree tax. Recall that for and . Therefore,
[TABLE]
and so the proof is finished.
5 The Preferential Attachment model
5.1 Lower bound
The following theorem easily follows from the results of the previous section.
Theorem 10
For any a.a.s.
[TABLE]
*Proof. * Let . It is well-known that a.a.s. (see, e.g., [22] and Theorem 17 in [12]). Also, clearly the average degree of is at most (it can be less due to the removal of loops and multiple edges). In addition, for a.a.s. is connected [13]. So, the statement of Theorem 10 follows directly from Theorems 5 and 6.
We would like to remark that the obtained lower bound holds for many other models of complex networks. For example, it holds for the Random Apollonian Network [50] (in this case ) or for the Buckley-Osthus model [17] (with slightly corrected error term).
As in the case of random $d$-regular graphs, it is natural to conjecture that the above lower bound is not sharp. Let and consider the following partition: , . Using martingales, it is possible to show that a.a.s. (and so ); see Lemma 11 below. Clearly, and so a.a.s. and . The edge contribution and the degree tax are then both asymptotic to . Not surprisingly, such a partition cannot be used to get a non-trivial lower bound for the modularity but, similarly to the situation for random $d$-regular graphs, we may try to use it as a starting point to get a slightly better partition. The basic idea is very simple: one can start with a given partition (or partition the vertices randomly into two classes), and if a vertex has more neighbours in the other class than in its own, then we randomly decide whether to shift it to the other class or leave it where it is. This approach proved to be useful to get a bound for the bisection width in random $d$-regular graphs [3] which, in turn, yields a lower bound for the modularity [36]. In the proceedings version of this paper [44] we promised to investigate this approach. However, the following turns out to be slightly easier to do.
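We will use the following standard martingale tool: the Hoeffding-Azuma inequality; for more details, see, for example, [30]. Let $X_0, X_1, \ldots$ be a martingale. Suppose that there exist $c_1, c_2, \ldots, c_n > 0$ such that $|X_k - X_{k-1}| \le c_k$ for each $k \le n$. Then, for every $x > 0$,
$$\mathbb{P}(X_n \ge \mathbb{E} X_n + x) \le \exp \left( - \frac{x^2}{2 \sum_{k=1}^{n} c_k^2} \right). \qquad (8)$$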
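To get a feeling for how tight such a tail bound is, one can compare it with an exact computation for the simplest martingale with bounded differences, the symmetric random walk with steps of plus or minus one. The sketch assumes the standard one-sided form of the inequality, with bound $\exp(-x^2 / (2 \sum_k c_k^2))$ and all $c_k = 1$; the function names are illustrative.

```python
from math import comb, exp

def walk_tail(n, x):
    """Exact P(S_n >= x) for the simple symmetric random walk after n steps."""
    # S_n = 2H - n, where H counts the +1 steps and is Binomial(n, 1/2).
    return sum(comb(n, h) for h in range(n + 1) if 2 * h - n >= x) / 2 ** n

def azuma_bound(n, x, c=1.0):
    """One-sided Hoeffding-Azuma bound with all increments bounded by c."""
    return exp(-x * x / (2 * n * c * c))

comparison = [(x, walk_tail(100, x), azuma_bound(100, x)) for x in (10, 20, 30, 40)]
```

The bound is loose for moderate deviations but captures the Gaussian-type decay, which is all that the concentration arguments require.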
The Hoeffding-Azuma inequality can be generalized to include random variables close to martingales. One of our proofs, proof of Lemma 11, will use the supermartingale method of Pittel et al. [46], as described in [49, Corollary 4.1]. Let be a sequence of random variables. Suppose that there exist and such that
[TABLE]
for each . Then, for every ,
[TABLE]
Let us now prove the following lemma.
Lemma 11
Fix any constant and . The following property holds a.a.s. for . For any , ,
[TABLE]
where .
*Proof. * In view of the identification between the models (on the vertex set ) and (on the vertex set ), it will be useful to investigate the following random variable instead of : for , let
[TABLE]
Clearly, . It follows that . Moreover, for ,
[TABLE]
The conditional expectation is given by
[TABLE]
Taking expectation again, we derive that
[TABLE]
Hence, it follows that
[TABLE]
In order to transform into something close to a martingale (to be able to apply the generalized Azuma-Hoeffding inequality (9)), we set for
[TABLE]
(note that ) and use the following stopping time
[TABLE]
Indeed, we have for
[TABLE]
provided , and as . Let denote . We apply the generalized Azuma-Hoeffding inequality (9) to the sequence , with , and , to conclude that a.a.s. for all such that
[TABLE]
To complete the proof we need to show that a.a.s. . The events asserted by the equation hold a.a.s. up until time , as shown above. Thus, in particular, a.a.s.
[TABLE]
which implies that a.a.s. In particular, it follows that a.a.s., for any , . The lower bound can be obtained by applying the same argument symmetrically to , and so the proof is finished.
Now, we are ready to prove the following, stronger, lower bound.
Theorem 12
A.a.s.:
[TABLE]
That is, a.a.s.
[TABLE]
In particular, a.a.s. .
Before we prove the theorem, let us present numerical values for a few values of : is the lower bound following from Theorem 10 and is the lower bound from Theorem 12; see Table 2. Large degree tax hidden in makes this bound weaker for small values of ; for larger values is better than .
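The surviving entries of Table 2 are consistent with two simple closed forms, which the sketch below evaluates. This is an observation about the tabulated numbers under stated assumptions, not a restatement of the theorems: the first row matches $1/m$ (the leading term one would expect from an average degree of at most $2m$), and the second row matches $\binom{2k}{k} 2^{-(2k+1)}$ with $k = \lfloor m/2 \rfloor$. Error terms are ignored and both function names are hypothetical.

```python
from math import comb

def bound_average_degree(m):
    """Assumed leading term of the average-degree lower bound: 1/m."""
    return 1 / m

def bound_colouring(m):
    """Assumed leading term of the two-colour lower bound, with k = floor(m/2)."""
    k = m // 2
    return comb(2 * k, k) / 2 ** (2 * k + 1)

for m in (2, 4, 6, 7, 8, 9, 10, 100, 1000):
    print(m, round(bound_average_degree(m), 4), round(bound_colouring(m), 4))
```

Under these assumptions the second expression overtakes the first from $m = 7$ onwards, in line with the remark that the stronger theorem pays off only for larger values.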
*Proof. * Let be any constant. Let us start with generating ; vertices from are coloured red and vertices from are coloured blue. We will continue generating , colouring vertices red or blue (one by one, as they are introduced in the process), depending on how many of their neighbours are of each colour. We want to control the sum of degrees of vertices in each colour; that is, the following random variable
[TABLE]
The colouring process depends on the parity of . If is even, we colour vertex red if more than neighbours (in ) are red. If the number of red neighbours is precisely , we colour it red with probability , where will be determined soon. Otherwise, is coloured blue. If is odd, the process is slightly different. If the number of red neighbours is more than , we colour it red. If it is or , we colour it red with probability and, respectively , where . Otherwise, is coloured blue. The arguments for both cases are almost identical so we assume now that is even; it will be clear what needs to be adjusted for odd value of . In both situations, our hope is that the two graphs, induced by red and blue vertices, will be dense.
It follows from Lemma 11 that a.a.s. , so we may assume that this inequality holds. This time we use the following stopping time
[TABLE]
Arguing as in the previous lemma, we get that
[TABLE]
provided that . Since
[TABLE]
and
[TABLE]
we get that
[TABLE]
Since , we can adjust so that ; that is, the sequence of random variables is a martingale. It follows from the classic Hoeffding-Azuma inequality (8), applied to with and , that a.a.s., for each ,
[TABLE]
The rest of the proof is straightforward. We partition the vertex set of into red and blue vertices. The degree tax is a.a.s.
[TABLE]
It remains to estimate the edge contribution. Clearly, the process guarantees that at least half of the edges are within the two clusters. However, we will do slightly better than that. For any , with probability asymptotic to , at any point of the process we add edges to some cluster; edges are added with probability asymptotic to . Hence, the expected number of edges added to some cluster is asymptotic to
[TABLE]
The expected edge contribution is then asymptotic to
[TABLE]
Finally, one can bound the edge contribution (independently, from above and from below) by the sum of independent random variables, and use the Chernoff bound to obtain concentration. It follows that a.a.s.
[TABLE]
and the result holds after taking sufficiently slowly.
Finally, some elementary calculations show that for any , we have
[TABLE]
see, for example, [15]. (More general and precise bounds can be found in [21].) It follows that a.a.s. , and the proof is finished.
5.2 Upper bound
Recall that the edge expansion of a graph is defined as follows:
[TABLE]
In [37] it was shown that a.a.s. , provided that . In other words, for any we have that a.a.s.
[TABLE]
Using this observation one can easily obtain a non-trivial upper bound for .
Let be an arbitrarily small constant. Consider any partition of the vertex set . If for some , then the degree tax is at least
[TABLE]
On the other hand, if for all , then a.a.s. the number of edges between parts is equal to
[TABLE]
and so the edge contribution is a.a.s. at most
[TABLE]
for any . Therefore, the following result holds.
Theorem 13
For any a.a.s.
[TABLE]
Moreover, for any a.a.s.
[TABLE]
Some stronger expansion properties were recently obtained in [27]. However, although they could presumably be used to slightly improve the upper bound on (for specific values of ), we do not know how to show that as . Perhaps as in the case of random d-regular graphs?
6 The Spatial Preferential Attachment model
Consider , a graph generated by the SPA model. As modularity is defined for undirected graphs, we consider , the graph obtained from by replacing each directed edge by an undirected edge . (As edges in always go from ‘younger’ to ‘older’ vertices, no multigraph can arise; is a simple graph.) Let us recall that where is the unit hypercube . We will use the geometry of the model to obtain a suitable partition that yields high modularity of . The following properties (proved many times; see, for example, [1, 19]) are the only properties of the model that will be used in the proof: a.a.s. for every pair such that we have that
[TABLE]
and . Since we aim for a result that holds a.a.s., we may assume in the proof below that these properties hold deterministically. Now, we are ready to state our result for the SPA model.
Theorem 14
Let , , and suppose that . Then, the following holds a.a.s.:
[TABLE]
*Proof.* Let . Note that for some that depends on the parameters of the model. Let us partition the space into parts as follows: for each integer ,
[TABLE]
This partition of naturally gives us a partition of the vertex set: for each , . We will show that a.a.s.
[TABLE]
which will finish the proof as and always .
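To make the geometric partition concrete, here is a minimal Python sketch. It assumes, as the later references to cutting hyperplanes and strips suggest, that the parts are slabs of equal width along the first coordinate of the unit hypercube; the function name `slab_partition` and the parameter `k` (the number of parts) are hypothetical and chosen only for this example.

```python
import math


def slab_partition(points, k):
    """Group point indices into k slabs of width 1/k along coordinate 0.

    A point x is placed in part floor(k * x[0]); the boundary value
    x[0] == 1.0 is assigned to the last slab.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    parts = [[] for _ in range(k)]
    for idx, x in enumerate(points):
        if not 0.0 <= x[0] <= 1.0:
            raise ValueError("points must lie in the unit hypercube")
        i = min(int(math.floor(k * x[0])), k - 1)
        parts[i].append(idx)
    return parts
```

Only vertices lying close to one of the hyperplanes between consecutive slabs can be endpoints of edges between different parts, which is what the counting argument in the proof exploits.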
First, let us start with estimating the edge contribution. In order to do that, we need to estimate the number of edges between different parts. So, let us focus on any part . We will investigate how many bad edges in connect vertices outside of with vertices inside by counting (independently) bad edges directed to vertices of similar age. (Note that for convenience we consider here directed graph instead of .) For a given integer such that , let
[TABLE]
It is clear that and are partitions of the vertex set and the edge set (both in and ), respectively, and so is a partition of the bad edges we want to count. It remains to estimate the size of for a given value of .
Fix , and let us concentrate on any . It follows from (10) that the maximum volume of a sphere of influence of is (during the whole process) and so the maximum radius of influence of is . Therefore, if there is an edge in the cut directed to , then must fall not only into but also into a strip within distance from one of the two cutting hyperplanes separating from the neighbouring parts; that is, or . Since , we get that
[TABLE]
vertices of are expected to appear in these two strips during the whole process. Hence, it follows from the Chernoff bound that with probability at least there are vertices in these strips at the end of the process. Note that the exponent of has changed from to in order to guarantee that the claimed upper bound is at least , which is required for the bound to hold with the desired probability. Using (10) one more time, we get that all vertices introduced in this time period have (final) in-degree . Hence, there are
[TABLE]
edges in the cut with probability at least and so this property holds a.a.s. for all parts and all values of . It follows that a.a.s. the number of bad edges involving is at most
[TABLE]
Finally, we get an estimate for the edge contribution: a.a.s.
[TABLE]
It remains to estimate the degree tax. In order to do that we need to, for a given under consideration, estimate in ; that is, in . As before, we partition the vertices of into sets containing vertices of similar age. Let be the largest integer such that . Clearly, . This time, for a given integer such that , let
[TABLE]
and our goal is to estimate the size of . The expected number of vertices of that fall into is and it follows from Chernoff’s bound that with probability at least it is . Using (10) for the last time, we get that all vertices introduced in this time period have (final) in-degree , provided ; and for . It follows that with the desired probability
[TABLE]
and so it holds a.a.s. for all . Similarly, using Chernoff’s bound and (11) we get that a.a.s. for all we have and so
[TABLE]
Finally, we are able to get an estimate for the degree tax in : a.a.s.
[TABLE]
and the proof is finished.
7 Discussion and future research
In this paper, we investigated modularity and provided precise theoretical bounds for several random graph models, such as random d-regular graphs, constant average degree graphs, preferential attachment and SPA models. However, there are plenty of directions for future research. For example, for the preferential attachment model we expect that . However, even the statement that as remains unproven.
Also, in this paper we studied the most popular version of modularity, while other definitions (suitable for some particular clustering problems) have been proposed in the literature (see the discussion in [24]). For example, it was proposed to multiply the degree tax by a resolution parameter . Note that most of our results can be easily extended to such a definition, as we estimate the edge contribution and the degree tax separately. Also, the Erdős–Rényi random graph model can be used as a null model (instead of the pairing model) to compute the degree tax. This version of modularity is much easier to analyze, but such a null model cannot describe real networks well, since it has an unrealistic Poisson degree distribution.
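The split into edge contribution and degree tax can be illustrated with a short Python sketch. It uses the standard form of modularity with the pairing (configuration) null model, in which a part with volume vol(A) pays a tax of (vol(A) / 2|E|)^2. The function name `modularity`, the argument `part_of`, and the parameter `gamma` for the resolution parameter are hypothetical names chosen for this example.

```python
from collections import defaultdict


def modularity(edges, part_of, gamma=1.0):
    """Return (edge_contribution, degree_tax, modularity) for a partition.

    edges   : list of undirected edges (u, v); parallel edges are allowed
    part_of : dict mapping each vertex to the label of its part
    gamma   : resolution parameter multiplying the degree tax
    """
    m = len(edges)
    if m == 0:
        raise ValueError("the graph must have at least one edge")
    inside = defaultdict(int)  # number of edges within each part
    volume = defaultdict(int)  # sum of degrees within each part
    for u, v in edges:
        volume[part_of[u]] += 1
        volume[part_of[v]] += 1
        if part_of[u] == part_of[v]:
            inside[part_of[u]] += 1
    edge_contribution = sum(inside.values()) / m
    degree_tax = sum(d * d for d in volume.values()) / (4 * m * m)
    return edge_contribution, degree_tax, edge_contribution - gamma * degree_tax
```

For two triangles joined by a single edge and split into the two triangles, the edge contribution is 6/7 and the degree tax is 1/2, so the modularity of this partition is 5/14 when `gamma` equals 1. Because the two terms are computed separately, changing `gamma` only rescales the second one.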
Finally, we would like to note that there is another model, which, similarly to SPA, combines geometry and preferential attachment [23]. It would be interesting to investigate the modularity for this model and we expect that its modularity tends to 1 (as for the SPA model). However, these two models are different and our result does not imply anything for the other model.
Acknowledgements
This work is supported by Russian Science Foundation (grant number 16-11-10014), NSERC, The Tutte Institute for Mathematics and Computing, and Ryerson University.
Appendix
7.1 Random d-regular graphs, some ideas for an upper bound
Idea 1: Recall that in order to get the current best upper bound we showed that a.a.s. no set of size induces more than edges. As a result the largest value of in (4) is at most . For example, for the optimal choice that maximizes is: , , and so as reported in Table 1. However, clearly it is impossible to partition a graph precisely into parts of size . It is possible to show that the following upper bound holds, which is clearly not larger than the previous one:
[TABLE]
Unfortunately, this maximum value is achieved for (which corresponds to parts of size roughly ), and no improvement is achieved: . The reason this idea fails is that the optimal value of is small, so rounding to the nearest integer for does not improve the bound much.
Idea 2: Let us look at (4) again but this time let us order the terms so that
[TABLE]
It follows that
[TABLE]
It is slightly more tedious than before, but one can get an improvement by considering (ordered) disjoint pairs of vertices with , , , , and . Unfortunately, this idea also does not provide any reasonable improvement. For , the expected number of pairs of sets for the following vector and, again, no substantial improvement is achieved: .
Idea 3: As before, let us concentrate on the case but similar ideas can be used for any integer . We can try to use the fact that can be constructed by putting a random matching on the vertices of a Hamiltonian cycle. Let us fix any set of vertices of size that induces components (paths) by restricting only to edges of the Hamiltonian cycle. Each such set can be represented by the following triple: vertex , vector , and vector : starts some path, is the number of vertices on path , is the number of vertices not in the set and right after path . The number of such sets is at most . The number of edges within this set that are part of the Hamiltonian cycle is . Hence, in order for the set to induce edges, edges must be coming from the matching.
The hope was that for small values of , there are only a few sets to consider. On the other hand, if is closer to , then fewer edges are “for free” (edges of the Hamiltonian cycle). Unfortunately, this idea again does not lead to any substantial improvement. Concentrating on , , , and tuning , the expected number of such sets tends to infinity as .
Conclusion: The lack of improvement is disappointing but perhaps should not be surprising. Looking at one or two parts of a partition maximizing is not enough (local property). Having one large term in (4) might be possible but having all of them to be large perhaps is not. So in order to improve the upper bound, one needs to consider all parts at the same time (global property).
References
- [1] W. Aiello, A. Bonato, C. Cooper, J. Janssen, and P. Prałat, A spatial web graph model with local influence regions, Internet Mathematics 5 (2009), 175–196.
- [2] R. Albert and A.-L. Barabási, Statistical mechanics of complex networks, Reviews of Modern Physics 74 (2002), 47–97.
- [3] N. Alon, On the edge-expansion of graphs, Combinatorics, Probability and Computing 6 (1997), 145–152.
- [4] N. Alon and F.R.K. Chung, Explicit construction of linear sized tolerant networks, Discrete Math. 72 (1988), 15–19.
- [5] N. Alon and V.D. Milman, $\lambda_1$, isoperimetric inequalities for graphs and superconcentrators, J. Combinatorial Theory, Ser. B 38 (1985), 73–88.
- [6] N. Alon and J.H. Spencer, The Probabilistic Method, Wiley, 1992 (Second Edition, 2000).
- [7] S. Bansal, S. Khandelwal, and L.A. Meyers, Exploring biological network structure with clustered random networks, BMC Bioinformatics 10:405 (2009).
- [8] A.-L. Barabási and R. Albert, Emergence of scaling in random networks, Science 286 (1999), 509–512.
