Mechanisms for tuning clustering and degree-correlations in directed networks
G. Kashyap, G. Ambika

TL;DR
This paper introduces degree-preserving rewiring methods to tune clustering and degree-correlations in directed networks, providing null-models to analyze structural properties and their dependence on topology and link-density.
Contribution
The authors develop novel rewiring mechanisms for directed networks that independently control clustering and degree-correlations, revealing their structural relationships and topological dependencies.
Findings
Structural relationships in clustering are topology-independent.
Degree-correlations depend on network topology and are unaffected by link-density at large rewiring steps.
Rewiring mechanisms serve as tools for structural analysis and constraint identification.
G. Kashyap
G. Ambika
Indian Institute of Science Education and Research, Pune, India - 411008
Abstract
With complex networks emerging as an effective tool to tackle multidisciplinary problems, models of network generation have gained an importance of their own. These models allow us to extensively analyze the data obtained from real-world networks, study their relevance and corroborate theoretical results. In this work, we introduce methods, based on degree preserving rewiring, that can be used to tune the clustering and degree-correlations in directed networks with random and scale-free topologies. They provide null-models to investigate the role of the mentioned properties along with their strengths and limitations. We find that in the case of clustering, structural relationships that are independent of topology and rewiring schemes are revealed, while in the case of degree-correlations, the network topology is found to play an important role in the working of the mechanisms. We also study the effects of link-density on the efficiency of these rewiring mechanisms and find that in the case of clustering, the topology of the network plays an important role in determining how link-density affects the rewiring process, while in the case of degree-correlations, the link-density and topology play no role for a sufficiently large number of rewiring steps. Besides the intended purpose of tuning network properties, the proposed mechanisms can also be used as a tool to reveal structural relationships and topological constraints.
I Introduction
Large-scale global connectivity has become an indispensable part of our lives and, as a consequence, complex networks have gained enormous importance in multiple fields of research and applications Newman (2002a); Strogatz (2001); Albert and Barabási (2002); Boccaletti et al. (2006). With rapid advances in technology, abundant data about these systems has been made available, and this is redefining how we perceive and understand complex networks. Analyses of these datasets have revealed a multitude of properties associated with the respective networks. Though some of these properties could be causal or consequential, some structural or functional, and others independent or interrelated, it is clear that our understanding of complex networks depends crucially on our understanding of their associated properties, origins and relationships. Although in many cases we lack a proper understanding of the origins of these properties, we are nevertheless interested in studying how they affect each other and the overall structure and functioning of the network. To this end, models of network generation, both growth mechanisms Barabási and Albert (1999); Barabási et al. (1999); Dorogovtsev et al. (2000); Price (1976); Bollobás et al. (2003); Krapivsky and Redner (2001); Krapivsky et al. (2000); Ostroumova et al. (2013) and static mechanisms Erdös and Rényi (1959); Watts and Strogatz (1998); Bollobás (1998); Kim et al. (2012), have become an integral aspect of the study of complex networks.
In this regard, considerable work has been done on generating scale-free (SF) networks, using both growth and static methods. Further properties, like degree-correlations and clustering, have also been incorporated into static methods Newman (2003a, 2002b); Noldus and Van Mieghem (2015); Newman (2009); Park and Newman (2005); Newman (2003b) and growth models Szabó et al. (2004); Klemm and Eguiluz (2002); Vázquez (2003); Herrera and Zufiria (2011); Krot and Prokhorenkova (2015). Despite the availability of extensive literature, we find that the bulk of it is limited mostly to undirected networks. In spite of their abundant occurrence and rich structural variety, directed networks have been given significantly less attention Fagiolo (2007); Williams and Del Genio (2014); Foster et al. (2010); Mayo et al. (2014); Roberts and Coolen (2012) or have been approximated by the undirected case.
In this work, we focus our attention on directed networks, with 2 important properties: 2-node degree-correlations and clustering. We use existing models to generate directed SF and random networks. We then propose Degree Preserving Rewiring (DPR) mechanisms to introduce and tune the properties of interest in these networks. Methods based on DPR allow us to isolate the role of the degree distribution and related intrinsic structural properties and focus on the aspects that are unique to the network under consideration Roberts and Coolen (2012); Van Mieghem et al. (2010); Zhou and Mondragon (2007). We compare and contrast the performance of these mechanisms on the 2 types of networks and investigate their effects and side-effects. We also study the effect of link density on the performance of these techniques.
II DPR methods for tuning of clustering
In networks, the tendency of nodes to organize into well-knit neighborhoods or to form small cliques is referred to as clustering, and is quantified by the Mean Clustering Coefficient (MCC). These substructures occur spontaneously in most networks, play an important role in the spreading of epidemics in large communities, and also affect network flow in local neighborhoods.
The MCC of a network is defined as the average of Local Clustering Coefficients (LCCs) of all nodes in the network. The LCC of a node is the ratio of number of pairs of connected neighbors of the node to the number of pairs of neighbors of the node. In other words, it measures the extent to which the neighbors of a node form closed triplets with the node of interest.
In directed networks, as a consequence of the directionality of edges, 4 different types of simple closed triplets are possible Fagiolo (2007). The 4 types of triangles, namely, Cycles, Middleman-triangles (Mids), In-triangles (Intri) and Out-triangles (Outri) are shown in fig.1.
Given a network of size N, represented by an adjacency matrix A, in which each node i is characterized by its in-degree, its out-degree and its number of bidirectional edges, the MCCs w.r.t each of the triangles can be calculated as follows:
[Equations (1a) to (1d): MCCs w.r.t. cycles, middleman-triangles, in-triangles and out-triangles, respectively]
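For reference, the four coefficients can be written out following the definitions in Fagiolo (2007), which this section cites. This is a sketch under assumptions: the symbols for the degrees and the assignment of the labels (1a) to (1d) are inferred from the way the procedures below refer to them, and the convention that an entry a_ij = 1 denotes an edge from i to j is assumed rather than taken from the text.

```latex
% Assumed notation: d_i^{in}, d_i^{out} are the in- and out-degree of node i,
% and d_i^{\leftrightarrow} = (A^2)_{ii} is its number of bidirectional edges.
\begin{align}
C^{cyc} &= \frac{1}{N}\sum_{i=1}^{N}
  \frac{(A^{3})_{ii}}{d_i^{in}\, d_i^{out} - d_i^{\leftrightarrow}} \tag{1a}\\
C^{mid} &= \frac{1}{N}\sum_{i=1}^{N}
  \frac{(A A^{T} A)_{ii}}{d_i^{in}\, d_i^{out} - d_i^{\leftrightarrow}} \tag{1b}\\
C^{in} &= \frac{1}{N}\sum_{i=1}^{N}
  \frac{(A^{T} A^{2})_{ii}}{d_i^{in}\,(d_i^{in} - 1)} \tag{1c}\\
C^{out} &= \frac{1}{N}\sum_{i=1}^{N}
  \frac{(A^{2} A^{T})_{ii}}{d_i^{out}\,(d_i^{out} - 1)} \tag{1d}
\end{align}
```

In each case the numerator counts the triangles of the given type that actually pass through node i, and the denominator counts the maximum number that its degrees would allow.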
To increase the amount of clustering in the network, we employ mechanisms that identify a suitable chain of connected nodes that can be reconfigured into a closed triplet of the desired form, while preserving the degrees of the nodes involved. The mechanisms for the different types of triplets are distinct, but they differ from each other only in small details.
To improve the amount of clustering w.r.t cycles, we use the following procedure:
Step-1: Calculate the initial value of the cycle MCC using (1a). Select a node i whose neighborhood can be suitably modified. There is no fixed manner in which i must be chosen; the only condition is that i must have at least 1 incoming link and 1 outgoing link, i.e., its in- and out-degrees must both be greater than or equal to 1. The node i can be chosen either uniformly at random or in a weighted manner, where the weights could be the LCC, the in-degree or the out-degree of i. Although a weighted selection of i would seem to be more effective, we find that, for a given number of rewiring steps, the networks undergo approximately the same amount of change in clustering. Therefore, for the purpose of this work, i is chosen uniformly at random.
Step-2: From the list of incoming neighbors of i, we randomly select a node j, and from the list of outgoing neighbors of i, we select k.
Step-3: From the list of incoming neighbors of j, we randomly select a node m, and from among the outgoing neighbors of k, we select n. If it is not possible to make either of the selections, we go back to step-1.
Step-4: If nodes m, n and i are distinct, and edges (m,n) and (k,j) do not exist, then edges (m,j) and (k,n) are rewired to (k,j) and (m,n). In this process, a cyclic triplet (j,i,k) is created.
Step-5: Calculate the cycle MCC of the network. If there is an overall increase from the initial value, then retain the change and go back to step-1. If there is a decrease in clustering, then reject the rewiring and go back to step-1.
The above process is iterated for a predetermined number of steps or until a predetermined value of clustering coefficient is reached.
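The five steps for cycles can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the network is stored as a set of (source, target) tuples, nodes whose denominator in the cycle coefficient is zero are assumed to contribute 0, the extra check j != k is an assumption added to rule out self-loops, and each candidate is compared against the value just before the move.

```python
import random

def neighbor_maps(edges, nodes):
    """Successor and predecessor sets for every node."""
    succ = {v: set() for v in nodes}
    pred = {v: set() for v in nodes}
    for u, v in edges:
        succ[u].add(v)
        pred[v].add(u)
    return succ, pred

def mcc_cycle(edges, nodes):
    """Mean cycle clustering: cycles through i over (d_in*d_out - d_bi)."""
    succ, pred = neighbor_maps(edges, nodes)
    total = 0.0
    for i in nodes:
        denom = len(pred[i]) * len(succ[i]) - len(pred[i] & succ[i])
        if denom > 0:  # assumption: undefined LCCs count as 0
            cycles = sum(1 for k in succ[i] for j in succ[k] if i in succ[j])
            total += cycles / denom
    return total / len(nodes)

def propose_cycle_move(edges, nodes, rng):
    """Steps 1-4: return a rewired edge set, or None if the attempt fails."""
    succ, pred = neighbor_maps(edges, nodes)
    i = rng.choice(nodes)                      # step 1: uniform choice of i
    if not pred[i] or not succ[i]:
        return None
    j = rng.choice(sorted(pred[i]))            # step 2: j -> i and i -> k
    k = rng.choice(sorted(succ[i]))
    if not pred[j] or not succ[k]:
        return None
    m = rng.choice(sorted(pred[j]))            # step 3: m -> j and k -> n
    n = rng.choice(sorted(succ[k]))
    if len({m, n, i}) < 3 or j == k:           # step 4: distinctness checks
        return None
    if (m, n) in edges or (k, j) in edges:     # no multi-edges
        return None
    return (edges - {(m, j), (k, n)}) | {(k, j), (m, n)}

def tune_cycles(edges, nodes, steps, seed=0):
    """Step 5: keep a move only if the cycle MCC increases."""
    rng = random.Random(seed)
    edges = set(edges)
    current = mcc_cycle(edges, nodes)
    for _ in range(steps):
        candidate = propose_cycle_move(edges, nodes, rng)
        if candidate is None:
            continue
        value = mcc_cycle(candidate, nodes)
        if value > current:
            edges, current = candidate, value
    return edges, current
```

On the chain m -> j -> i -> k -> n, the only admissible move closes the cycle j -> i -> k -> j and leaves every in- and out-degree unchanged. Recomputing the MCC from scratch at every step keeps the sketch short; an incremental update would be preferable for large networks.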
To improve the amount of clustering w.r.t mids, we use the following procedure:
Step-1: Calculate the initial value of the middleman MCC from (1b). Randomly select a node i, with both in- and out-degrees greater than or equal to 1.
Step-2: Select node j from among the incoming neighbors of i and node k from among the outgoing neighbors of i.
Step-3: Further, select node m from among the outgoing neighbors of j and node n from the incoming neighbors of k. If any one of them is not possible, then go back to step-1.
Step-4: If m, n and i are distinct and edges (j,k) and (n,m) do not exist, then edges (j,m) and (n,k) are rewired to (j,k) and (n,m), forming a middleman triangle (j,i,k).
Step-5: Calculate the middleman MCC of the network. If there is an overall decrease in clustering, then reject the rewiring and go back to step-1, and in case of an overall increase, retain the rewiring and go to step-1.
The above process is iterated for a predetermined number of steps or until a predetermined value of clustering is reached.
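The rewiring in step-4 for middleman triangles can be isolated as a single function, which makes the degree-preserving nature of the move easy to check. The sketch below uses hypothetical function names and assumes the network is a set of (source, target) tuples; the check j != k is an added assumption to avoid self-loops.

```python
def middleman_move(edges, i, j, k, m, n):
    """Rewire (j,m),(n,k) to (j,k),(n,m); return None if step-4 conditions fail."""
    # The chain must exist: j -> i, i -> k, j -> m and n -> k.
    if not {(j, i), (i, k), (j, m), (n, k)} <= edges:
        return None
    if len({m, n, i}) < 3 or j == k:
        return None
    if (j, k) in edges or (n, m) in edges:
        return None
    return (edges - {(j, m), (n, k)}) | {(j, k), (n, m)}

def degree_sequences(edges):
    """In- and out-degree of every node that has at least one edge."""
    d_in, d_out = {}, {}
    for u, v in edges:
        d_out[u] = d_out.get(u, 0) + 1
        d_in[v] = d_in.get(v, 0) + 1
    return d_in, d_out
```

After the move, the edges j -> i, i -> k and j -> k form a triangle in which i is the middleman, while `degree_sequences` returns the same values before and after.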
To improve the amount of clustering w.r.t intri, we use the following procedure:
Step-1: Calculate the initial value of the in-triangle MCC using (1c). Randomly select a node i, with in-degree greater than 1.
Step-2: Select nodes j and k from among the incoming neighbors of i.
Step-3: Further, select node m from the neighbors (in and out) of j and node n from the neighbors (in and out) of k. If any one of them is not possible, then go back to step-1.
Step-4: If m, n and i are distinct and edges (j,k) and (n,m) do not exist and (j,m) and (n,k) exist, then edges (j,m) and (n,k) are rewired to (j,k) and (n,m), to form an in-triangle (j,i,k). Else, if nodes m, n and i are distinct, and edges (m,n) and (k,j) do not exist and (m,j) and (k,n) exist, then edges (m,j) and (k,n) are rewired to (k,j) and (m,n), once again forming an in-triangle (j,i,k).
Step-5: Calculate the in-triangle MCC of the network. If there is an overall decrease in clustering, then reject the rewiring and go back to step-1, and in case of an overall increase, retain the rewiring and go to step-1.
The above process is iterated for a predetermined number of steps or until a predetermined value of clustering is reached.
To improve the amount of clustering w.r.t outri, we use the following procedure:
Step-1: Calculate the initial value of the out-triangle MCC using (1d). Randomly select a node i, with out-degree greater than 1.
Step-2: Select nodes j and k from among the outgoing neighbors of i.
Step-3: Further, select node m from the neighbors (in and out) of j and node n from the neighbors (in and out) of k. If any one of them is not possible, then go back to step-1.
Step-4: If m, n and i are distinct and edges (j,k) and (n,m) do not exist and (j,m) and (n,k) exist, then edges (j,m) and (n,k) are rewired to (j,k) and (n,m), to form an out-triangle (j,i,k). Else, if nodes m, n and i are distinct, and edges (m,n) and (k,j) do not exist and (m,j) and (k,n) exist, then edges (m,j) and (k,n) are rewired to (k,j) and (m,n), once again forming an out-triangle (j,i,k).
Step-5: Calculate the out-triangle MCC of the network. If there is an overall decrease in clustering, then reject the rewiring and go back to step-1, and in case of an overall increase, retain the rewiring and go to step-1.
The above process is iterated for a predetermined number of steps or until a predetermined value of clustering is reached.
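The in-triangle and out-triangle procedures share the same two-branch rewiring in step-4 and differ only in how j and k are picked (incoming or outgoing neighbors of i). A minimal sketch of that shared step follows; the function name is hypothetical, the network is assumed to be a set of (source, target) tuples, and the check j != k is an added assumption to avoid self-loops.

```python
def triangle_move(edges, i, j, k, m, n):
    """Step-4 for in- and out-triangles; returns the new edge set or None."""
    if len({m, n, i}) < 3 or j == k:
        return None
    # First branch: j -> m and n -> k become j -> k and n -> m.
    if ({(j, m), (n, k)} <= edges
            and (j, k) not in edges and (n, m) not in edges):
        return (edges - {(j, m), (n, k)}) | {(j, k), (n, m)}
    # Second branch: m -> j and k -> n become k -> j and m -> n.
    if ({(m, j), (k, n)} <= edges
            and (m, n) not in edges and (k, j) not in edges):
        return (edges - {(m, j), (k, n)}) | {(k, j), (m, n)}
    return None
```

With j and k both pointing into i, either branch closes an in-triangle at i; with i pointing to both j and k, the same call closes an out-triangle. In both branches every node keeps its in- and out-degree, because each removed edge is replaced by one with the same source and one with the same target.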
III Results for tuning of clustering
We test the performance of the mechanisms presented in the previous section on a directed SF network and a directed random (ER) network, and study their effectiveness in generating the properties of interest. We use the directed configuration model Kim et al. (2012) to generate the random network, while the SF network is generated using the model of Bollobás et al. (2003). The working of the mechanisms is first studied for a given parameter value of the networks, and a further investigation is then conducted for different parameter values of the respective networks. In the SF network model, the link-density is controlled by a single model parameter, and the relation between the two in the limit of large N follows from Bollobás et al. (2003). This allows us to compare the performance of the rewiring mechanisms in SF networks and ER networks of approximately the same link-density.
To study the generation of each type of triangle, we generate 2 ensembles, each consisting of 100 networks. The first ensemble contains directed ER (random) networks of size N with link-density equal to 5, and the second contains SF networks of size N with the link-density parameter set to 0.8. The DPR method for the clustering of interest is applied iteratively to these ensembles and the ensemble-averaged results are shown in fig.6.
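A short worked check shows why a parameter value of 0.8 in the SF model pairs with a link-density of 5 in the ER ensemble. It rests on assumptions that the extracted text does not confirm: that the parameter is the probability β, in the notation of Bollobás et al. (2003), of adding an edge between two existing nodes, that α and γ are the probabilities of the two node-adding moves with α + β + γ = 1, and that link-density means edges per node, E/N.

```latex
% Each growth step adds exactly one edge; a new node is added only with
% probability \alpha + \gamma = 1 - \beta. After t steps, E(t) \approx t
% and N(t) \approx (1 - \beta)\, t, so
\begin{equation}
\lim_{t \to \infty} \frac{E(t)}{N(t)} = \frac{1}{\alpha + \gamma}
  = \frac{1}{1 - \beta},
\qquad
\beta = 0.8 \;\Rightarrow\; \frac{E}{N} \to \frac{1}{0.2} = 5 .
\end{equation}
```

Under the same assumptions, ER link-densities of 2, 3 and 4 would correspond to β = 1/2, 2/3 and 3/4.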
From fig.6, we observe that for a given N and approximately the same average degree, the mechanisms effect a substantially greater change in the ER network than in the SF network, for the same number of rewiring steps. This is particularly pronounced for one of the MCCs, where DPR introduces twice the amount of change in ER networks as in SF networks. The MCCs show a rapid initial increase, following which they remain more or less constant or increase very slowly.
We also notice how certain structural side-effects of the DPR mechanisms manifest themselves, independent of the network topology. When a network is rewired to increase the MCC w.r.t. one of the three non-cyclic triangle types (mids, in-triangles or out-triangles), there is a noticeable increase in the MCCs w.r.t. the other two types. This can be explained by analyzing fig.5. When the node i is selected and the links between its first and second neighbors are rewired, 2 complementary triangles, w.r.t its first neighbors, are also created as a consequence of the rewiring. For two of the three non-cyclic types, the increase in the two complementary MCCs is almost equal. For the third, both complementary MCCs again increase but, unlike in the former 2 cases, the amount of change is not equal: one complementary MCC increases more than the other. This inequality is observed only in the case of SF networks. However, none of this is observed when rewiring for cycles, as the complementary triangles formed are also cycles.
From fig.7, we see that in SF networks, the different MCCs show the same qualitative behavior but different quantitative behaviors: they get rewired to different values for the same number of rewiring steps. This requires further investigation to ascertain whether it is a structural limitation. For a link-density parameter of 0.8, one MCC reaches a value of 0.6, another reaches only as high as 0.3, and the remaining two go up to 0.4. All 4 MCCs behave consistently, with the MCC values increasing as the link-density parameter is gradually increased. In the ER network, the final values of the MCCs do not vary as smoothly as in the SF network. Two of the MCCs have values clustered close together for link-densities 2, 3, 4 and 5, while the values of the other two are more smoothly distributed. Overall, on comparing the results in both network types, we find that all the DPR mechanisms for clustering are affected to some extent by the network topology.
IV DPR methods for tuning of correlations
Degree-correlations measure the tendency of nodes to connect with other nodes with similar or dissimilar degrees. While it is not completely understood whether these correlations are the cause or consequence of other properties/processes, the importance of their effect on network structure and dynamics cannot be ignored. While correlations can be studied w.r.t any enumerative property (color, race etc.) or scalar property (age, income etc.) of the nodes, correlations between node-degrees gain further importance because of the interplay between the two structural properties involved.
In directed networks, since each node has an in-degree and an out-degree, 4 types of 2-node degree-correlations can be defined, namely, In-In, In-Out, Out-In and Out-Out degree correlations. In the rest of this paper, we refer to a particular type of correlation as p-q, where p, q ∈ {In, Out} and p and q are associated with the source and target nodes, respectively. The traditional metric used to quantify degree correlations is the Pearson correlation coefficient, as introduced in Newman (2003a).
But, as was shown in Dorogovtsev et al. (2010); Litvak and Van Der Hofstad (2013); van der Hoorn and Litvak (2015), the Pearson coefficient scales with the network size N and therefore, in the case of distributions with heavy tails, it converges to a non-negative number as N → ∞. As a result, it does not lend itself well to capturing the dissortativity in large SF networks. For this reason, in this work, we turn to Spearman's rank correlation, (2), to quantify these dependencies.
[Equation (2): Spearman's rank correlation coefficient for p-q degree-correlations]
where the two rank terms are the ranks of the source and target nodes associated with edge e, based on their p- and q-degrees, respectively, and the resulting coefficient is referred to as Spearman's rho van der Hoorn and Litvak (2015). Both the Pearson and the Spearman coefficients are bounded in the range [-1,1], with the values being positive when nodes of similar degrees are connected by an edge and negative when nodes of dissimilar degrees are connected. The value becomes 0 when there is no net bias.
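The rank-based measure described above can be sketched as follows. This is an illustration under stated assumptions, not necessarily the exact form of equation (2): ties in degree are resolved with average ranks, and the coefficient is computed as the Pearson correlation of the two rank lists over all edges, which coincides with the usual closed-form Spearman expression when there are no ties. The function names and the "in"/"out" string arguments are hypothetical.

```python
def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda t: values[t])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for t in order[pos:end + 1]:
            ranks[t] = shared
        pos = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either list has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5

def spearman_rho(edges, p, q):
    """p-q rank correlation: p-degree of sources vs q-degree of targets."""
    edges = list(edges)
    d_in, d_out = {}, {}
    for u, v in edges:
        d_out[u] = d_out.get(u, 0) + 1
        d_in[v] = d_in.get(v, 0) + 1
    deg = {"in": d_in, "out": d_out}
    source_deg = [deg[p].get(u, 0) for u, _ in edges]
    target_deg = [deg[q].get(v, 0) for _, v in edges]
    return pearson(average_ranks(source_deg), average_ranks(target_deg))
```

For example, `spearman_rho(edges, "out", "in")` measures the Out-In correlation: the out-degree of each edge's source against the in-degree of its target.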
In order to design a mechanism to tune degree correlations, we start with the rewiring rule from Van Mieghem et al. (2010) and adapt it to the case of directed edges, where the identities of the source and target nodes need to be additionally preserved. We modify the procedure in Van Mieghem et al. (2010) in the following manner.
Given p, q ∈ {In, Out}, we randomly choose 2 links, (a,b) and (c,d). From among the 2 source nodes, a and c, we select the node with higher p-degree and from the target nodes, b and d, we select the node with higher q-degree. If the 2 selected nodes are neither identical nor already connected by a link, then the existing links are deleted and new links are placed between the 2 selected nodes and between the 2 remaining nodes. This check prevents the appearance of multi-edges and self-edges during the course of rewiring. Repeated iteration of the rewiring step (fig.8 (top)) generates a network that is assortative in p-q type of correlations.
To generate a network with dissortative p-q correlations, 2 links (a,b) and (c,d) are randomly chosen. From among the source nodes, the node with higher p-degree is selected and from the target nodes, the node with smaller q-degree is selected. Existing links are deleted and new links are placed between the 2 selected nodes and between the 2 remaining nodes, while simultaneously ensuring that no multi-edges or self-loops are created. This rewiring step (fig.8 (bottom)) is iterated to incrementally increase the dissortativity in the network.
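Both rewiring rules can be sketched with one function and a flag. This is a minimal illustration under assumptions: the function name and the "in"/"out" arguments are hypothetical, the network is a set of (source, target) tuples, degrees are computed once because the rewiring preserves them, and a tie in degree is resolved in favor of the first of the two sampled links.

```python
import random

def tune_correlation(edges, p, q, steps, assortative=True, seed=0):
    """Iterate the p-q rewiring step; returns a new edge set."""
    rng = random.Random(seed)
    edge_list = sorted(edges)
    edge_set = set(edge_list)
    d_in, d_out = {}, {}
    for u, v in edge_list:
        d_out[u] = d_out.get(u, 0) + 1
        d_in[v] = d_in.get(v, 0) + 1
    deg = {"in": d_in, "out": d_out}
    for _ in range(steps):
        x, y = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[x], edge_list[y]
        # Order the two sources by p-degree and the two targets by q-degree.
        if deg[p].get(a, 0) >= deg[p].get(c, 0):
            s_hi, s_lo = a, c
        else:
            s_hi, s_lo = c, a
        if deg[q].get(b, 0) >= deg[q].get(d, 0):
            t_hi, t_lo = b, d
        else:
            t_hi, t_lo = d, b
        if assortative:
            new = [(s_hi, t_hi), (s_lo, t_lo)]   # high with high
        else:
            new = [(s_hi, t_lo), (s_lo, t_hi)]   # high with low
        if set(new) == {(a, b), (c, d)}:
            continue                              # already in desired form
        if any(u == v or (u, v) in edge_set for u, v in new):
            continue                              # would create loop or multi-edge
        edge_set -= {(a, b), (c, d)}
        edge_set |= set(new)
        edge_list[x], edge_list[y] = new
    return edge_set
```

Each new link keeps one of the original sources and one of the original targets, so every node retains its in- and out-degree and the source and target roles are preserved, as the directed setting requires.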
V Results for tuning of correlations
To study the generation of degree-correlations, we generate 2 ensembles, each containing 100 networks of size N. The SF networks are generated with the link-density parameter set to 0.8 and the ER networks with a link-density of 5. The relevant assortative and dissortative rewiring mechanisms are iterated on these ensembles and the results, which are ensemble averages, are shown in fig.9.
In ER networks, all variants of the mechanisms work perfectly and the networks show only the relevant p-q correlations (p, q ∈ {In, Out}) that correspond to the respective rewiring mechanism and nothing else (fig.9 (right)). However, the SF networks present a more interesting scenario (fig.9 (left)). Although the p-q correlation of interest is introduced to the largest extent, other variants also arise with considerable magnitude. Since the same mechanisms do not result in a similar behavior in ER networks, it is evident that this behavior is not the result of rewiring itself but of some other topological property. On examining the 2 types of networks for topological differences besides the distribution of degrees, we find them to be identical in all aspects except the case of 1-node correlation. We find that, in ER networks, there is no correlation between the in- and out-degrees of a given node, while in the case of SF networks, there is a very strong correlation. This can be traced back to the construction process in Bollobás et al. (2003), where nodes appearing early in the growth process have higher in- and out-degrees. To confirm 1-node correlations as the cause of the observed behavior, we introduce 1-node in-out correlations in ER networks in a systematic manner. Fig.10 shows the change in behavior of the 2-node Out-Out correlation when the 1-node correlation is gradually increased.
Also, for the same number of rewiring steps, the absolute value of the correlation coefficient is smaller in the case of assortative rewiring and higher in the dissortative case. Another important feature is that, in the case of in-out and out-in correlations, the results for assortative and dissortative rewiring show qualitatively similar behavior, with the 4 types of correlations showing 3 distinct values. This symmetric behavior is not seen in the case of in-in and out-out correlations, where the correlations taking the intermediate values split up further, so that the 4 variants take 4 distinct values. Further, this splitting occurs only in the case of assortative rewiring and not in the dissortative case, leading to further questions about 2-node and 1-node structural relationships.
In ER networks, assortative and dissortative rewirings are studied for link-densities 2, 3, 4 and 5. The behavior is identical in both cases, with higher link-densities showing a slower rate of change (fig.11 (right)). We also see that for a sufficiently large number of rewiring steps, the absolute values of the correlation coefficient, for all link-densities, converge to a large value close to 1. The results are similar in the case of dissortative rewiring in SF networks, for corresponding values of the link-density parameter (fig.11 (left)). In assortative rewiring, however, the values of the correlation coefficient do not quite converge as the link-density parameter increases: the network reaches lower values of the coefficient for higher values of the parameter, for the same number of rewiring steps.
VI Conclusions
To summarize, we have presented DPR mechanisms to tune the amount of degree-correlations and clustering in directed networks with random and SF topologies. These mechanisms allow us to explore the relevant properties, independent of the topology, in our attempt to understand their role in the structural organization and functioning of the networks. They provide alternate ways to introduce and tune properties, especially when growth mechanisms fail, due to our lack of knowledge about the processes or mechanisms leading to these properties. Having mechanisms that can artificially tune the amount of clustering makes it easier to study the role of clustering in information dissemination in computer or social networks and rumor-spreading in online or offline communities. It also helps in designing and testing efficient mitigation strategies to contain epidemics in contact networks. In this regard, the ability to tune 2-node degree-correlations also plays an important role. Tuning correlations also helps us to study their effect on the robustness of networks under attack and during failure.
We find that, for both correlations and clustering, the density of links in the network does not effect any qualitative change in the working of the mechanisms. The general observation is that higher link-densities slow down the effects of rewiring. We also conclude that the topology itself affects the qualitative behavior of the mechanisms. There exist structural side-effects that are inherent in the definitions of the properties, as a result of which some properties cannot be manipulated in isolation. The results emphasize the structural relationships present in directed networks, particularly in SF networks. Finally, although we set out to find mechanisms to tune clustering and degree-correlations in directed networks, we find that the same mechanisms double up as tools to explore further structural relationships in the networks.
References
- 1. Newman (2002a): M. E. J. Newman, Computer Physics Communications 147, 40 (2002).
- 2. Strogatz (2001): S. H. Strogatz, Nature 410, 268 (2001).
- 3. Albert and Barabási (2002): R. Albert and A. L. Barabási, Reviews of Modern Physics 74, 47 (2002).
- 4. Boccaletti et al. (2006): S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, and D. U. Hwang, Physics Reports 424, 175 (2006).
- 5. Barabási and Albert (1999): A. L. Barabási and R. Albert, Science 286, 509 (1999).
- 6. Barabási et al. (1999): A. L. Barabási, R. Albert, and H. Jeong, Physica A: Statistical Mechanics and its Applications 272, 173 (1999).
- 7. Dorogovtsev et al. (2000): S. N. Dorogovtsev, J. F. F. Mendes, and A. N. Samukhin, Physical Review Letters 85, 4633 (2000).
- 8. Price (1976): D. d. S. Price, Journal of the American Society for Information Science 27, 292 (1976).
