Large deviation and anomalous fluctuations scaling in degree assortativity on configuration networks
Hanshuang Chen, Feng Huang, Chuansheng Shen, Guofeng Li, and Haifeng Zhang

TL;DR
This paper investigates the probability distribution of degree assortativity in networks, revealing large deviation principles and anomalous fluctuation scaling, especially in heterogeneous scale-free networks.
Contribution
It introduces a multicanonical Monte Carlo method to analyze the full distribution of degree assortativity and establishes a large deviation principle with a novel scaling exponent.
Findings
The distribution obeys a large deviation principle with a convex rate function.
The scaling exponent $\xi$ equals 1 for Poisson graphs and varies for scale-free networks.
Fluctuations exhibit anomalous scaling in highly heterogeneous networks.
Abstract
By constructing a multicanonical Monte Carlo simulation, we obtain the full probability distribution $\rho_N(r)$ of the degree assortativity coefficient $r$ on configuration networks of size $N$ by using the multiple histogram reweighting method. We suggest that $\rho_N(r)$ obeys a large deviation principle, $\rho_N(r - r_N^*) \asymp e^{-N^{\xi} I(r - r_N^*)}$, where the rate function $I$ is convex and possesses its unique minimum at $r = r_N^*$, and $\xi$ is an exponent that scales the $\rho_N$'s with $N$. We show that $\xi = 1$ for Poisson random graphs, and $\xi \geq 1$ for scale-free networks in which $\xi$ is a decreasing function of the degree distribution exponent $\gamma$. Our results reveal that the fluctuations of $r$ exhibit an anomalous scaling with $N$ in highly heterogeneous networks.
Hanshuang Chen (1)
Feng Huang (2)
Chuansheng Shen (3)
Guofeng Li (1)
Haifeng Zhang (4)
(1) School of Physics and Optoelectronics Engineering, Anhui University, Hefei 230601, China
(2) School of Mathematics and Physics, Anhui Jianzhu University, Hefei 230601, China
(3) School of Mathematics and Physics, Anqing Normal University, Anqing 246133, China
(4) School of Mathematical Science, Anhui University, Hefei 230601, China
PACS numbers: 89.75.Hc, 05.45.Xt, 89.75.Kd
I Introduction
Over the past two decades, we have witnessed the success of complex networks in describing patterns found ubiquitously in the real world [1], such as community structure and scale-free structure, and in modelling many dynamical processes in nature [2], such as synchronization [3, 4], epidemic spreading [5], opinion formation [6], etc. [7]. In particular, characterizing the structural features of complex networks is essential not only for uncovering the organizational principles of real systems, but also for understanding and controlling the dynamical processes on them [8, 9, 10].
An important feature of complex networks is the so-called degree assortativity, which quantifies the tendency of nodes to be connected to other nodes of similar degree. A network is called assortative if nodes with high degree preferentially connect to other nodes with high degree, and disassortative if nodes with high degree are linked to nodes with low degree. Technological and biological networks have been found to be disassortatively mixed, while social networks show assortative correlations [11, 12, 13]. It was shown that, on the one hand, degree correlations are key to many structural properties of networks, such as percolation [12, 14], mean distance [14], and robustness [13, 15]. On the other hand, degree correlations affect the properties of dynamical processes taking place on networks, such as epidemic spreading [16, 17, 18], stability against stimuli and perturbation [19, 20], and synchronization of oscillators [21].
In his seminal papers [12, 13], Newman introduced the assortativity coefficient to measure the degree correlation, which is defined as
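$$ r = \frac{M^{-1}\sum_{i} j_i k_i - \left[ M^{-1}\sum_{i} \tfrac{1}{2}(j_i + k_i) \right]^2}{M^{-1}\sum_{i} \tfrac{1}{2}(j_i^2 + k_i^2) - \left[ M^{-1}\sum_{i} \tfrac{1}{2}(j_i + k_i) \right]^2}, \tag{1} $$
where $M$ is the number of edges, and $j_i$, $k_i$ are the degrees of the nodes at the ends of the $i$th edge, with $i = 1, \ldots, M$. The assortativity coefficient is Pearson's correlation coefficient between the degrees of neighboring nodes, so it has the natural bounds $-1 \leq r \leq 1$. A network is assortative when $r > 0$ and disassortative when $r < 0$.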
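The definition of the assortativity coefficient can be made concrete with a few lines of code. The following is a minimal sketch, assuming the network is stored as a plain edge list; the function names `assortativity` and `degrees_from_edges` are illustrative and do not come from the paper.

```python
def degrees_from_edges(edges):
    """Count the degree of every node that appears in an undirected edge list."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return degree


def assortativity(edges, degree):
    """Pearson correlation between the degrees at the two ends of each edge.

    edges  : list of (u, v) pairs, one entry per undirected edge
    degree : mapping node -> degree
    """
    m = len(edges)
    # Edge averages of j*k, (j + k)/2 and (j^2 + k^2)/2.
    prod = sum(degree[u] * degree[v] for u, v in edges) / m
    mean = sum(0.5 * (degree[u] + degree[v]) for u, v in edges) / m
    sq = sum(0.5 * (degree[u] ** 2 + degree[v] ** 2) for u, v in edges) / m
    denom = sq - mean ** 2
    if denom == 0:
        # Happens, for example, on regular graphs, where every edge end has the same degree.
        raise ValueError("r is undefined when all edge-end degrees are equal")
    return (prod - mean ** 2) / denom


# A star is maximally disassortative: the hub only touches leaves.
star = [(0, 1), (0, 2), (0, 3)]
print(assortativity(star, degrees_from_edges(star)))
```

For the star every edge joins a degree-3 node to a degree-1 node, so the numerator and denominator have equal magnitude and opposite sign, and the coefficient reaches its lower bound.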
Most previous works on this subject were performed on scale-free networks with power-law degree distributions [22, 23, 24, 25, 26, 27, 28, 29]. It has been shown that, on the one hand, for degree distribution exponent $\gamma < 3$ the assortativity coefficient $r$ is usually negative in finite-size networks. On the other hand, $r$ always decreases in magnitude as the network size increases, and vanishes in the infinite-size limit. Maslov et al. [22] have shown by computer simulations that the degree disassortativity results from the restriction of at most one edge between any pair of nodes. Furthermore, Park and Newman [23] verified this result theoretically. They proposed a grand canonical ensemble of graphs in which the analytical calculation of degree correlations becomes feasible. Johnson et al. [24] proposed an alternative explanation for the phenomenon based on information entropy, and they showed that the Shannon entropy is maximized at some negative value of the assortativity coefficient for highly heterogeneous scale-free networks. Menche et al. [25] analyzed maximally disassortative scale-free networks and found that the lower bound of $r$ approaches zero as a power law of the network size. Dorogovtsev et al. [26] found the same result in a specific class of recursive trees with power-law degree distribution. Yang et al. [27] derived analytically the lower bound of the assortativity coefficient in scale-free networks. A similar phenomenon was also discussed in some related works [28, 29], although the authors therein questioned the suitability of Pearson's coefficient for measuring degree correlations in large heavy-tailed networks, and proposed alternative measures such as the Kendall-Gibbons [28] and Spearman [29] coefficients.
Previous works mainly focused on either the typical behavior of $r$, such as how the expected value of $r$ changes with network size and degree heterogeneity [22, 23, 24, 26], or how to obtain a class of specific networks with some atypical value of $r$ [14, 25, 27]. For an ensemble of random networks with a given degree sequence (i.e. the configuration model), it is known that the assortativity coefficient varies from one network realization to another. An interesting question arises: what is the probability of generating a configuration network whose assortativity coefficient falls in an interval $[r, r + dr]$? The question is equivalent to finding the probability distribution function $\rho_N(r)$ of $r$ for network size $N$. For this purpose, we employ a statistical-mechanics-inspired Monte Carlo (MC) method, multiple histogram reweighting (MHR) [30, 31], to sample $\rho_N(r)$ over a wide range of $r$. The method is computationally efficient and enables us to cover the rare-event tails of $\rho_N(r)$, where the probabilities are very low. Recently, the MHR method was applied to investigate the large deviation properties of the largest connected [32] or biconnected component [33] and the diameter [34] of random graphs, as well as the resilience of transportation networks [35] and power grids [36]. Related algorithms [37], for example the Wang-Landau algorithm [38], have been used to efficiently sample large spectral gaps [39] and prescribed motif densities in networks [40], and rare trajectories in chaotic systems [41, 42].
To that end, we first build a canonical ensemble MC sampling by a random edge-swapping scheme [43] and then collect a series of histograms of $r$ at different inverse temperatures. Finally, $\rho_N(r)$ is obtained by using the MHR method. By implementing the method on configuration models with Poisson degree distributions and power-law degree distributions, we find that for all the cases under consideration $\rho_N(r)$ is unimodal and its width becomes narrower as $N$ increases. The expected value of $r$ is negative and decays in magnitude as a power law as $N$ increases, as reported in the previous literature. The variance $\sigma_r^2$ of $r$ decreases with $N$ as a power law as well, $\sigma_r^2 \sim N^{-\xi}$. For homogeneous networks such as Poisson random graphs, $\xi = 1$, such that the fluctuation in $r$ is standard. Strikingly, for highly heterogeneous networks such as scale-free networks with $\gamma < 3$, we have $\xi > 1$ and thus the fluctuation scaling of $r$ with $N$ is anomalous. Moreover, we suggest that $\rho_N(r)$ obeys a large deviation principle [44], $\rho_N(r - r_N^*) \asymp e^{-N^{\xi} I(r - r_N^*)}$, where $I$ is the so-called large deviation rate function, which plays the role of a microcanonical entropy of the network configuration model [45, 46, 47, 48]. Here $r_N^*$ is the most probable value of $r$, and $\xi$, as just mentioned, is the exponent that scales the $\rho_N$'s with $N$.
II Multi-Canonical ensemble Monte Carlo sampling
The configuration model is an ensemble of random graphs with a given degree sequence $\{k_1, \ldots, k_N\}$, where $k_i$ is the degree of node $i$ and $N$ is the number of nodes. The model was formulated by Bollobás [49], inspired by Ref. [50]. It was popularized by Newman, Strogatz, and Watts [51], who realized that it is a useful and simple model for real-world networks. Configuration networks are generated as follows. First, each node $i$ is assigned a number of half-edges equal to its degree $k_i$, with $\sum_i k_i$ assumed to be even. Each half-edge is then connected to another randomly chosen half-edge to form an edge in the graph. Finally, all self-loops and all parallel edges between two different nodes are removed by an edge-reshuffling algorithm that leaves the degree sequence unchanged. It was pointed out that this algorithm biases the resulting network configurations [52]. Such a bias can be eliminated by a refusal algorithm [53], but the latter is more time-consuming. However, the choice of algorithm has no effect on our results, because the first generated network is only used as the starting point of the Monte Carlo sampling introduced below. In the long run, the results are not sensitive to the initial configuration.
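The generation procedure can be sketched as follows. This is only an illustration, assuming the refusal variant (a pairing that contains a self-loop or a parallel edge is discarded and redrawn) rather than the edge-reshuffling repair; the function names are hypothetical.

```python
import random


def stub_matching(degrees, rng):
    """Pair half-edges uniformly at random; the result may not be a simple graph."""
    stubs = [node for node, k in enumerate(degrees) for _ in range(k)]
    if len(stubs) % 2:
        raise ValueError("the sum of degrees must be even")
    rng.shuffle(stubs)
    # Consecutive entries of the shuffled list form the edges.
    return list(zip(stubs[0::2], stubs[1::2]))


def is_simple(edges):
    """True if the edge list has no self-loops and no parallel edges."""
    seen = set()
    for u, v in edges:
        key = frozenset((u, v))
        if u == v or key in seen:
            return False
        seen.add(key)
    return True


def simple_configuration_graph(degrees, rng, max_tries=100000):
    """Refusal sampling: redraw the whole pairing until it is a simple graph."""
    for _ in range(max_tries):
        edges = stub_matching(degrees, rng)
        if is_simple(edges):
            return edges
    raise RuntimeError("no simple graph found; the degree sequence may not be graphical")


edges = simple_configuration_graph([3, 2, 2, 2, 1], random.Random(0))
print(edges)
```

Refusal sampling becomes slow for broad degree sequences, which is one reason a cheaper starting configuration followed by long edge-swap equilibration is attractive.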
We consider a Markov chain Monte Carlo (MCMC) algorithm in which each network configuration is weighted with a Boltzmann weight $e^{-\beta r(A)}$, where $A$ is the adjacency matrix of the underlying network, whose entries are $A_{ij} = 1$ if nodes $i$ and $j$ are connected and $A_{ij} = 0$ otherwise, and $r(A)$ is the assortativity coefficient of the network $A$. To perform the MCMC, we consider elementary edge-swap moves that preserve the degree of every node. We consider four different nodes, $i$, $j$, $k$, $l$, and one of the three invertible moves in which the following edge swaps are performed [54, 55]
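$$ \begin{aligned} (i,j),\,(k,l) &\leftrightarrow (i,k),\,(j,l), \\ (i,j),\,(k,l) &\leftrightarrow (i,l),\,(j,k), \\ (i,k),\,(j,l) &\leftrightarrow (i,l),\,(j,k). \end{aligned} \tag{2} $$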
In order to perform any of these three moves, the initial two links between the four nodes must be present in the network while the final two links must be absent, or vice versa, since multiple edges between two different vertices are forbidden. Since not all moves are allowed, the MCMC algorithm should take into account the fact that some network configurations allow more moves than others. In Fig. 1(a), we show a simple graph of four nodes with two edges, for which two different edge swaps are possible. However, if an additional edge is introduced between nodes 1 and 3 [see Fig. 1(b)], one of the resulting configurations is forbidden since it would contain parallel edges.
We indicate with $\mathcal{N}(A)$ the number of edge swaps allowed when starting from the adjacency matrix $A$. Each single allowable edge-swap move $A \to A'$ is accepted with the Metropolis probability, which ensures unbiased sampling of the network configurations,
$$ P(A \to A') = \min\left\{ 1, \frac{\mathcal{N}(A)}{\mathcal{N}(A')}\, e^{-\beta \Delta r} \right\}, \tag{3} $$
where $\beta$ is the inverse temperature, which plays the role of a field conjugate to the assortativity coefficient $r$. Generally, for a larger $\beta$ the sampled networks prefer smaller values of $r$, and thus $\beta$ can be used to adjust the bias in sampling the assortativity coefficient. $\Delta r = r(A') - r(A)$ is the change in the assortativity coefficient due to the edge-swapping trial. Therefore, at each step the algorithm selects one of the three moves with uniform probability and draws four nodes until the move is allowed. It then accepts the allowed move with the probability given in Eq. (3).
We note that $\mathcal{N}(A)$, the number of allowed edge swaps, admits the following expression
[TABLE]
This expression can be used to calculate $\mathcal{N}(A)$ at the beginning of the MCMC algorithm. In order to calculate how $\mathcal{N}(A)$ changes at each Monte Carlo step, it is more convenient to consider the expression
[TABLE]
Indeed, using this expression one can simply write
$$ \mathcal{N}(A') = \mathcal{N}(A) + \Delta \mathcal{N}, \tag{6} $$
where $\Delta \mathcal{N}$ can be calculated by considering only the terms that change in Eq. (5).
In fact, the term $\mathcal{N}(A)/\mathcal{N}(A')$ in Eq. (3), the ratio of the numbers of allowed swaps before and after the move, is very close to one, since $\mathcal{N}$ is of the order of the square of the number of edges, $\mathcal{N} \sim M^2$, so that a single swap changes it only slightly. Therefore, dropping this term is expected to have little effect on the results, while improving the computing efficiency significantly. We have tested several networks and found that the results are consistent whether or not the term is included.
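A biased edge-swap sampler along these lines can be sketched as below. It is a simplified variant, not the scheme described above: two edges are drawn at random and a forbidden swap is simply counted as a rejected trial, which keeps the proposal symmetric so that no $\mathcal{N}(A)$ factor appears. The sketch also uses the fact that the edge averages of $(j_i + k_i)/2$ and $(j_i^2 + k_i^2)/2$ are fixed by the degree sequence, so only $\sum_i j_i k_i$ changes under a swap. All function and variable names are illustrative.

```python
import math
import random


def assortativity(edges, degree):
    """Pearson correlation of the degrees at the two ends of each edge."""
    m = len(edges)
    prod = sum(degree[u] * degree[v] for u, v in edges) / m
    mean = sum(0.5 * (degree[u] + degree[v]) for u, v in edges) / m
    sq = sum(0.5 * (degree[u] ** 2 + degree[v] ** 2) for u, v in edges) / m
    return (prod - mean ** 2) / (sq - mean ** 2)


def metropolis_edge_swaps(edges, degree, beta, n_trials, rng):
    """Run n_trials swap proposals in place with weight exp(-beta * r); return the final r."""
    m = len(edges)
    present = {frozenset(e) for e in edges}
    mean = sum(0.5 * (degree[u] + degree[v]) for u, v in edges) / m
    sq = sum(0.5 * (degree[u] ** 2 + degree[v] ** 2) for u, v in edges) / m
    # Degree-preserving swaps leave mean and sq unchanged, so a change in
    # sum(j * k) translates into a change in r through this constant factor.
    scale = m * (sq - mean ** 2)
    if scale == 0:
        raise ValueError("r is undefined for this degree sequence")
    r = assortativity(edges, degree)
    for _ in range(n_trials):
        a, b = rng.sample(range(m), 2)
        i, j = edges[a]
        k, l = edges[b]
        if rng.random() < 0.5:
            k, l = l, k  # choose between the two alternative pairings
        # Proposed move: (i, j), (k, l) -> (i, l), (k, j).
        if len({i, j, k, l}) < 4:
            continue  # would create a self-loop or repeat an edge
        if frozenset((i, l)) in present or frozenset((k, j)) in present:
            continue  # would create a parallel edge
        delta_r = (degree[i] * degree[l] + degree[k] * degree[j]
                   - degree[i] * degree[j] - degree[k] * degree[l]) / scale
        log_ratio = -beta * delta_r
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            present.discard(frozenset((i, j)))
            present.discard(frozenset((k, l)))
            present.add(frozenset((i, l)))
            present.add(frozenset((k, j)))
            edges[a] = (i, l)
            edges[b] = (k, j)
            r += delta_r
    return r
```

Updating $r$ from the four degrees involved makes each trial cost constant time, instead of the time proportional to $M$ needed to recompute the coefficient from scratch.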
A similar procedure was also used to study the relation between degree correlations and other topological features such as the clustering coefficient [56] and percolation properties [57]. For a given inverse temperature $\beta$, the probability density of generating a network with assortativity coefficient $r$ follows the Boltzmann distribution [58, 59, 60],
$$ P_\beta(r) = \frac{\rho_N(r)\, e^{-\beta r}}{Z(\beta)}, \tag{7} $$
where $\rho_N(r)$ is the probability density function of $r$ that we want to obtain, and $Z(\beta)$ is the partition function (normalization factor) at the inverse temperature $\beta$. In practice, $P_\beta(r)$ can be obtained by performing MC simulations at $\beta$. To that end, we build a histogram $H_\beta(r)$ of the number of times out of $n$ that an interval $[r, r + dr]$ is observed, and thus we have
$$ P_\beta(r) = \frac{H_\beta(r)}{n}. \tag{8} $$
In simulations, the number of edge-swap trials is chosen according to the size $N$ of the underlying network, and only the last part of each run is used to fill the histogram of $r$. Using Eq. (8), Eq. (7) can be rewritten as
$$ \rho_N(r) = \frac{H_\beta(r)}{n}\, Z(\beta)\, e^{\beta r}. \tag{9} $$
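The MHR method takes advantage of the overlap between histograms collected at nearby temperatures. We perform a series of MC simulations in the canonical ensemble at different inverse temperatures $\beta_i$ with $i = 1, \ldots, K$, where the $\beta_i$ are chosen uniformly from a fixed interval. The improved estimate for $\rho_N(r)$ is given by [61]
$$ \rho_N(r) = \frac{\sum_{i=1}^{K} H_{\beta_i}(r)}{\sum_{j=1}^{K} n_j Z^{-1}(\beta_j)\, e^{-\beta_j r}}, \tag{10} $$
where $n_j$ is the number of samples in the $j$th histogram, and the partition functions $Z(\beta_k)$ can be found self-consistently by iterating the following equations,
$$ Z(\beta_k) = \sum_{r} \rho_N(r)\, e^{-\beta_k r} = \sum_{r} \frac{\sum_{i=1}^{K} H_{\beta_i}(r)}{\sum_{j=1}^{K} n_j Z^{-1}(\beta_j)\, e^{(\beta_k - \beta_j) r}}. \tag{11} $$
During the iterations of Eq. (11), we rescale the $Z$ values (dividing all of them by the smallest) after each step to avoid an overall growth.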
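The self-consistent iteration can be sketched in a few lines. This is a minimal illustration of the standard multiple-histogram equations, assuming histograms on a common set of bins and a sampling weight $e^{-\beta r}$; it works with logarithms to avoid overflow, and the function names and the convergence parameters `n_iter` and `tol` are assumptions made here, not taken from the paper.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def multiple_histogram_reweighting(histograms, betas, r_bins, n_iter=5000, tol=1e-13):
    """Combine histograms of r collected at several inverse temperatures.

    histograms[i][b] is the count in bin b (centre r_bins[b]) at betas[i];
    every histogram must be non-empty. Returns (log_rho, log_z): the log of
    the unnormalised density per bin and ln Z_i up to a common constant.
    """
    n_samples = [sum(h) for h in histograms]
    totals = [sum(h[b] for h in histograms) for b in range(len(r_bins))]

    def estimate_log_rho(log_z):
        out = []
        for total, r in zip(totals, r_bins):
            if total == 0:
                out.append(-math.inf)  # bin never visited
                continue
            denom = logsumexp([math.log(n) - lz - beta * r
                               for n, lz, beta in zip(n_samples, log_z, betas)])
            out.append(math.log(total) - denom)
        return out

    log_z = [0.0] * len(betas)
    for _ in range(n_iter):
        log_rho = estimate_log_rho(log_z)
        new_log_z = [logsumexp([lr - beta * r for lr, r in zip(log_rho, r_bins)
                                if lr > -math.inf])
                     for beta in betas]
        # Rescale by the smallest Z after each step to avoid an overall drift.
        shift = min(new_log_z)
        new_log_z = [z - shift for z in new_log_z]
        change = max(abs(x - y) for x, y in zip(new_log_z, log_z))
        log_z = new_log_z
        if change < tol:
            break
    return estimate_log_rho(log_z), log_z


def moment(log_rho, r_bins, order):
    """Moment of r of the given order under the normalised reweighted density."""
    top = max(log_rho)
    weights = [math.exp(lr - top) for lr in log_rho]
    return sum(w * r ** order for w, r in zip(weights, r_bins)) / sum(weights)
```

Because the density is only determined up to a constant factor, `moment` normalises explicitly; the variance follows as `moment(log_rho, r_bins, 2) - moment(log_rho, r_bins, 1) ** 2`.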
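Once $\rho_N(r)$ is obtained, we can compute the $m$th moment of the assortativity coefficient $r$,
$$ \langle r^m \rangle = \int r^m \rho_N(r)\, dr. \tag{12} $$
In particular, $\langle r \rangle$ is the expected value of $r$, and $\sigma_r^2 = \langle r^2 \rangle - \langle r \rangle^2$ is the variance of $r$.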
III Poisson random graphs
We first consider Poisson random graphs, whose degree distribution follows $P(k) = e^{-\langle k \rangle} \langle k \rangle^{k} / k!$ with average degree $\langle k \rangle$. In Fig. 2, we show the logarithm of $\rho_N(r)$ for several different $N$. Using the MHR method, very small probabilities in the tails of $\rho_N(r)$ are easily accessible. As $N$ increases, the distribution of $r$ becomes narrower. The typical value of $r$, i.e. the most probable value $r_N^*$ corresponding to the maximum of $\rho_N(r)$, is very close to zero. To investigate the size dependence in more detail, we have computed the expected value $\langle r \rangle$ and the variance $\sigma_r^2$ of $r$ as functions of $N$. We find that $\langle r \rangle$ is negative for all $N$ and decays in magnitude with $N$. As shown in Fig. 3(a), $-\langle r \rangle$ is well fitted by a straight line against $N$ in the log-log plot, $-\langle r \rangle \sim N^{-\nu}$, which defines the exponent $\nu$. In Fig. 3(b), we show that $\sigma_r^2$ decreases with $N$ as a power law as well, $\sigma_r^2 \sim N^{-\xi}$, with an exponent $\xi$ that is very close to one. This implies that the variance of $r$ on Poisson random graphs is inversely proportional to the system size $N$, in accordance with the central limit theorem.
Next, we want to check whether $\rho_N(r)$ obeys a large deviation principle. To that end, we first make a shift in $r$ such that the locations of the maximum of $\rho_N(r)$ coincide for all $N$. We then scale the logarithm of the $\rho_N$'s with $N^{\xi}$, given that $\rho_N(r)$ has a Gaussian form around $r_N^*$. Thus, we suggest the form $\rho_N(r - r_N^*) \asymp e^{-N^{\xi} I(r - r_N^*)}$, where $I$ is the large deviation rate function, which is convex and possesses its unique minimum at $r = r_N^*$. Finally, we shift $I$ so that $I = 0$ at $r = r_N^*$, as is commonly done. This suggestion is verified in Fig. 4, in which one can see that the curves for all $N$ coincide not only near $r_N^*$, but also far from $r_N^*$.
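The link between the large deviation form and the power-law decay of the variance can be made explicit with a short expansion. The following is a sketch under stated assumptions: $I$ is taken to be twice differentiable at its minimum with $I''(0) > 0$, and the symbol $x = r - r_N^*$ and the quadratic truncation are introduced here only for illustration.

```latex
\rho_N(r_N^* + x) \asymp e^{-N^{\xi} I(x)}, \qquad
I(0) = I'(0) = 0
\;\Longrightarrow\;
I(x) \approx \tfrac{1}{2} I''(0)\, x^{2} \quad (|x| \ll 1),
\\[4pt]
\rho_N(r_N^* + x) \approx \exp\!\left[ -\frac{x^{2}}{2 \sigma_r^{2}} \right],
\qquad
\sigma_r^{2} \approx \frac{1}{N^{\xi} I''(0)} \sim N^{-\xi}.
```

Under these assumptions the exponent that collapses the curves of $\ln \rho_N$ is the same one that governs the variance, and the curvature of the rate function at its minimum sets the prefactor.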
IV Scale-free networks
We now consider the case of scale-free networks, whose degree distribution follows a power law, $P(k) \sim k^{-\gamma}$ for $k \geq k_{\min}$, where $k_{\min}$ is the minimal degree and $\gamma$ is the degree distribution exponent. The maximal degree is set by the natural cutoff. In Fig. 5, we show the logarithm of $\rho_N(r)$ for three different values of $\gamma$ [panels (a)-(c), with $\gamma = 2.5$ in (b) and $\gamma = 3.0$ in (c)] and for five different $N$. It can easily be seen that $\rho_N(r)$ is unimodal in all cases. The expected values of $r$ are all negative, $\langle r \rangle < 0$. This is especially obvious for smaller $\gamma$. As $N$ increases, $\langle r \rangle$ moves gradually toward zero. In Fig. 6(a), we show that $-\langle r \rangle$ is well fitted by the form $-\langle r \rangle \sim N^{-\nu}$. The exponent $\nu$ depends on $\gamma$: it is 0.214 and 0.443 for $\gamma = 2.5$ and 3.0, respectively. The fluctuations of $r$, $\sigma_r^2$, obey a scaling law as well, $\sigma_r^2 \sim N^{-\xi}$, as shown in Fig. 6(b). The exponent $\xi$ decreases as $\gamma$ increases: it is 1.59 for the smallest $\gamma$ considered, and 1.28 and 0.99 for $\gamma = 2.5$ and 3.0, respectively. That is to say, highly heterogeneous networks exhibit anomalously small fluctuations in $r$, since $\xi > 1$ implies that the fluctuations decay with $N$ faster than the standard scaling.
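Scaling exponents of this kind are usually read off from a straight-line fit in the log-log plot. A minimal sketch of such a fit is given below, assuming ordinary least squares on the logarithms; the function name is illustrative and the data in the usage example are synthetic, generated from an exact power law, not taken from the paper.

```python
import math


def fit_power_law_exponent(sizes, values):
    """Least-squares fit of values ~ prefactor * sizes**(-exponent) in log-log scale.

    Returns (exponent, prefactor).
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(v) for v in values]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # A decaying power law has a negative slope, so the exponent is its negative.
    return -slope, math.exp(intercept)


# Synthetic variances that decay exactly as N**(-1.28).
sizes = [1000, 2000, 4000, 8000, 16000]
variances = [3.0 * n ** -1.28 for n in sizes]
exponent, prefactor = fit_power_law_exponent(sizes, variances)
print(exponent, prefactor)
```

With real simulation data the points scatter around the line, and the uncertainty of the fitted slope should be reported alongside the exponent.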
In Fig. 7, we show the large deviation functions for scale-free networks. As mentioned before, the large deviation functions are obtained from $I(r - r_N^*) = -N^{-\xi} \ln \rho_N(r - r_N^*)$. As expected, the data for different $N$ coincide.
V Configuration network model with soft constraints
Finally, we compare the scaling behavior of the assortativity coefficient between two different ensembles of the configuration model. The first one, studied above, is microcanonical, in which the degree sequence is fixed. The second one is the canonical ensemble, which is easier to handle mathematically and is known as the exponential random graph model in network science [58, 59, 60]. In the canonical ensemble, the hard constraints of the microcanonical ensemble are softened by enforcing the degrees only as expected values, i.e. $\langle k_i \rangle = k_i$ for $i = 1, \ldots, N$. The canonical probability of a graph $A$ is written as [58, 59, 60, 62, 63, 64]
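$$ P(A) = \frac{e^{-H(A)}}{Z}, \tag{13} $$
where $H(A)$ is the graph Hamiltonian defined as
$$ H(A) = \sum_{i} \theta_i k_i(A) = \sum_{i<j} (\theta_i + \theta_j) A_{ij}, \tag{14} $$
with $\theta_i$ the Lagrange multiplier conjugate to the degree of node $i$, and the normalizing quantity $Z$ is the partition function, which can be calculated exactly,
$$ Z = \sum_{A} e^{-H(A)} = \prod_{i<j} \left[ 1 + e^{-(\theta_i + \theta_j)} \right]. \tag{15} $$
Substituting Eq. (14) and Eq. (15) into Eq. (13), $P(A)$ can be written as the probability mass function of Bernoulli-distributed binary random variables (the entries of the adjacency matrix),
$$ P(A) = \prod_{i<j} p_{ij}^{A_{ij}} (1 - p_{ij})^{1 - A_{ij}}, \tag{16} $$
with success probability
$$ p_{ij} = \frac{x_i x_j}{1 + x_i x_j}, \tag{17} $$
where $x_i = e^{-\theta_i}$ is called the fugacity, which can be obtained numerically by solving the constraint equations,
$$ k_i = \sum_{j \neq i} \frac{x_i x_j}{1 + x_i x_j}, \quad i = 1, \ldots, N. \tag{18} $$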
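The canonical ensemble can be sampled once the fugacities are known. The sketch below assumes the standard soft-constraint form in which nodes $i$ and $j$ are linked independently with probability $x_i x_j / (1 + x_i x_j)$, and solves for the fugacities $x_i$ with a simple fixed-point iteration; the solver choice, the starting guess, and all function names are assumptions made here for illustration, and every target degree is assumed to be positive.

```python
import random


def solve_fugacities(target_degrees, n_iter=100000, tol=1e-12):
    """Find x_i such that the expected degree of node i equals target_degrees[i].

    Iterates x_i <- k_i / sum_{j != i} x_j / (1 + x_i * x_j) on all nodes at once.
    """
    n = len(target_degrees)
    total = sum(target_degrees)
    x = [k / total ** 0.5 for k in target_degrees]  # sparse-limit starting guess
    for _ in range(n_iter):
        new_x = []
        for i, k in enumerate(target_degrees):
            s = sum(x[j] / (1.0 + x[i] * x[j]) for j in range(n) if j != i)
            new_x.append(k / s)
        change = max(abs(a - b) for a, b in zip(new_x, x))
        x = new_x
        if change < tol:
            break
    return x


def connection_probability(x, i, j):
    """Probability that nodes i and j are linked in the canonical ensemble."""
    return x[i] * x[j] / (1.0 + x[i] * x[j])


def sample_canonical_graph(x, rng):
    """Draw one graph: every pair i < j is linked independently."""
    n = len(x)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < connection_probability(x, i, j)]


degrees = [3, 2, 2, 2, 1]
x = solve_fugacities(degrees)
print(sample_canonical_graph(x, random.Random(3)))
```

Each sampled graph has fluctuating degrees whose averages match the targets, which is the source of the extra variance expected in the canonical ensemble.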
In Fig. 8, we compare the results of the canonical scale-free model with those of the microcanonical scale-free model for three different values of $\gamma$. For each $\gamma$ and each $N$, we generate at least 5000 realizations of canonical configuration networks according to Eq. (17) to obtain the mean value and variance of $r$, in which the expected values of the node degrees are the same as the degree sequence in the microcanonical configuration networks. In the canonical model, one can see that both $-\langle r \rangle$ and $\sigma_r^2$ decay as power laws as $N$ increases. On the one hand, the values of $\langle r \rangle$ are almost independent of the specific ensemble and share the same scaling exponent $\nu$. On the other hand, the values of $\sigma_r^2$ in the canonical model are always larger than those in the microcanonical model. This is especially obvious for smaller values of $\gamma$. The result is as expected because in the canonical model the degree of each node fluctuates from one network realization to another. For $\gamma = 2.5$ and $\gamma = 3.0$, the scaling exponents $\xi$ are almost the same in the two ensembles. However, for the smallest $\gamma$, $\xi$ in the canonical model is smaller than the value 1.59 found in the microcanonical model.
In Fig. 9(a) and Fig. 9(b), we show the scaling exponents $\nu$ and $\xi$ as functions of $\gamma$, respectively. In both ensembles $\nu$ increases monotonically with $\gamma$; the two ensembles give almost the same $\nu$ over part of the range of $\gamma$, while $\nu$ in the canonical ensemble is slightly larger over the rest. However, $\xi$ shows two different trends with $\gamma$. For sufficiently large $\gamma$, $\xi$ is almost the same in the two ensembles, and it remains constant around one for the largest values of $\gamma$. For small $\gamma$, $\xi$ in the microcanonical model is clearly larger than in the canonical model. From Fig. 9(b), one can see that for large enough $\gamma$ the scale-free networks start to share the same scaling exponents as the Poisson random graphs. Intuitively, this seems to be related to the divergence of the second moment of the degree distribution on scale-free networks with $\gamma < 3$. It may be possible to establish this connection in exponential random graph models, since the canonical ensemble is easier to handle mathematically. We note that in a recent paper [64], the authors used the two-star model [65] to study degree correlations between nearest and next-nearest neighboring nodes. They analytically calculated the degree assortativities and showed that they are nonmonotonic functions of the model parameters, with a discontinuous behavior at a first-order transition. However, that work did not consider broad degree distributions such as the power-law form, which is a property of many empirical networks. Therefore, this is still a challenging problem.
VI Conclusions
In summary, we have used the MHR method to obtain the probability distribution $\rho_N(r)$ of the assortativity coefficient $r$ on configuration networks. This method enables us to obtain the rare-probability tails of $\rho_N(r)$ within the allowable computational time. We show that $\rho_N(r)$ satisfies a large deviation principle after a shift in $r$, $\rho_N(r - r_N^*) \asymp e^{-N^{\xi} I(r - r_N^*)}$, in which $I$ is the large deviation rate function, which is convex and possesses its unique minimum at $r = r_N^*$. We find that $\xi = 1$ in Poisson random graphs and scale-free networks with $\gamma \geq 3$, indicating a normal fluctuation scaling of $r$ with $N$ in such networks, $\sigma_r^2 \sim N^{-1}$. Interestingly, for $\gamma < 3$, $\xi > 1$, showing an anomalously fast decay in the fluctuation of $r$ as $N$ increases. Such an anomalous phenomenon in time-consuming observables has also been found in some other systems [66, 67, 68, 69, 70]. Furthermore, we show that $\xi$ in the canonical ensemble is slightly greater than one for $\gamma < 3$ but is clearly less than that in the microcanonical model. This suggests that the anomaly in the fluctuations of $r$ is not very significant in the canonical ensemble.
In the future, it is worth investigating the joint distribution of the assortativity coefficient and other topological observables, such as the average shortest path length, the largest eigenvalue of the adjacency matrix, or the second smallest eigenvalue of the Laplacian matrix, using the MHR method. This should deepen the understanding of the role of degree assortativity in dynamical processes on configuration networks [16, 17, 18, 19, 20, 21].
Recently, we have noticed that large deviation theory has been used to uncover atypical structural and dynamical characteristics of complex networks, such as a first-order percolation transition subject to rare initial damage [71, 72], a first-order phase transition in the condensation of node degrees [73], localization transitions [74, 75, 76] and optimal paths [77] of dynamical observables in random walk models, epidemic extinction [78, 79, 80], and spin models [80, 81]. We believe that large deviation theory and related rare-event simulation methods may inspire more research in network science.
Acknowledgements.
We acknowledge support from the National Natural Science Foundation of China (Grant Nos. 11875069, 11975025, 12011530158, 61973001) and the Key Scientific Research Fund of Anhui Provincial Education Department (Grant No. KJ2019A0781).
References
- [1] M. E. J. Newman, Networks: An Introduction (Oxford University Press, 2010).
- [2] S. N. Dorogovtsev, A. V. Goltsev, and J. F. F. Mendes, Rev. Mod. Phys. 80, 1275 (2008).
- [3] A. Arenas, A. Díaz-Guilera, J. Kurths, Y. Moreno, and C. Zhou, Phys. Rep. 469, 93 (2008).
- [4] F. A. Rodrigues, T. K. Peron, P. Ji, and J. Kurths, Phys. Rep. 610, 1 (2016).
- [5] R. Pastor-Satorras, C. Castellano, P. Van Mieghem, and A. Vespignani, Rev. Mod. Phys. 87, 925 (2015).
- [6] C. Castellano, S. Fortunato, and V. Loreto, Rev. Mod. Phys. 81, 591 (2009).
- [7] M. Perc, J. J. Jordan, D. G. Rand, Z. Wang, S. Boccaletti, and A. Szolnoki, Phys. Rep. 687, 1 (2017).
- [8] S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, and D.-U. Hwang, Phys. Rep. 424, 175 (2006).
