An ensemble of random graphs with identical degree distribution
Fei Ma, Xiaoming Wang, Ping Wang

TL;DR
This paper demonstrates that networks with identical degree distributions can have vastly different topological structures and explores the diversity and phase transitions within ensembles of such graphs, revealing their complex properties.
Contribution
It introduces an ensemble of random graphs with a fixed power-law degree distribution and analyzes their structural diversity, phase transitions, and bounds on spanning trees.
Findings
Networks with the same degree distribution can have different topologies.
The ensemble size appears large in the thermodynamic limit.
Identifies phase transitions in graph properties over time.
Fei Ma (a), Xiaomin Wang (a) and Ping Wang (b, c, d; corresponding author)
a School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China
b School of Software and Microelectronics, Peking University, Beijing 102600, China
c National Engineering Research Center for Software Engineering, Peking University, Beijing, China
d Key Laboratory of High Confidence Software Technologies (PKU), Ministry of Education, Beijing, China
Abstract: The degree distribution, or equivalently the degree sequence, is one of the most widely used measures in the study of complex networks, and many well-known results have been obtained with it. By contrast, in this paper we report that two arbitrarily chosen networks with identical degree distribution can differ completely in other topological properties, such as diameter, number of spanning trees, Pearson correlation coefficient, and so forth. Moreover, for a given degree distribution (here, a power-law distribution $P(k)\sim k^{-\gamma}$ with exponent $\gamma=3$), it is reasonable to ask how many network models satisfy such a constraint. To this end, we generate an ensemble of random graphs of this kind, denoted as graph space $\mathcal{N}(p,q,t)$, where the probability parameters $p$ and $q$ satisfy $p+q=1$, and show indirectly, by varying the values of $p$ and $q$, that the cardinality of $\mathcal{N}(p,q,t)$ appears to be large in the thermodynamic limit, i.e., as the graph size tends to infinity. From the theoretical point of view, given an ultrasmall constant $\epsilon$, perhaps only graph model $N(1,0,t)$ is small-world in terms of diameter, while the others are not. We then study the number of spanning trees of two deterministic graph models and obtain both an upper bound and a lower bound for the other members. Meanwhile, for arbitrary $p$ ($\neq 1$), we prove that graph model $N(p,q,t)$ goes through two phase transitions over time: it starts in a non-assortative pattern, suddenly enters the disassortative region, and then gradually converges back to its initial place (the non-assortative point). Among these models, one "null" graph model is built.
Keywords: Random graphs, Degree distribution, Small-world, Spanning trees number, Assortative mixing.
1 INTRODUCTION
The last few decades have seen increasing interest in the study of complex networks across a variety of disciplines, including statistical physics, applied mathematics, theoretical computer science, biology, chemistry, and so forth. In general, a network is a collection of discrete items, named vertices (or nodes), connected by edges (or lines). Widely studied examples include the World Wide Web (WWW), the Internet, sexual contact networks, scientific collaboration networks, friendship networks, protein interaction networks, and metabolic networks [2]-[7]. Unlike the classic random graph models with Poisson degree distribution introduced by Erdős and Rényi [8], these real-life networks appear in some contexts to follow a highly skewed degree distribution, commonly known as a power-law distribution of the form $P(k)\sim k^{-\gamma}$. This type of distribution suggests that a small number of vertices possess a large fraction of the connections in a network, while most vertices have only a few.
Since then, the degree distribution (degree sequence) has been regarded as a simple yet useful measure for deciding whether a given network model is power-law or not. On the other hand, for a given degree distribution (degree sequence), how to rebuild at least one corresponding network model is of considerable interest and has attracted growing attention. Generally speaking, it is not easy to produce a good model that meets a given degree sequence, and some helpful techniques have therefore been developed. Among them, a popular method is the generating function $G(x)$, whose advantages include direct access to (1) the average degree, (2) the probability $P(k)$, and (3) the higher moments, and so on [9]. One should nevertheless hesitate to describe complicated networks using the degree distribution alone. Put another way, network models with identical degree distribution are likely to differ from one another in topological structure. To make this assertion concrete, we build a graph space $\mathcal{N}(p,q,t)$, where the probability parameters $p$ and $q$ satisfy $p+q=1$, in which every member obeys the same degree distribution $P(k)\sim k^{-\gamma}$ with $\gamma=3$. As we will show shortly, there are pronounced differences among the members of graph space $\mathcal{N}(p,q,t)$. For example, one can transform a small-world member into one of much larger diameter simply by tuning the probability parameter $p$. At each time step $t$, graph model $N(1,0,t)$ always has Pearson correlation coefficient $r=0$, whereas the other members of graph space $\mathcal{N}(p,q,t)$ approach this value only in the thermodynamic limit. Different topological structures may lead to different structural features. Among the various features, we focus here on the number of spanning trees and obtain an inequality relating the numbers of spanning trees of all members of graph space $\mathcal{N}(p,q,t)$.
As stated in Ref. [10], the Pearson correlation coefficient $r$ of a network must lie in the range between $-1$ and $1$: a positive value indicates that the network has assortative mixing, a negative value indicates a disassortatively connected network, and the value $0$ indicates the intermediate (non-assortative) case. However, over the finite number of time steps considered, the "null" graph model $N(1,0,t)$ seems to contradict this reading: although vertices of large degree are clearly connected to many vertices of small degree, its Pearson correlation coefficient remains an unchanged constant, namely $r=0$. Our finding may help one to understand more deeply the relation between the Pearson correlation coefficient and network structure properties.
This paper is organized as follows. In Section 2, in order to build our graph space $\mathcal{N}(p,q,t)$, each member of which has the designed degree distribution $P(k)\sim k^{-\gamma}$ with $\gamma=3$, we introduce two types of operations, the type-A operation and the type-B operation. Section 3 consists of three subsections, which discuss the diameter, the number of spanning trees and the Pearson correlation coefficient, respectively. In the last section, we close the paper with a conclusion and some discussion of future work on our graph space $\mathcal{N}(p,q,t)$.
2 OPERATIONS AND CONSTRUCTION
In this section, we generate a class of random graphs with identical degree distribution, which together span a graph space denoted by $\mathcal{N}(p,q,t)$. Here, the probability parameters $p$ and $q$ satisfy $p+q=1$, and $t$ represents the time step. Put another way, any two members chosen at random from graph space $\mathcal{N}(p,q,t)$ must have the same degree sequence. To do this, we introduce two well-studied operations, called in this paper the type-A operation and the type-B operation, which are described in more detail as follows.
Type-A operation For a given edge $uv$ with two vertices $u$ and $v$, bringing in a new edge $xy$ on vertices $x$ and $y$ and then connecting vertex $u$ with $x$ and $v$ with $y$ using two new edges produces a cycle $C_4$. This process is the type-A operation, also defined as the rectangle method in our previous work [11]; see Fig.1(a).
Type-B operation For a given active edge $uv$ with two vertices $u$ and $v$, bringing in two new vertices $x$ and $y$, connecting vertex $x$ to both endpoints of edge $uv$ by two new edges, connecting vertex $y$ to the vertex pair $u$ and $v$ in the same way, and deleting the active edge $uv$ together produce a cycle $C_4$. This process is the type-B operation, also commonly called the fractal method or the diamond method; see Fig.1(b).
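The two operations can be made concrete with a minimal Python sketch. The function names `type_a` and `type_b`, the string vertex labels and the edge-list representation are illustrative assumptions, not notation taken from the paper. The sketch only shows that each operation turns one edge into a 4-cycle on the same four vertices, with every vertex ending up at degree 2.

```python
from collections import Counter

def type_a(u, v, x, y):
    """Rectangle method: keep edge uv and add the path u - x - y - v."""
    return [(u, v), (u, x), (x, y), (y, v)]

def type_b(u, v, x, y):
    """Diamond method: delete edge uv and add the paths u - x - v and u - y - v."""
    return [(u, x), (x, v), (u, y), (y, v)]

def degrees(edges):
    """Count how many edges touch each vertex."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg

rectangle = type_a("u", "v", "x", "y")
diamond = type_b("u", "v", "x", "y")
print(degrees(rectangle))
print(degrees(diamond))
```

In both cases the endpoints $u$ and $v$ gain exactly one unit of degree relative to the single edge $uv$, and the two new vertices have degree 2, even though the two edge sets differ.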
From the theoretical point of view, it is not really necessary to distinguish the two cycles obtained from the operations introduced above. Nonetheless, in this paper we refer to the former cycle as a rectangle and the latter as a diamond, solely because this makes our presentation more convenient. As we will show shortly, the structural isomorphism between them does not guarantee that they function identically. Keep this in mind in what follows.
There are two popular ways in which most existing graph models are built. One is first to construct graphs using rules that do not involve the degree sequence (degree distribution) and then to study their properties, including the degree sequence. Well-known examples include the scale-free BA model [3], the small-world WS model [2], Apollonian networks [12], and Sierpinski networks [13]. The other is, for a given degree sequence, to construct a graph consistent with that degree sequence. The two processes may be thought of as inverse to each other, and the latter is in theory much more difficult than the former. So far, some useful algorithms and methods have been provided for this purpose, including the generating function [10] and algorithmic approaches [14]. We construct our graph space $\mathcal{N}(p,q,t)$ with identical degree distribution in the latter manner, based on a given degree sequence. The degree of a vertex is the number of vertices to which it is connected, denoted by $k$.
As observed in most natural and man-made complex networks, the small-world property and the scale-free feature are highly prevalent. In order to portray these two characteristics and to study the functions occurring over networks of this type, a large number of models have been generated. For our purpose, this paper focuses mainly on a graph space $\mathcal{N}(p,q,t)$, each member of which follows a given power-law degree distribution of the form
$$P(k)\sim k^{-\gamma},$$
where $P(k)$ represents the probability that a vertex chosen at random from the entire network has degree equal to $k$. Taking into account the work of Barabási and Albert based on mean-field theory, we make the equivalent assumption that the power-law exponent $\gamma$ is equal to $3$. In general, it is not easy to reconstruct a graph obeying a power-law degree distribution with a given value of $\gamma$. A widely used technique is the generating function
$$G(x)=\sum_{k=0}^{\infty}P(k)\,x^{k}.$$
A growing body of research concerns constructing graphs by means of generating functions; see [10].
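For reference, the standard properties that make a degree generating function convenient can be written out as follows. This is a textbook summary under the usual normalization assumption $\sum_k P(k)=1$; it is not a reproduction of the paper's own equations.

```latex
\begin{aligned}
G(x) &= \sum_{k=0}^{\infty} P(k)\,x^{k}, \qquad G(1)=\sum_{k}P(k)=1,\\
P(k) &= \frac{1}{k!}\,\frac{d^{k}G}{dx^{k}}\bigg|_{x=0},\\
\langle k\rangle &= \sum_{k}k\,P(k)=G'(1),\\
\langle k^{n}\rangle &= \sum_{k}k^{n}P(k)=\left[\left(x\frac{d}{dx}\right)^{n}G(x)\right]_{x=1}.
\end{aligned}
```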
Taking advantage of the two types of operations and the generating function, let us now turn to building the graph space $\mathcal{N}(p,q,t)$.
CONSTRUCTION
First of all, the seminal graph is a cycle $C_4$. To introduce a variety of graphs into graph space $\mathcal{N}(p,q,t)$, we employ a randomization method, namely a pair of probability parameters $p$ and $q$ satisfying $p+q=1$. Thus, in order to obtain the next graph $N(p,q,t)$ from $N(p,q,t-1)$, one just needs to apply the type-A operation to each edge of graph $N(p,q,t-1)$ with probability $p$, or the type-B operation with the complementary probability $q=1-p$, as shown in Fig.2. As studied in the previous literature [15], after $t$ time steps our graph space $\mathcal{N}(p,q,t)$ consists of a single element at $p=1$ or $p=0$ and becomes a deterministic graph, $N(1,0,t)$ or $N(0,1,t)$, some of whose topological properties have been discussed in detail. To make this paper more self-contained, we simply list a few common properties shared by the two deterministic graphs $N(1,0,t)$ and $N(0,1,t)$. Both graphs have the same number of vertices (order) and the same number of edges (size). They are also sparse, with a small average degree close to $3$. From the evolution process described above, it is easy to see that they follow the same degree distribution with constant exponent $\gamma=3$, as we assumed previously. It is worth noting that both sparsity and a power-law degree distribution are generally found in a variety of complex networks around us. If one is concerned only with these two features, then graphs $N(1,0,t)$ and $N(0,1,t)$ appear to be reasonable candidates. On the other hand, as shown in recent research, most networks have a high degree of transitivity or clustering, i.e., there is a high probability that two of your friends are also friends of one another. Unfortunately, graph models $N(1,0,t)$ and $N(0,1,t)$ have no clustering because they contain no triangles. In order to better mimic real-life complex networks, graph families with tunable clustering are discussed in another paper of ours [16]. Here we aim at studying graphs with identical degree distribution and some of their applications in topological terms.
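The randomized growth rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the seed is taken to be the 4-cycle, the seed counts as step 0 (the paper's own time indexing may differ), and the names `grow` and `build` are hypothetical. Because either operation turns one edge into four and adds two vertices, after $k$ steps the sketch has $4^{k+1}$ edges and $(2\cdot 4^{k+1}+4)/3$ vertices whatever the value of $p$.

```python
import random

def grow(edges, n, p, rng):
    """Apply one time step to every edge; n is the current number of vertices."""
    out = []
    for u, v in edges:
        x, y = n, n + 1          # two fresh vertex labels
        n += 2
        if rng.random() < p:     # type-A: keep uv, add the path u - x - y - v
            out += [(u, v), (u, x), (x, y), (y, v)]
        else:                    # type-B: drop uv, add the paths u - x - v and u - y - v
            out += [(u, x), (x, v), (u, y), (y, v)]
    return out, n

def build(p, steps, seed=0):
    """Grow a graph from the assumed seed cycle C_4 for `steps` time steps."""
    rng = random.Random(seed)
    edges, n = [(0, 1), (1, 2), (2, 3), (3, 0)], 4
    for _ in range(steps):
        edges, n = grow(edges, n, p, rng)
    return edges, n

for p in (0.0, 0.3, 1.0):
    edges, n = build(p, steps=3)
    # columns: p, number of vertices, number of edges, average degree
    print(p, n, len(edges), 2 * len(edges) / n)
```

The average degree $2|E|/|V|$ printed in the last column approaches 3 as the number of steps grows, in line with the sparsity noted above.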
More generally, each member of our graph space $\mathcal{N}(p,q,t)$ obeys the same power-law degree distribution as the two preceding deterministic models $N(1,0,t)$ and $N(0,1,t)$. One of the most important reasons is that the graph produced by the type-A operation has the same topological structure as that produced by the type-B operation, i.e., in both cases the result is a cycle $C_4$. Clearly, both operations change the degrees of the two endpoints of the edge in the same way and give the two new vertices degree $2$, so the degree sequence does not depend on which operation is chosen. Hence the probability parameter $p$ has no influence on the degree distribution of any member of graph space $\mathcal{N}(p,q,t)$, but it does influence the other topological structure parameters of each member, as we will show shortly. This is why we build graph space $\mathcal{N}(p,q,t)$: although its members have the same degree distribution, they differ from each other in other topological terms. Moreover, for a given probability parameter $p$ and a given time step $t$, it is very likely that members $N(p,q,t)$ of various structures can be constructed. What we present here will be a significant ingredient of the following sections.
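The independence of the degree sequence from $p$ can be checked numerically. The sketch below is illustrative only: it assumes a 4-cycle seed counted as step 0, and the helper names `build` and `degree_histogram` are hypothetical. It compares the degree histograms obtained for three values of $p$.

```python
import random
from collections import Counter

def build(p, steps, seed=0):
    """Grow from the assumed seed C_4, choosing type-A with probability p per edge."""
    rng = random.Random(seed)
    edges, n = [(0, 1), (1, 2), (2, 3), (3, 0)], 4
    for _ in range(steps):
        out = []
        for u, v in edges:
            x, y = n, n + 1
            n += 2
            if rng.random() < p:   # type-A (rectangle)
                out += [(u, v), (u, x), (x, y), (y, v)]
            else:                  # type-B (diamond)
                out += [(u, x), (x, v), (u, y), (y, v)]
        edges = out
    return edges

def degree_histogram(edges):
    """Map each degree value to the number of vertices having that degree."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return Counter(deg.values())

for p in (0.0, 0.5, 1.0):
    print(p, sorted(degree_histogram(build(p, steps=2)).items()))
```

Every vertex doubles its degree at each step and every new vertex enters with degree 2, so the histogram is the same for all three values of $p$: the number of vertices grows by a factor of 4 each time the degree halves, which is the discrete signature of the exponent $\gamma=3$.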
3 PROPERTIES AND DISCUSSIONS
Over the last few decades, a growing body of literature has sought to unveil the intrinsic characteristics of complex networks and to understand the many functions of interest taking place over networks of this type. Among these findings, the scale-free feature and the small-world property are two of the most prominent. As mentioned in Section 2, each member of graph space $\mathcal{N}(p,q,t)$ follows the same power-law degree distribution and displays the scale-free feature associated with the two vital mechanisms introduced by Barabási and Albert. The small-world property, however, describes another apparent phenomenon: in most cases, there seem to be many short chains of connections linking two persons chosen at random from the world we live in. Nowadays, this interesting fact is known as the six degrees of separation theory and is widely accepted. Mathematically, in a connected complex network modeling real-life relationships among people in society, one can find a few short connections from a source vertex to an arbitrary vertex. The length of the shortest such connection between a pair of vertices is usually defined as their distance. From the graph theory point of view, the diameter is the maximum of the distances between any two vertices in a connected, undirected graph. The diameter is itself a feature of the graph topology and can be used as a simple measure of information delay over a network. In information science in particular, the diameter indicates the transmission efficiency of information over the whole network: usually, the greater the diameter, the poorer the transmission efficiency. Therefore, let us first study the diameter both analytically and experimentally.
DIAMETER
Before proceeding, let us focus on the two simplest cases, namely deriving exact expressions for the diameters of the two deterministic graph models $N(1,0,t)$ and $N(0,1,t)$.
Consider first the generation of $N(1,0,t)$ from $N(1,0,t-1)$, in which only the type-A operation is applied to each edge of $N(1,0,t-1)$. The diameter of graph model $N(1,0,t)$ must then be built on the diameter of graph model $N(1,0,t-1)$. By the definition of diameter, take from graph model $N(1,0,t-1)$ a path whose length is precisely equal to its diameter. Applying the type-A operation to each edge of this path yields a longer path in $N(1,0,t)$, and similarly for paths of other lengths in graph model $N(1,0,t-1)$. We may therefore write a recursive relationship between the diameters at time steps $t$ and $t-1$. With the initial condition, one immediately obtains a closed-form solution for the diameter of $N(1,0,t)$, which is then related to the vertex number. In other words, in the limit of large graph size, the diameter of $N(1,0,t)$ is much smaller than its order and scales only logarithmically with it. This phenomenon is found in a great number of real-life networks, indicating that graph model $N(1,0,t)$ resembles such networks and has the small-world feature.
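The logarithmic scaling can be worked out explicitly under one labeled assumption: the seed is the 4-cycle, counted as step $k=0$ (the paper's own time index and initial condition may be shifted relative to this). Write $D_k$ for the diameter and $|V_k|$ for the number of vertices after $k$ rounds of type-A operations. The type-A operation keeps every old edge, so old distances are unchanged, and every new vertex is adjacent to an old one, so the diameter grows by exactly 2 per round.

```latex
\begin{aligned}
D_{k} &= D_{k-1}+2,\qquad D_{0}=2
  \;\;\Longrightarrow\;\; D_{k}=2(k+1),\\
|V_{k}| &= \frac{2\cdot 4^{\,k+1}+4}{3}
  \;\;\Longrightarrow\;\; k+1\simeq \log_{4}\!\Big(\tfrac{3}{2}\,|V_{k}|\Big),\\
D_{k} &\simeq 2\log_{4}\!\Big(\tfrac{3}{2}\,|V_{k}|\Big)
  = \log_{2}|V_{k}|+\log_{2}\tfrac{3}{2}
  \;\sim\; \frac{\ln |V_{k}|}{\ln 2}.
\end{aligned}
```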
By analogy with the development for $N(1,0,t)$, one can write a recursive relation between the diameter $D(0,1,t)$ of graph model $N(0,1,t)$ and the diameter $D(0,1,t-1)$ of $N(0,1,t-1)$ as
$$D(0,1,t)=2\,D(0,1,t-1).\qquad(1)$$
In contrast to the type-A operation, the type-B operation replaces every edge by a path of length $2$, which accounts for the prefactor $2$ on the right-hand side of Eq.(1). Plugging the initial value into Eq.(1) yields a solution for the diameter $D(0,1,t)$ that grows exponentially with $t$. Unlike the case of graph model $N(1,0,t)$, the diameter $D(0,1,t)$ is approximately equal not to the logarithm of the vertex number of graph model $N(0,1,t)$ but to its square root, directly indicating that $N(0,1,t)$ is large-scale rather than small-world.
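Both growth laws can be checked by brute force on small instances. The following sketch assumes a 4-cycle seed counted as step 0 and uses hypothetical helper names (`grow_all`, `diameter`, `diameters`); it computes the diameter by breadth-first search from every vertex, which is only practical for a few steps.

```python
from collections import deque

C4 = [(0, 1), (1, 2), (2, 3), (3, 0)]  # assumed seed graph

def grow_all(edges, n, op):
    """Apply the same operation ("A" or "B") to every edge; n is the vertex count."""
    out = []
    for u, v in edges:
        x, y = n, n + 1
        n += 2
        if op == "A":   # rectangle: keep uv, add the path u - x - y - v
            out += [(u, v), (u, x), (x, y), (y, v)]
        else:           # diamond: drop uv, add the paths u - x - v and u - y - v
            out += [(u, x), (x, v), (u, y), (y, v)]
    return out, n

def diameter(edges, n):
    """Largest shortest-path distance, via one BFS per vertex (connected graph)."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    best = 0
    for s in range(n):
        dist = [-1] * n
        dist[s] = 0
        queue = deque([s])
        while queue:
            a = queue.popleft()
            for b in adj[a]:
                if dist[b] < 0:
                    dist[b] = dist[a] + 1
                    queue.append(b)
        best = max(best, max(dist))
    return best

def diameters(op, steps):
    """Diameters of the seed and of the graphs after 1..steps rounds of `op`."""
    edges, n = C4, 4
    out = [diameter(edges, n)]
    for _ in range(steps):
        edges, n = grow_all(edges, n, op)
        out.append(diameter(edges, n))
    return out

print("type-A:", diameters("A", 3))
print("type-B:", diameters("B", 3))
```

Under these assumptions the type-A sequence increases by 2 per step, while the type-B sequence doubles, which is the linear-versus-exponential contrast in $t$ described above.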
So far we have studied only two specific cases, and the diameters of the other members of graph space $\mathcal{N}(p,q,t)$ remain to be addressed. The main difficulty is that the probability parameter $p$ enters the generation process of the graph. Even for given parameters $p$ and $t$, there may be many generated graphs that share the degree distribution but differ in other topological structure parameters, including the diameter. We therefore simulate on a computer the dynamical tendency of the diameter over time. Because of limits on memory and running time, we impose a reasonable constraint on the simulation parameters. The resulting diagram is illustrated in Fig.3.
As is clear from Fig.3, the diameter averaged over graph space $\mathcal{N}(p,q,t)$ as a whole increases sharply with decreasing value of $p$. In order to see clearly the influence of the probability parameter $p$ on the diameter, we select three specific values of $p$ and run our algorithm a fixed number of times for each value; see Fig.4. With the help of the three diagrams, we attempt to obtain an approximate analytical solution for the diameter. Unlike the previous calculations for the two deterministic graph models, we need to consider three contributions to the change of diameter when graph model $N(p,q,t)$ is obtained from $N(p,q,t-1)$ by applying both type-A and type-B operations. We first choose at random a path in $N(p,q,t-1)$ whose length is equal to the diameter. Such a choice is reasonable owing to our assumption that the type-A or type-B operation is applied to each edge independently. Three cases are distinguished according to how two particular edges of this path are treated. Case 1. The type-A operation is applied to both edges simultaneously. Case 2. The type-A operation is applied to only one of the two edges. Case 3. The type-A operation is applied to neither of the two edges. On the one hand, if case 3 is encountered at every time step, we can obtain one recursive equation for the diameter. On the other hand, under the situation of case 1, we may derive another expression for the diameter. Either of the two approximate solutions indicates that the diameter grows exponentially with increasing time $t$. To show this, we provide two diagrams in Appendix A.
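A simulation of this kind can be sketched as follows. The values of $p$, the number of steps and the number of runs are arbitrary choices for illustration, not the settings used for Fig.3 or Fig.4; the 4-cycle seed counted as step 0 and the helper names `build`, `diameter` and `mean_diameter` are likewise assumptions.

```python
import random
from collections import deque
from statistics import mean

def build(p, steps, rng):
    """Grow from the assumed seed C_4; each edge gets type-A with probability p."""
    edges, n = [(0, 1), (1, 2), (2, 3), (3, 0)], 4
    for _ in range(steps):
        out = []
        for u, v in edges:
            x, y = n, n + 1
            n += 2
            if rng.random() < p:   # type-A (rectangle)
                out += [(u, v), (u, x), (x, y), (y, v)]
            else:                  # type-B (diamond)
                out += [(u, x), (x, v), (u, y), (y, v)]
        edges = out
    return edges, n

def diameter(edges, n):
    """Largest shortest-path distance, via one BFS per vertex."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    best = 0
    for s in range(n):
        dist = [-1] * n
        dist[s] = 0
        queue = deque([s])
        while queue:
            a = queue.popleft()
            for b in adj[a]:
                if dist[b] < 0:
                    dist[b] = dist[a] + 1
                    queue.append(b)
        best = max(best, max(dist))
    return best

def mean_diameter(p, steps, runs, seed=0):
    """Average diameter over `runs` independent realizations."""
    rng = random.Random(seed)
    return mean(diameter(*build(p, steps, rng)) for _ in range(runs))

for p in (1.0, 0.75, 0.5, 0.25, 0.0):
    print(p, mean_diameter(p, steps=3, runs=20))
```

Only the two endpoints $p=1$ and $p=0$ are deterministic; for intermediate $p$ the printed averages depend on the random seed and on the number of runs.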
Compared with the two extreme cases $p=1$ and $p=0$, for an arbitrary probability parameter $p$ satisfying $0<p<1$, the diameter of $N(p,q,t)$ appears to lie between those of $N(1,0,t)$ and $N(0,1,t)$. It is, however, too early to draw such a conclusion, since a counterexample is not hard to find; refer to Appendix A for details. Nevertheless, under some additional conditions, such as the limit of large graph size, we do arrive at the following proposition on both an upper bound and a lower bound for the diameter of each member of graph space $\mathcal{N}(p,q,t)$.
Proposition 1 For any member $N(p,q,t)$ of network space $\mathcal{N}(p,q,t)$ beyond a sufficiently large threshold value, its diameter satisfies the following inequality in the limit of large graph size
[TABLE]
From Eq.(2), it is worth noting that, under the condition of Proposition 1, whenever two distinct probability parameters satisfy a strict inequality, the ratio of the corresponding diameters in the thermodynamic limit will be either infinite or zero by a standard limit, indicating that a small change in the parameter $p$ makes two graph models considerably different, at least in terms of diameter.
In a word, for an arbitrary member of network space $\mathcal{N}(p,q,t)$, the type-B operation changes the diameter more severely than the type-A operation, as shown both analytically and experimentally. By varying the value of $p$ from $1$ to $0$, the small-world graph model $N(1,0,t)$ is abruptly transformed into another type of graph whose diameter is much larger. As said in Section 2, the degree distribution cannot distinguish the members of graph space $\mathcal{N}(p,q,t)$. By our discussion here, the diameter seems to be a suitable measure for doing so. In addition, there are other helpful indices for understanding the differences among graphs with identical degree distribution, and the choice among them is a matter of convenience. The remainder of this section introduces two further topological structure parameters, the number of spanning trees and the Pearson correlation coefficient.
SPANNING TREES
The number of spanning trees is an important structural invariant relevant to several kinds of dynamical functions on networks, for instance reliability [17], synchronization capability [18], and random walks [19]-[21], to name just a few. Hence, much attention has been paid in the past to enumerating the spanning trees of special network models. As before, let us turn to the two particular cases $N(1,0,t)$ and $N(0,1,t)$.
According to our previous results in Ref. [22], for the first of these deterministic graph models, we can write
[TABLE]
where the two quantities are the number of spanning trees and the number of 2-(u; v)-forests, respectively; refer to Ref. [22].
With the help of some simple arithmetic, one can obtain a closed-form solution for the number of spanning trees as follows
[TABLE]
Analogously, a group of equations for the second deterministic graph model reads
[TABLE]
Under the corresponding initial condition, an exact expression for its number of spanning trees is
[TABLE]
Evidently, graph models $N(1,0,t)$ and $N(0,1,t)$ have the same degree distribution but possess distinct numbers of spanning trees. In such a situation, the degree sequence cannot be used as a reliable index to distinguish them, whereas the number of spanning trees seems to be a better replacement.
There appears to be a connection between the two numbers of spanning trees, because their underlying graphs have identical degree distribution. Indeed, based on Eqs.(4) and (6), one can easily obtain the following equation
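For small instances, the two spanning-tree counts can be compared directly with Kirchhoff's matrix-tree theorem, which states that the number of spanning trees equals any cofactor of the graph Laplacian. The sketch below is a brute-force check and not the recursive method of Ref. [22]; the 4-cycle seed and the helper names `grow_all` and `spanning_trees` are assumptions made for illustration. Exact rational arithmetic avoids rounding errors in the determinant.

```python
from fractions import Fraction

C4 = [(0, 1), (1, 2), (2, 3), (3, 0)]  # assumed seed graph

def grow_all(edges, n, op):
    """Apply the same operation ("A" or "B") to every edge; n is the vertex count."""
    out = []
    for u, v in edges:
        x, y = n, n + 1
        n += 2
        if op == "A":   # rectangle: keep uv, add the path u - x - y - v
            out += [(u, v), (u, x), (x, y), (y, v)]
        else:           # diamond: drop uv, add the paths u - x - v and u - y - v
            out += [(u, x), (x, v), (u, y), (y, v)]
    return out, n

def spanning_trees(edges, n):
    """Determinant of the Laplacian with its last row and column removed."""
    lap = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        lap[u][u] += 1
        lap[v][v] += 1
        lap[u][v] -= 1
        lap[v][u] -= 1
    m = n - 1
    mat = [row[:m] for row in lap[:m]]
    det = Fraction(1)
    for c in range(m):
        # Gaussian elimination with exact fractions; swap rows if the pivot is zero
        pivot = next((r for r in range(c, m) if mat[r][c] != 0), None)
        if pivot is None:
            return 0      # singular minor: the graph is disconnected
        if pivot != c:
            mat[c], mat[pivot] = mat[pivot], mat[c]
            det = -det
        det *= mat[c][c]
        for r in range(c + 1, m):
            factor = mat[r][c] / mat[c][c]
            if factor:
                for k in range(c, m):
                    mat[r][k] -= factor * mat[c][k]
    return int(det)

print("seed C_4:", spanning_trees(C4, 4))
print("one type-A round:", spanning_trees(*grow_all(C4, 4, "A")))
print("one type-B round:", spanning_trees(*grow_all(C4, 4, "B")))
```

After a single round the two graphs have identical degree sequences yet different counts, with the type-B graph having more spanning trees than the type-A graph.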
[TABLE]
Eq.(7) provides strong evidence that, for an arbitrary member of network space $\mathcal{N}(p,q,t)$, the type-B operation produces more spanning trees than the type-A operation and hence plays the more significant role in the process of constructing network models. Intuitively, considering both $N(1,0,t)$ and $N(0,1,t)$, we might state that the number of spanning trees is a better measure than the diameter for distinguishing the members of graph space $\mathcal{N}(p,q,t)$.
Similarly, we immediately arrive at the second proposition, on both an upper bound and a lower bound for the total number of spanning trees of each member of network space $\mathcal{N}(p,q,t)$.
Proposition 2 For any member $N(p,q,t)$ of network space $\mathcal{N}(p,q,t)$, its number of spanning trees must satisfy the following inequality
[TABLE]
As is known, given a pair of parameters $p$ and $t$, there is a great number of candidates in network space $\mathcal{N}(p,q,t)$. Hence we are able to assert that, in the limit of large generation $t$, the cardinality of network space $\mathcal{N}(p,q,t)$ is too large to enumerate even with current computers. Since the ratio described in Eq.(7) is nevertheless smaller than this cardinality, we are convinced that there must exist two distinct members with identical numbers of spanning trees. If so, the number of spanning trees will not be adequate to differentiate some members of network space $\mathcal{N}(p,q,t)$, and other useful indices should be adopted to do this. Looking for such new measures will therefore be an interesting research topic in the near future and is also one of our present focuses.
PEARSON CORRELATION COEFFICIENT
To further our task of distinguishing the members of network space $\mathcal{N}(p,q,t)$, this subsection focuses on another topological structure parameter, the correlation between properties of adjacent vertices, known as the Pearson correlation coefficient $r$ [23].
Recent studies have shown that for many real-world networks, the degrees of the vertices at either end of a randomly chosen edge are not independent but are correlated with one another. As measured by the Pearson correlation coefficient $r$, almost all social networks turn out to have a positive value and hence are assortatively mixed. On the other hand, non-social networks, such as technological and biological networks, have a disassortative mixing pattern. For convenience, we give below a brief introduction to the Pearson correlation coefficient [23].
More commonly, a normalized assortativity coefficient can be obtained as follows:
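$$r=\frac{1}{\sigma_{q}^{2}}\sum_{jk}jk\,(e_{jk}-q_{j}q_{k}),\qquad(9)$$
where $e_{jk}$ is the fraction of edges connecting vertices of degree $j$ and $k$ in a network, $q_{k}$ is equal to $\sum_{j}e_{jk}$, and $\sigma_{q}$ is the standard deviation of the distribution $q_{k}$. For brevity, Eq.(9) is usually rewritten, for a network with $M$ edges whose $i$-th edge joins vertices of degrees $j_{i}$ and $k_{i}$, in the following form
$$r=\frac{M^{-1}\sum_{i}j_{i}k_{i}-\left[M^{-1}\sum_{i}\frac{1}{2}(j_{i}+k_{i})\right]^{2}}{M^{-1}\sum_{i}\frac{1}{2}(j_{i}^{2}+k_{i}^{2})-\left[M^{-1}\sum_{i}\frac{1}{2}(j_{i}+k_{i})\right]^{2}},\qquad(10)$$
which implies that $r$ satisfies $-1\leq r\leq 1$.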
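The edge-based form of the coefficient is straightforward to evaluate from an edge list. The sketch below assumes the standard edge-sum formula for the degree assortativity coefficient, namely $r=\big(M^{-1}\sum_i j_i k_i-[M^{-1}\sum_i \tfrac12(j_i+k_i)]^2\big)\big/\big(M^{-1}\sum_i \tfrac12(j_i^2+k_i^2)-[M^{-1}\sum_i \tfrac12(j_i+k_i)]^2\big)$, where $j_i$ and $k_i$ are the degrees at the two ends of edge $i$ and $M$ is the number of edges. The function name `pearson_r` is hypothetical. Numerator and denominator are multiplied by $4M^2$ so that the computation stays in exact integer arithmetic.

```python
from fractions import Fraction

def pearson_r(edges):
    """Degree assortativity coefficient of an undirected simple graph.

    Returns None when all vertices have the same degree, because the
    denominator of the formula is then zero and r is undefined.
    """
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    m = len(edges)
    s_prod = sum(deg[u] * deg[v] for u, v in edges)          # sum of j*k
    s_sum = sum(deg[u] + deg[v] for u, v in edges)           # sum of j+k
    s_sq = sum(deg[u] ** 2 + deg[v] ** 2 for u, v in edges)  # sum of j^2+k^2
    num = 4 * m * s_prod - s_sum ** 2
    den = 2 * m * s_sq - s_sum ** 2
    return None if den == 0 else Fraction(num, den)

star = [(0, leaf) for leaf in range(1, 6)]   # hub joined to five leaves
path = [(0, 1), (1, 2), (2, 3)]              # path on four vertices
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]     # regular graph, r undefined

print(pearson_r(star), pearson_r(path), pearson_r(cycle))
```

A star is the extreme disassortative case, since every edge joins the hub to a leaf, whereas a regular graph such as a cycle has no degree variation at all and the coefficient is undefined.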
Along the same line as the two subsections above, we first study graph models $N(1,0,t)$ and $N(0,1,t)$ because of their deterministic structure. For the first, we can group all edges into classes in a straightforward manner. Similarly, all the edges of the second graph model can be classified into families. Feeding these classifications into the algorithm for computing $r$ yields the illustration plotted in Fig.6. For probability parameter $p$ in the range between $0$ and $1$, we also run the algorithm and obtain three panels from three different viewpoints; see Fig.7. With the help of both Fig.6 and Fig.7, we may say that during the first several time steps, the parameter $p$ has considerable influence on the Pearson correlation coefficient $r$ and makes its evolution non-monotonic and fluctuating. In the thermodynamic limit, $r$ becomes smoother and smoother and ultimately tends to the critical value zero. (As a guide for the eye, three specific Pearson correlation coefficients are plotted in Appendix B.) Numerical simulations and theoretical analysis together support Proposition 3.
Proposition 3 For any member $N(p,q,t)$ of network space $\mathcal{N}(p,q,t)$, in the limit of large graph size, its Pearson correlation coefficient must satisfy the following inequality
[TABLE]
Similar discussions apply to the Pearson correlation coefficient $r$. Although all the members of our network space $\mathcal{N}(p,q,t)$ have the same power-law degree distribution with exponent $\gamma=3$, they still display different behavior with respect to the Pearson correlation coefficient.
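The contrast between the two deterministic models can be reproduced on small instances. The sketch below is illustrative only: it assumes a 4-cycle seed counted as step 0 (where $r$ is undefined because the graph is regular), uses the standard edge-sum formula for the degree assortativity coefficient, and relies on hypothetical helper names `grow_all`, `pearson_r` and `r_over_time`.

```python
from fractions import Fraction

def grow_all(edges, n, op):
    """Apply the same operation ("A" or "B") to every edge; n is the vertex count."""
    out = []
    for u, v in edges:
        x, y = n, n + 1
        n += 2
        if op == "A":   # rectangle: keep uv, add the path u - x - y - v
            out += [(u, v), (u, x), (x, y), (y, v)]
        else:           # diamond: drop uv, add the paths u - x - v and u - y - v
            out += [(u, x), (x, v), (u, y), (y, v)]
    return out, n

def pearson_r(edges):
    """Degree assortativity coefficient; None if every vertex has the same degree."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    m = len(edges)
    s_prod = sum(deg[u] * deg[v] for u, v in edges)
    s_sum = sum(deg[u] + deg[v] for u, v in edges)
    s_sq = sum(deg[u] ** 2 + deg[v] ** 2 for u, v in edges)
    den = 2 * m * s_sq - s_sum ** 2
    return None if den == 0 else Fraction(4 * m * s_prod - s_sum ** 2, den)

def r_over_time(op, steps):
    """Coefficient r after 1..steps rounds of `op`, starting from the assumed seed C_4."""
    edges, n = [(0, 1), (1, 2), (2, 3), (3, 0)], 4
    values = []
    for _ in range(steps):
        edges, n = grow_all(edges, n, op)
        values.append(pearson_r(edges))
    return values

print("type-A:", r_over_time("A", 4))
print("type-B:", r_over_time("B", 4))
```

In this sketch the type-A graphs give $r=0$ in the first rounds even though hubs are attached to many low-degree vertices, while the type-B graph is perfectly disassortative after one round and its coefficient then moves back toward zero.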
4 CONCLUSION
Recent studies of network structure have concentrated on a few properties that appear to be common to a great variety of complex networks. Among them, the best studied are the scale-free feature and the small-world property. Along this line of research, to emphasize that the degree distribution in some cases cannot perfectly distinguish the networks in question, we build a graph space $\mathcal{N}(p,q,t)$, each member of which follows a power-law degree distribution with exponent equal to the constant $\gamma=3$. In this setting, three simple yet useful indices, namely the diameter, the number of spanning trees and the Pearson correlation coefficient, are adopted to better identify the differences among the members of graph space $\mathcal{N}(p,q,t)$. Through an in-depth comparison, we find that the randomization parameter $p$ not only produces various types of graph models but also makes almost all graph models distinct from one another. By suitably tuning the value of $p$, one can transform a small-world graph into one that is not small-world, increase the corresponding number of spanning trees, and simultaneously change the assortative mixing over the graph, i.e., from non-assortative mixing to a disassortative pattern and ultimately back to non-assortative behavior.
Here, we want to state that the network space $\mathcal{N}(p,q,t)$ studied is just one example showing that, in some cases, the degree distribution (degree sequence) is not adequate to meet one's expectations. Hence, other helpful evaluation indicators are necessary and should be developed. Although network space $\mathcal{N}(p,q,t)$ serves well to expose some disadvantages of the degree distribution, there are a few flaws in its own structure. As is known, clustering (or community structure) is prevalent in most real-world networks, whereas no clustering is present in our network space $\mathcal{N}(p,q,t)$, so it is not suited to mimicking real-world networks or to studying dynamic functions taking place on them. A development of this model in that direction is carried out in another manuscript of ours, which is outside the scope of this paper. In addition, topics of considerable interest will be discussed on network space $\mathcal{N}(p,q,t)$ in the near future, including random walks, synchronization, and percolation.
ACKNOWLEDGEMENTS
The research was supported by the National Key Research and Development Plan under grants 2016YFB0800603 and 2017YFB1200704, and the National Natural Science Foundation of China under grant No. 61662066.
COMPLEMENTARY MATERIAL
Appendix A Illustration of both simulation and analytical value relevant to the logarithm of diameters of graph models , and as a function of parameter .
Appendix B Illustration of three specified pearson correlation coefficients. They are in turn , and as a function of time step .
REFERENCES
- [1]
- [2] D. J. Watts, S. H. Strogatz. Nature 393 (1998) 440-442.
- [3] A.-L. Barabási, R. Albert. Science 286 (5439) (1999) 509-512.
- [4] B. Karrer, M. E. J. Newman. Phys. Rev. E 84 (2011) 036106.
- [5] L. Y. Zhang, J. L. Ren. J. Stat. Mech. (2019) 033204.
- [6] K. Hu, J. B. Hu, L. Tang, et al. J. Stat. Mech. (2018) 100001.
- [7] W. Liu, L. Y. Ma, B. Jeon, L. Chen, B. Chen. Journal of Theoretical Biology 455 (2018) 26-38.
- [8] P. Erdős, A. Rényi. Publ. Math. 6 (1959) 290.
