Ground States for Exponential Random Graphs
Rajinder Mavi, Mei Yin

TL;DR
This paper introduces a perturbative approach to estimate normalization constants in exponential random graph models and provides evidence of phase transition phenomena in the edge-triangle model.
Contribution
It presents a novel perturbative method for normalization estimation and demonstrates discontinuity in model behavior at critical parameter values.
Findings
Perturbative method effectively estimates normalization constants.
Evidence of phase transition in the edge-triangle model.
Discontinuity observed along critical directions.
Abstract
We propose a perturbative method to estimate the normalization constant in exponential random graph models as the weighting parameters approach infinity. As an application, we give evidence of discontinuity in natural parametrization along the critical directions of the edge-triangle model.
Figure 1
Figure 2
Figure 3
| bipartite | 0.999999998 | 5.0197 | almost 1 | 0.069 | 5.019 | |
|---|---|---|---|---|---|---|
| tripartite | 5.0022 | 0.9943 | 0.000064 | 5.0021 |
Ground states for exponential random graphs
Rajinder Mavi
[email protected] math.msu.edu/~mavi
Department of Mathematics
Michigan State University
619 Red Cedar Road
C212 Wells Hall
East Lansing, MI 48824, USA
Mei Yin
[email protected] http://www.cs.du.edu/~meiyin/
Department of Mathematics
University of Denver
C.M. Knudson Hall, Room 300
2390 S. York St.
Denver, CO 80208, USA
Keywords: exponential random graphs; perturbation analysis; phase transitions; critical directions
PACS: 64.60.aq; 05.20.Gg
I Introduction
Over the last decades, the availability of network data on typically very large scales has created the impetus for the development of new theories and methods for modeling and describing the properties of large networks. The introduction of exponential random graphs has aided in this pursuit, as they are able to capture a wide variety of common network tendencies by representing a complex global structure through a set of tractable local features. From the point of view of extremal combinatorics and statistical mechanics, investigations have been focused on the variational principle of the limiting normalization constant, concentration of the limiting probability distribution, phase transitions, and asymptotic structures. See for example Chatterjee and Varadhan CV2011 , Chatterjee and Diaconis CD2013 , Radin and Yin RY , Lubetzky and Zhao LZ1 ; LZ2 , Radin and Sadun RS1 ; RS2 , Radin et al. RRS , Kenyon et al. KRRS1 , Yin Yin1 , Kenyon and Yin KY , Aristoff and Zhu AZ2 , and Chatterjee and Dembo CD2 . The main techniques used in these papers are variants of statistical physics, but the elegant theory of graph limits as developed by Lovász and coauthors (V.T. Sós, B. Szegedy, C. Borgs, J. Chayes, K. Vesztergombi, …) BCLSV1 ; BCLSV2 ; BCLSV3 ; Lov ; LS , also plays an important role in the interdisciplinary inquiry. Building on earlier work of Aldous Aldous1 and Hoover Hoover , the graph limit theory connects sequences of graphs to a unified graphon space equipped with a cut metric. Though the theory itself is tailored to dense graphs, parallel theories for sparse graphs are likewise emerging. See Benjamini and Schramm BS , Aldous and Steele AS , Aldous and Lyons AL , and Lyons Lyons where the notion of local weak convergence is discussed and the recent works of Borgs et al. BCCZ1 ; BCCZ2 that are making progress towards enriching the existing theory of dense graph limits by developing a limiting object for sparse graph sequences based on graphons.
In this paper, we study the “standard” family of exponential graph models in the asymptotic regime as the exponential parameters approach infinity. As the model name indicates, we associate exponential weights with graphical ensembles. For each $n$, let $\mathcal{G}_n$ be the ensemble of simple graphs on $n$ vertices and let $\mathcal{G} = \bigcup_n \mathcal{G}_n$ be the collection of all simple graphs. The exponential weights are defined in terms of subgraph densities. For any $H \in \mathcal{G}$, the homomorphism density of $H$ in a graph $G$ is defined as the probability that a random map from the vertex set $V(H)$ into the vertex set $V(G)$ is edge preserving. Writing $\mathrm{hom}(H, G)$ for the set of edge-preserving maps, the homomorphism density is
$$t(H, G) = \frac{|\mathrm{hom}(H, G)|}{|V(G)|^{|V(H)|}}. \tag{1}$$
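The definition of the homomorphism density can be checked by direct enumeration on very small graphs. The following is a minimal sketch, not part of the paper: the function name `hom_density` and the edge-list encoding (vertices labelled `0..size-1`) are illustrative assumptions.

```python
from itertools import product

def hom_density(h_edges, h_size, g_edges, g_size):
    """Brute-force homomorphism density t(H, G).

    Counts the maps V(H) -> V(G) that send every edge of H to an edge of G
    and divides by |V(G)|^|V(H)|. Only feasible for tiny graphs.
    """
    g_adj = {frozenset(e) for e in g_edges}
    count = 0
    for phi in product(range(g_size), repeat=h_size):
        # A map collapsing an edge to a single vertex is never edge preserving,
        # because frozenset((a, a)) has one element and is not in g_adj.
        if all(frozenset((phi[u], phi[v])) in g_adj for u, v in h_edges):
            count += 1
    return count / g_size ** h_size

EDGE = [(0, 1)]
TRIANGLE = [(0, 1), (1, 2), (0, 2)]
K3 = [(0, 1), (1, 2), (0, 2)]

edge_density_k3 = hom_density(EDGE, 2, K3, 3)          # 6 of the 9 maps
triangle_density_k3 = hom_density(TRIANGLE, 3, K3, 3)  # 6 of the 27 maps
```

Note that the density counts all maps, not only injective ones, which is why the triangle density of a single triangle is 2/9 rather than 1.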
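Now, for all $n$ we define a probability distribution on $\mathcal{G}_n$ in terms of the homomorphism density. Let $H_1, \dots, H_k$ be a given selection of simple graphs, where $H_1$ is an edge: $H_1 = K_2$. Let $T : \mathcal{G} \to [0,1]^k$, where the components of $T$ are homomorphism densities $T_i(G) = t(H_i, G)$. Given $\beta \in \mathbb{R}^k$, define the functional on $\mathcal{G}$:
$$T_\beta(G) = \beta \cdot T(G) = \sum_{i=1}^{k} \beta_i\, t(H_i, G), \tag{2}$$
and weight $G \in \mathcal{G}_n$ by $\exp\left(n^2 T_\beta(G)\right)$. The normalization constant for the ensemble is then given by the partition function,
$$Z_n^\beta = \sum_{G \in \mathcal{G}_n} \exp\left(n^2 T_\beta(G)\right). \tag{3}$$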
The terminology for the partition function is borrowed from thermodynamics. In this context, is the inverse temperature. Renormalizing in (2) and fixing , one obtains a Hamiltonian . As the major contribution to the partition function concentrates around the thermal states (labeled by ), in a manner analogous to standard thermodynamic models (see Section I.2). These thermal states are well understood in the large temperature regime . For any selection of subgraphs and sufficiently small, the associated thermal state lies in the replica symmetric phase CD2013 , i.e., . On the other hand, the ground states, which are defined as the limit of the thermal states as , are not so simple. In some cases the ground state is known to be in the replica symmetric phase CD2013 , while in other cases the ground state concentrates around a simple graph in YRF .
Our motivation for this paper comes from the edge-triangle model, obtained by setting and a triangle: . It was shown in YRF that there are countably many critical directions of , along which the ground state of the model is chosen from finitely many simple graphs with some unknown distribution, and our goal is to develop a mechanism that determines which of these simple graphs is the proper ground state.
I.1 Graphon topology
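The thermal states belong to the space of graph functions, “graphons”, which may be understood as generalizations of graphs. The set of graphs $\mathcal{G}$ may be embedded into the space of graphons $\mathcal{W}$, which consists of symmetric measurable functions from $[0,1]^2$ into $[0,1]$,
$$\mathcal{W} = \left\{ f : [0,1]^2 \to [0,1] \;:\; f \text{ measurable},\ f(x,y) = f(y,x) \right\}. \tag{4}$$
For any $n$ and a graph $G \in \mathcal{G}_n$, the graphon representation is the function
$$f^G(x,y) = \begin{cases} 1, & \text{if } (\lceil nx \rceil, \lceil ny \rceil) \text{ is an edge of } G, \\ 0, & \text{otherwise}, \end{cases} \tag{5}$$
where the interval $[0,1]$ may be intuitively thought of as a ‘continuum’ of graph vertices. The distance between graphons is given in terms of the “cut distance”, defined for $f, g \in \mathcal{W}$, with the supremum taken over measurable subsets $S, T$ of $[0,1]$, as
$$d_\square(f, g) = \sup_{S, T \subseteq [0,1]} \left| \int_{S \times T} \left( f(x,y) - g(x,y) \right) dx\, dy \right|. \tag{6}$$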
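The two constructions above, the step-function embedding of a graph and the cut distance, can be made concrete for small examples. The sketch below is illustrative only: `graphon_of_graph` and `cut_norm_step` are hypothetical names, graph vertices are assumed to be labelled `1..n`, and the cut norm is computed only for kernels that are constant on an equal-sized grid of blocks, where enumeration over unions of blocks suffices.

```python
from itertools import product
from math import ceil

def graphon_of_graph(edges, n):
    """Return the step function f^G on [0,1]^2 for a graph on vertices 1..n."""
    adj = {frozenset(e) for e in edges}

    def f(x, y):
        # Points in ((i-1)/n, i/n] belong to vertex i; x = 0 is sent to vertex 1.
        i = max(1, ceil(n * x))
        j = max(1, ceil(n * y))
        return 1 if frozenset((i, j)) in adj else 0

    return f

def cut_norm_step(a):
    """Cut norm of a kernel that is constant on an m x m grid of equal blocks.

    `a` is an m x m list of block values (for example a difference f - g).
    The integral over S x T is bilinear in how much of each block S and T
    cover, so the supremum is attained on unions of blocks. The enumeration
    is exponential in m and only meant for very small m.
    """
    m = len(a)
    best = 0.0
    for s in product((0, 1), repeat=m):
        for t in product((0, 1), repeat=m):
            total = sum(a[i][j] for i in range(m) if s[i]
                        for j in range(m) if t[j])
            best = max(best, abs(total) / m ** 2)
    return best

path = graphon_of_graph([(1, 2), (2, 3)], 3)
# Complete bipartite step graphon minus the constant graphon 1/2:
bipartite_minus_half = [[-0.5, 0.5], [0.5, -0.5]]
```

For the last kernel the cut norm is 1/8 while the $L^1$ norm is 1/2, which illustrates how much coarser the cut distance is than the $L^1$ distance.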
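However, a nontrivial difficulty arises from the arbitrary labeling of vertices as they are embedded in $[0,1]$. Thus we introduce the equivalence $f \sim g$ where $g(x,y) = f(\sigma x, \sigma y)$, where $\sigma$ is any measure preserving bijection of $[0,1]$. We write the quotient space of graphons under the equivalence as $\widetilde{\mathcal{W}}$, and the equivalence class under $\sim$ of $f$ as $\tilde{f}$. Incorporating the equivalence relation yields a distance
$$\delta_\square(\tilde{f}, \tilde{g}) = \inf_{\sigma_1, \sigma_2} d_\square\left(f_{\sigma_1}, g_{\sigma_2}\right), \qquad f_\sigma(x,y) = f(\sigma x, \sigma y), \tag{7}$$
where $d_\square$ is the cut distance (6) and the infimum ranges over all measure preserving bijections $\sigma_1$ and $\sigma_2$, making $(\widetilde{\mathcal{W}}, \delta_\square)$ a compact metric space (see Section 9.3 of Lovász Lov ). With some abuse of notation we also refer to $\delta_\square$ as the “cut distance”.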
All graphons arise as the limit of some sequence of graphs. Given a graphon $f$, one may construct a graph $G_n$ by selecting $n$ iid points $x_1, \dots, x_n$ uniformly from $[0,1]$, which represent the vertices of $G_n$, and then connecting vertices $i$ and $j$ with probability $f(x_i, x_j)$. In this context the expected subgraph density is given by the graphon homomorphism density
$$t(H, f) = \int_{[0,1]^{|V(H)|}} \prod_{\{i,j\} \in E(H)} f(x_i, x_j) \prod_{i \in V(H)} dx_i, \tag{8}$$
which generalizes (1) and is continuous in the metric $\delta_\square$. Indeed, for any graph $H$, the subgraph homomorphism density of random graphs selected by the above construction converges almost surely to the graphon homomorphism density,
$$\lim_{n \to \infty} t(H, G_n) = t(H, f) \quad \text{almost surely}. \tag{9}$$
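The sampling construction and the graphon homomorphism density described above can be sketched for step-function graphons, where the integral reduces to a finite sum over block assignments. This is an illustration under stated assumptions: the names `step_density` and `sample_graph` and the block-matrix encoding of the graphon are not from the paper.

```python
import random
from itertools import product

def step_density(h_edges, h_size, a):
    """t(H, f) for a graphon f that is constant on an m x m grid of equal blocks."""
    m = len(a)
    total = 0.0
    for blocks in product(range(m), repeat=h_size):
        term = 1.0
        for u, v in h_edges:
            term *= a[blocks[u]][blocks[v]]
        total += term
    return total / m ** h_size

def sample_graph(a, n, rng):
    """Sample an n-vertex graph from the step graphon `a`; returns an edge list."""
    m = len(a)
    xs = [rng.random() for _ in range(n)]          # iid uniform vertex labels
    block = [min(int(x * m), m - 1) for x in xs]   # block containing each label
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < a[block[i]][block[j]]]

EDGE = [(0, 1)]
TRIANGLE = [(0, 1), (1, 2), (0, 2)]
BIPARTITE = [[0, 1], [1, 0]]                      # Turán graphon with 2 classes
TRIPARTITE = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]    # Turán graphon with 3 classes

sampled = sample_graph(BIPARTITE, 30, random.Random(0))
```

A graph sampled from the 2-class Turán graphon is always bipartite, so it contains no triangles for any seed, in agreement with the vanishing triangle density of that graphon.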
I.2 Exponential random graphs
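As discussed above, we will define measures on $\mathcal{G}_n$ in terms of subgraph densities. For $G \in \mathcal{G}_n$ define the probability
$$\mathbb{P}_n^\beta(G) = \exp\left( n^2 \left( T_\beta(G) - \psi_n^\beta \right) \right), \tag{10}$$
where $T_\beta$ is the functional in (2) and we have introduced the normalization constant (free energy density),
$$\psi_n^\beta = \frac{1}{n^2} \log \sum_{G \in \mathcal{G}_n} \exp\left( n^2 T_\beta(G) \right). \tag{11}$$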
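For very small $n$ the normalization constant can be evaluated by summing over every labelled simple graph. The sketch below does this for the edge-triangle case. It assumes the $n^2$ scaling of the exponent used by Chatterjee and Diaconis; the names `free_energy`, `beta1`, and `beta2` are illustrative and not taken from the paper.

```python
from itertools import combinations, product
from math import exp, log

def hom_density(h_edges, h_size, adj, n):
    """Homomorphism density of H in the graph on vertices 0..n-1 with edge set adj."""
    count = sum(
        all(frozenset((phi[u], phi[v])) in adj for u, v in h_edges)
        for phi in product(range(n), repeat=h_size)
    )
    return count / n ** h_size

def free_energy(beta1, beta2, n):
    """Brute-force n^-2 log sum_G exp(n^2 (beta1 t(edge, G) + beta2 t(triangle, G)))."""
    edge, triangle = [(0, 1)], [(0, 1), (1, 2), (0, 2)]
    pairs = list(combinations(range(n), 2))
    z = 0.0
    # Enumerate all 2^(n choose 2) labelled simple graphs; only viable for n <= 5 or so.
    for mask in product((0, 1), repeat=len(pairs)):
        adj = {frozenset(p) for p, bit in zip(pairs, mask) if bit}
        t_beta = (beta1 * hom_density(edge, 2, adj, n)
                  + beta2 * hom_density(triangle, 3, adj, n))
        z += exp(n ** 2 * t_beta)
    return log(z) / n ** 2
```

At zero parameters the sum simply counts graphs, so `free_energy(0, 0, 3)` equals $\log 8 / 9$. With a strongly negative triangle parameter the single triangle on three vertices is suppressed and the value approaches $\log 7 / 9$.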
By replacing the subgraph homomorphism density (1) with graphon homomorphism density (8) in (2), extends naturally to . Since is continuous and bounded on the compact set , there is a nonempty compact subset of so that is maximized on . Take as
[TABLE]
and then extend the domain of to by
[TABLE]
where is any representative element of . It follows from Lemma 2.1 in Chatterjee and Varadhan CV2011 that is well defined on and upper semi-continuous under the cut metric . Let be the subset of where is maximized. Then like , is a nonempty compact subset of .
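To make the maximization problem concrete, the sketch below evaluates an objective of the form "energy minus rate function" on constant graphons in the edge-triangle case. It assumes the Chatterjee and Diaconis convention $I(u) = \tfrac12 u \log u + \tfrac12 (1-u)\log(1-u)$, which may differ in sign or normalization from the convention used in this paper, and the names `rate`, `constant_graphon_objective`, and `best_constant` are hypothetical. Restricting to constant graphons only gives a lower bound on the supremum over all graphons.

```python
from math import log

def rate(u):
    """I(u) = (u log u + (1 - u) log(1 - u)) / 2 with 0 log 0 = 0 (assumed convention)."""
    def xlogx(x):
        return x * log(x) if x > 0 else 0.0
    return 0.5 * (xlogx(u) + xlogx(1 - u))

def constant_graphon_objective(beta1, beta2, u):
    """Energy minus rate for the constant graphon f = u (edge density u, triangle density u^3)."""
    return beta1 * u + beta2 * u ** 3 - rate(u)

def best_constant(beta1, beta2, grid=10001):
    """Grid search over constant graphons; a lower bound for the full variational problem."""
    us = [i / (grid - 1) for i in range(grid)]
    return max(us, key=lambda u: constant_graphon_objective(beta1, beta2, u))
```

At zero parameters the optimum is the constant graphon $1/2$ with value $\tfrac12 \log 2$, which matches the $n \to \infty$ limit of $n^{-2} \log 2^{\binom{n}{2}}$.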
For the purpose of this paper, two theorems from Chatterjee and Diaconis CD2013 (both based on the large deviation result in CV2011 ) merit some special attention. Together they connect the occurrence of a phase transition in the exponential random graph model with the solution of a certain maximization problem. The first theorem (Theorem 3.1 in CD2013 ) states that for any , the limiting normalization constant of the exponential random graph always exists and is equal to . The second theorem (Theorem 3.2 in CD2013 ) states that in the large limit, the quotient image of a random graph drawn from (10) must lie close to with high probability,
[TABLE]
Since the limiting normalization constant is the generating function for the limiting expectations of other random variables on the graph space such as expectations and correlations of homomorphism densities, a phase transition occurs when is non-analytic or when is not a singleton set. Although it is difficult to evaluate and determine the maximizing set for most , we will derive an efficient method to approximate and estimate for sufficiently far from the origin.
II The approximation scheme
Take a finite simple graph with vertex set and edge set . Consider a graphon . For each and each pair of points , define
[TABLE]
The above definition appears rather complicated, but in essence may be identified with the homomorphism density (8), if we remove edge from and do not integrate out the associated vertices and . For , define
[TABLE]
which corresponds to the total homomorphism density generated after all possible ways of removing one edge from . We give some examples to illustrate this idea. When is an edge, . When is a triangle, by symmetry, .
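The triangle example can be checked numerically on step-function graphons. The sketch below assumes one reading of the construction above: for a triangle, the edge-removed density at $(x, y)$ is $3\int f(x,z) f(y,z)\,dz$, and integrating it against a symmetric perturbation $g$ gives the first-order change of the triangle density of $f + \epsilon g$. The function names and the block-matrix encoding are illustrative assumptions.

```python
def triangle_density(a):
    """Triangle homomorphism density of a step graphon with m x m equal blocks."""
    m = len(a)
    return sum(a[i][j] * a[j][k] * a[i][k]
               for i in range(m) for j in range(m) for k in range(m)) / m ** 3

def triangle_edge_removed(a):
    """Block values of 3 * integral of f(x, z) f(y, z) dz for the step graphon `a`."""
    m = len(a)
    return [[3 * sum(a[i][k] * a[j][k] for k in range(m)) / m
             for j in range(m)] for i in range(m)]

def first_variation(a, g):
    """Integral of the edge-removed density against a step perturbation g."""
    m = len(a)
    d = triangle_edge_removed(a)
    return sum(d[i][j] * g[i][j] for i in range(m) for j in range(m)) / m ** 2

BIPARTITE = [[0.0, 1.0], [1.0, 0.0]]   # Turán graphon with 2 classes
INSIDE = [[1.0, 0.0], [0.0, 1.0]]      # perturbation adding edges inside the classes
```

For the bipartite Turán graphon perturbed inside its classes, the triangle density of $f + \epsilon g$ is $(2\epsilon^3 + 6\epsilon)/8$, so the first variation should be $3/4$, which is what `first_variation(BIPARTITE, INSIDE)` returns.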
II.1 Motivation
We will consider as a motivating example the edge-triangle model, which is a -parameter exponential random graph model obtained from (10) by setting to an edge and to a triangle. Take for . Suppose that and , i.e., the -dimensional parameter vector is pointing towards the th quadrant. Then (which agrees with ) consists of graphons that minimize , where denotes the edge density and denotes the triangle density of , respectively. This implies that , the maximizers of (and hence of ) must lie on the Razborov curve, which is the lower boundary of the feasible region of edge-triangle homomorphism densities. As is a linear function, must minimize over the convex hull of . Since and only intersect at the points corresponding to Turán graphons, must be a Turán graphon. This important fact about the structure of was further used to derive the maximizing graphons of in YRF . Using the boundedness of , we found that consists of graphons that can be made arbitrarily close to Turán graphons when the magnitude of is sufficiently large, and exactly which Turán graphon is favored by depends on the direction of . However, as nice as these results are, there is ambiguity concerning the optimal Turán graphon along the critical directions of , which correspond to normal lines of the convex hull . The subtlety might be due to the fact that Turán graphons, though close to our optimal graphons in cut distance, are not best approximations to the maximizing set . Generalizing from Turán graphons HN , we say that a graphon is random-free if for some symmetric measurable subset of . When is finite, Chatterjee and Diaconis CD2013 showed that the maximizing graphons in are almost nowhere random-free – that is the set has zero measure. We are thus interested in finding coarse-grained graphons that are not random-free but close to in cut distance, as they are sufficient to distinguish between candidates for the ground state. 
Resorting to perturbation analysis in the regime, we will propose a method that keeps track of only the most significant characteristics of the maximizing graphons and demonstrate its effectiveness.
II.2 Assumptions
Let denote the range of in and suppose that the boundary is piecewise analytic. For , let denote the set of all feasible directions of at , so that implies that there is with . The (internal) tangent cone of at is then given by the closure of . See Figure 1 for an illustration of this concept in the edge-triangle model. Let for , where is sufficiently large. Suppose that (and hence as ) is maximized at a set of random-free graphons , i.e., (which agrees with ) consists of random-free graphons only. Further suppose that has the property that
[TABLE]
where is defined as in (16) and . We remark that these nice properties that we assumed are enjoyed beyond the edge-triangle model. Denote by a complete graph on vertices. Since the vertices of the convex hull of for and (any ), not just and as in the edge-triangle case, are given by Turán graphons, our argument will run through without much modification in these cases. Utilizing geometry of their respective convex hull EN , similar analysis may be extended to more general models. We point out in particular that if is a Turán graphon, then (17) holds for any due to symmetry.
II.3 Perturbation of random-free graphons
Consider a maximizing graphon . When the magnitude of is sufficiently large, since diverges while stays bounded, and can be made arbitrarily close in the cut metric. From our assumptions in Section II.2, there thus exists a random-free graphon that is close to under the cut distance. We will further construct a non-random-free graphon that is close to , simple enough, and yet still retains important information of . Most importantly, we will show that for any finite simple graph , approximates at least as well as asymptotically. The following proposition is specific for random-free graphons and will be useful for our investigation.
Proposition II.1.
Let denote the -norm. For a random-free graphon , .
Proof.
By (6), it is clear that . For the other direction, suppose for some symmetric measurable subset . If is a rectangle, then . The general conclusion follows once we recognize that any subset of may be approximated within by a finite union of disjoint rectangles. ∎
Let us first expand around . Denote a perturbation of by . Corresponding to the regime , as explained earlier, may be chosen so that . We have
[TABLE]
Under our assumption (17), we compute the first variation:
[TABLE]
Define
[TABLE]
Using that and , (II.3) reduces to
[TABLE]
The remainder terms (if they exist) give the homomorphism density after all possible ways of removing at least two edges from , and so are bounded above by either or . Estimating the latter is easy.
[TABLE]
which is of negligible asymptotic order when compared to the first variation (II.3). The former corresponds to the -star density of a graphon with edge density . By AK , for small , the -star density is bounded below by and above by , and the upper bound is achieved when is an anticlique of the form
[TABLE]
where . The lower bound for this remainder term is of desirable asymptotic order, but the upper bound is of the same order as the first variation. Recall that the maximizing graphon for a finite is uniformly bounded away from 0 and 1 CD2013 . The graphon is thus likely quite different from an anticlique. This implies that the -star density of is of higher order than and does not achieve the upper bound. The phenomenon was confirmed for example by simulations for the edge-triangle model KRRS2 .
Define the averaged perturbation by , where and are given in (20). may be viewed as a flattened out version of . Since
[TABLE]
is close to (and hence ) under the distance, and by Proposition II.1, also under the cut distance. Denote by . We perform the same expansion for around as in the last paragraph:
[TABLE]
Following similar reasoning as in (II.3) and using the definition of ,
[TABLE]
This says that and agree except for the remainder terms. As for , the remainder terms for (if they exist) are bounded above by either or . Since by (24), the latter is of asymptotic order ; while the former gives
[TABLE]
and so is also of asymptotic order . We conclude that gives at least as good an asymptotic approximation for as , and a better one when the error term related to the -star density of may be dropped.
II.4 Maximizing graphons
Let for and . Under our assumptions, (which agrees with ) is a set of random-free graphons. Take . Then for and large (corresponding to ), for some , i.e., for some symmetric measurable subset of . By (18) and following analysis,
[TABLE]
Similarly, by (25) and following analysis,
[TABLE]
In both equations above, the entropy is bounded in contrast with the energy contribution . Except that is close to (and hence ) in cut distance, we do not have enough information regarding the structure of . Rather than maximizing over all possible graphons directly, we will maximize over -parameter families . From the heuristics in Section II.3, this is an effective method to approximate the optimal graphon. More than that, in certain situations (for example the edge-triangle model to be discussed in detail in Section III), we will see that keeping parameters is not only effective but also sufficient. Notice that
[TABLE]
We rewrite :
[TABLE]
Maximizing each first variation term, we have
[TABLE]
Under this choice of and and provided we can ignore the remainder terms, is strictly bigger than , which is the random-free graphon approximation corresponding to and . Let us verify that is indeed strictly bigger than by rigorously managing the error term in (31). As shown earlier in Section II.3, the remainder terms are of higher order:
[TABLE]
For any , there exists large enough so that is sufficiently small, making and . Applying these bounds in (33) gives
[TABLE]
Maximizing each first variation term as previously, this yields . We conclude that, as expected, the addition of one more parameter improves the random-free graphon estimation.
III The edge-triangle model
Denote by $K_n$ a complete graph on $n$ vertices. The edge-triangle model is a $2$-parameter exponential random graph model obtained by taking $H_1$ an edge ($K_2$) and $H_2$ a triangle ($K_3$) in (10). Consider the set $R$ of all realizable values of the edge ($e$) and triangle ($t$) homomorphism densities as the graphon varies over the entire graphon space $\mathcal{W}$. See Figure 1. The upper boundary curve of $R$ is given by the equation $t = e^{3/2}$, and can be derived using the Kruskal-Katona theorem (see Section 16.3 of Lov ). The lower boundary curve is trickier. The trivial lower bound of $t = 0$, corresponding to the horizontal segment, is attainable at any $0 \le e \le 1/2$ by graphons describing the possibly asymptotic edge density of subgraphs of complete bipartite graphs (Turán graphon with $2$ classes). For $e \ge 1/2$, the optimal bound was obtained by Razborov Razborov , who established, using the flag algebra calculus, that for $(k-1)/k \le e \le k/(k+1)$ with $k \ge 2$,
$$t \ge \frac{(k-1)\left(k - 2\sqrt{k\left(k - e(k+1)\right)}\right)\left(k + \sqrt{k\left(k - e(k+1)\right)}\right)^2}{k^2 (k+1)^2}. \tag{35}$$
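The lower boundary can be evaluated numerically from the standard closed form of Razborov's bound. The sketch below is illustrative: the function name `razborov_lower_bound` is hypothetical, and the formula is the commonly quoted one for edge density in $[(k-1)/k,\ k/(k+1)]$, which also returns $0$ on $[0, 1/2]$ when $k = 1$.

```python
from math import floor, sqrt

def razborov_lower_bound(e):
    """Minimal triangle density for edge density e in [0, 1]."""
    if not 0.0 <= e <= 1.0:
        raise ValueError("edge density must lie in [0, 1]")
    if e == 1.0:
        return 1.0
    # Pick k with (k - 1)/k <= e < k/(k + 1). The curve is continuous at the
    # junction points, so rounding in this step does not change the value.
    k = floor(1.0 / (1.0 - e))
    root = sqrt(max(0.0, k * (k - e * (k + 1))))
    return (k - 1) * (k - 2 * root) * (k + root) ** 2 / (k ** 2 * (k + 1) ** 2)
```

At the Turán edge densities $1/2$, $2/3$, $3/4$ the bound gives $0$, $2/9$, $3/8$. Between them the curve lies strictly above the chord joining neighboring Turán points, for example `razborov_lower_bound(0.6)` exceeds the chord value $0.6 \cdot 2/9$.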
All the curve segments describing the nontrivial part of the lower boundary of are strictly concave. For , we set , where explicitly,
[TABLE]
is the Turán graphon with classes. Thus
[TABLE]
For , let be the line segment joining vertices and of neighboring Turán graphons. These infinitely many line segments form the convex hull of , and the length of decreases monotonically to zero as gets large.
The normal vectors to ,
[TABLE]
are the critical directions of the edge-triangle model. Let and take . While the vectors (38) are not normalized as in our derivation (see Section II), this can be easily adjusted by adapting . Then Turán graphons with and classes both belong to (which agrees with ), and we concluded in YRF that a typical graph sampled from the model may behave like either a Turán graphon with classes or a Turán graphon with classes, with no clear preference. Though already quite informative, as explained earlier in Section II.1, this result remains somewhat unsatisfactory because it does not indicate whether both such graphons are actually realizable in the limit and in what manner. Notice that underneath our investigation, there is an ordered double asymptotic framework, in the sense that the network size goes to infinity first followed by the divergence of the parameters . In hope of resolving this rather subtle ambiguity within the edge-triangle model, the “other” order was also examined in YRF , where we first let the magnitude of increase to infinity so as to isolate a simpler sub-model and then study its limiting properties as grows. Both ordered asymptotics imply a nearly identical convergence in probability in the cut metric along the noncritical directions. Under the “original” as well as the “reversed” order, there exist (possibly different) subsequences of the form , with , and for , where the edge-triangle model converges to some Turán graphon specified by the direction of the parameters . Additionally, under the “reversed” asymptotics, a detailed categorization of the limiting behavior of the edge-triangle model was obtained: When diverges along the critical direction , a typical sampled graph more likely resembles a Turán graphon with classes than with classes. 
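The critical directions discussed above can be computed directly from the edge and triangle densities of neighboring Turán graphons. The sketch below is an illustration with its own indexing convention, by the number of classes `m`, which may differ from the index used in the paper; the function names are hypothetical. Each normal is scaled so that its first component is $1$, except for the horizontal segment, whose normal is vertical.

```python
from fractions import Fraction

def turan_point(m):
    """(edge density, triangle density) of the Turán graphon with m classes."""
    m = Fraction(m)
    return (1 - 1 / m, (1 - 1 / m) * (1 - 2 / m))

def critical_direction(m):
    """Normal (1, -1/slope) to the chord joining the m- and (m+1)-class Turán points."""
    (e0, t0), (e1, t1) = turan_point(m), turan_point(m + 1)
    slope = (t1 - t0) / (e1 - e0)
    if slope == 0:
        return (Fraction(0), Fraction(-1))   # horizontal chord: vertical normal
    return (Fraction(1), -1 / slope)

def chord_length_squared(m):
    """Squared length of the chord joining the m- and (m+1)-class Turán points."""
    (e0, t0), (e1, t1) = turan_point(m), turan_point(m + 1)
    return (e1 - e0) ** 2 + (t1 - t0) ** 2
```

For instance, the chord between the 2-class and 3-class Turán points has slope $4/3$, giving the direction $(1, -3/4)$, and the chord lengths shrink as the number of classes grows.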
Now that we are equipped with refined perturbation analysis, we would like to sharpen our results under the “original” asymptotics and inquire whether the same type of discontinuity in natural parametrization exists.
Let us make a further remark before carrying out the detailed calculations. In the physics literature, people are often interested in cases where the parameter depends on (some averages need to be satisfied for every ). In these models, in place of the normalization constant (free energy density), the relative entropy plays a central role. Analogous (but more complicated) maximization problems and concentration of measure results have been established, which lead to classifications of ensemble equivalence between the microcanonical ensemble and the canonical ensemble. The perturbative methods explored in the current paper are expected to apply in these general parameter situations. In some cases, the perturbation would still be around random-free graphons, and our argument will run through without much adaptation HMRS . In some other cases, however, the perturbation would be around graphons admitting more intricate structures, and serious future work is needed GHR ; PN .
III.1 Perturbation analysis
Let . We will compare and , where is the flattened out graphon close to (Turán graphon with classes) and is the flattened out graphon close to (Turán graphon with classes). Both and are constructed with the optimal perturbation values (32). We set in our calculations below. For , we compute , , , and for and in (17), where and . As pointed out earlier in Section II, . For notational convenience, from now on we denote by and by . Then and are indicator functions associated with sets of measure and , respectively. Reconfirming our assumption in (17),
[TABLE]
which gives
[TABLE]
[TABLE]
This yields
[TABLE]
[TABLE]
We check the remainder terms for after first order perturbation. Similar analysis will work for and we skip the details. Consider the (internal) tangent cone , consisting of graphons of the type . See Figure 2. Then
[TABLE]
For , we have
[TABLE]
which implies that
[TABLE]
[TABLE]
This yields
[TABLE]
Using (32) and (41) for and , (45) gives . After first order perturbation as in (31), the remainder terms are bounded by
[TABLE]
The following lemma is useful for our asymptotic derivation.
Lemma III.1.
Let be defined as in (12). As ,
[TABLE]
and as ,
[TABLE]
Proof.
We recognize that after simplification, the left hand side of (47) becomes , and the left hand side of (48) becomes . The rest is immediate. ∎
Applying Lemma III.1 to the first order perturbation terms, we see that conforming to our heuristic analysis in Sections II.3 and II.4, the remainder terms are indeed of negligible order. For large enough, we have
[TABLE]
[TABLE]
Since has the smallest absolute value among all equations in (41), both in terms of first order perturbation (49) and the exact value.
Denote by and the exact optimal value of within the (internal) tangent cone and , respectively. Let us compare with . From our heuristic argument, for any and large enough, (34) is satisfied:
[TABLE]
Following similar reasoning, we also find a bound in the other direction:
[TABLE]
Maximizing each first variation term and applying Lemma III.1 as previously, this says that is asymptotically bounded below by
[TABLE]
and above by
[TABLE]
Similar analysis works for and we skip the details. Since can be taken arbitrarily small and has the smallest absolute value among all equations in (41), . As discussed earlier in Section I, by Theorems 3.1 and 3.2 in Chatterjee and Diaconis CD2013 , we conclude that when diverges along the critical direction , a typical sampled graph more likely resembles a Turán graphon with classes than with classes. See Table 1 and Figure 3.
Theorem III.2.
Consider the edge-triangle exponential random graph model, obtained by setting in (10) an edge and a triangle. For , let , where is the critical direction (38) and is sufficiently large. Then in the large limit, a typical graph drawn from the model behaves like a Turán graphon with classes,
[TABLE]
III.2 Geometric interpretation
We proceed further and examine the effect of infinitesimal perturbation on the associated edge and triangle densities. From (42) and (44),
[TABLE]
[TABLE]
Let us present another perspective on this calculation incorporating assumption (17), which may be employed to derive infinitesimal variations for more complicated homomorphism densities. The idea appeared in our heuristic analysis before and we make it explicit here. For any so that (17) is satisfied,
[TABLE]
[TABLE]
Using , and (40), we recover the partial derivatives calculated above. In particular, we recognize that points along the left tangent line and points along the right tangent line, while the critical direction is the normal vector to the line segment that connects neighboring vertices and . This offers a geometric justification of the () signs in (41).
Recall that the lower boundary of attainable edge-triangle densities is a piecewise algebraic curve with infinitely many concave pieces (35), and the connection point of and is . See Figure 1. We compute
[TABLE]
which implies that the left and right hand side derivatives at are respectively given by
[TABLE]
The partial derivative vectors in (55) thus delineate the boundary of the (internal) tangent cone . In other words, spans all possible infinitesimal variations at the Turán graphon with classes. See Figure 2. Since the optimizing graphon associated with large enough must lie within the tangent cone of some Turán graphon, it may be represented by a linear combination of Erdős-Rényi and Turán graphons. Even though the graphon representation may not be unique, optimizing over all possible combinations provides insight into the structure of the maximizing set. Keeping track of the Erdős-Rényi and Turán characteristics in the edge-triangle model is thus not only an effective but also sufficient method to estimate the normalization constant, and gives evidence of discontinuity of the natural parametrization along the critical directions in the limit as and then tend to infinity. This demonstrates the occurrence of discontinuous phase transitions in the edge-triangle model.
Acknowledgements
The authors are very grateful to the anonymous referee for the invaluable suggestions that greatly improved the quality of this paper. Mei Yin thanks Sukhada Fadnavis for helpful conversations. Rajinder Mavi was supported by a postdoctoral fellowship from the Michigan State University Institute for Theoretical and Mathematical Physics. Mei Yin’s research was partially supported by NSF grant DMS-1308333.
References
- (1) Ahlswede, R., Katona, G.O.H.: Graphs with maximal number of adjacent pairs of edges. Acta Math. Acad. Sci. Hungar. 32, 97-120 (1978)
- (2) Aldous, D.: Representations for partially exchangeable arrays of random variables. J. Multivariate Anal. 11, 581-598 (1981)
- (3) Aldous, D., Lyons, R.: Processes on unimodular random networks. Electron. J. Probab. 12, 1454-1508 (2007)
- (4) Aldous, D., Steele, J.M.: The objective method: Probabilistic combinatorial optimization and local weak convergence. In: Kesten, H. (ed.) Probability on Discrete Structures, pp. 1-72. Springer, Berlin (2004)
- (5) Aristoff, D., Zhu, L.: Asymptotic structure and singularities in constrained directed graphs. Stochastic Process. Appl. 125, 4154-4177 (2015)
- (6) Benjamini, I., Schramm, O.: Recurrence of distributional limits of finite planar graphs. Electron. J. Probab. 6, 1-13 (2001)
- (7) Borgs, C., Chayes, J., Cohn, H., Zhao, Y.: An $L^{p}$ theory of sparse graph convergence I. Limits, sparse random graph models, and power law distributions. arXiv preprint, arXiv:1401.2906 (2014)
- (8) Borgs, C., Chayes, J., Cohn, H., Zhao, Y.: An $L^{p}$ theory of sparse graph convergence II. LD convergence, quotients, and right convergence. arXiv preprint, arXiv:1408.0744 (2014)
