Threshold functions for small subgraphs: an analytic approach
Gwendal Collet, Élie de Panafieu, Danièle Gardy, Bernhard Gittenberger, Vlady Ravelomanana

TL;DR
This paper introduces an analytic combinatorics approach to count small subgraphs in random graphs, considering degree constraints and overlaps, extending previous work with detailed proofs.
Contribution
It develops a new analytic method using patchwork techniques to accurately count subgraphs in constrained random graphs, providing rigorous proofs.
Findings
Effective counting method for small subgraphs
Handles degree constraints in random graphs
Provides detailed proofs and extended analysis
Abstract
We revisit the problem of counting the number of copies of a fixed graph in a random graph or multigraph, including the case of constrained degrees. Our approach relies heavily on analytic combinatorics and on the notion of patchwork to describe the possible overlapping of copies. This paper is a version, extended to include proofs, of the paper with the same title to be presented at the Eurocomb 2017 meeting.
Gwendal Collet: Institute of Discrete Mathematics and Geometry, TU Wien (Austria). Partially supported by the Austrian Science Foundation FWF, grant SFB F50-02.
Élie de Panafieu: Nokia Bell Labs and LINCS (France). This work was partially funded by the Austrian Science Fund (FWF) grant F5004, the Amadeus program and the PEPS HYDrATA.
Danièle Gardy: DAVID Laboratory, University of Versailles Saint Quentin (France). Partially supported by the Amadeus project 33697ZK Threshold problems and phase transitions in graph-like structures (2015–16), by the PICS project Constraint analysis through analytic combinatorics (2017–19), and by the ANR-MOST MetaConc (2015–19).
Bernhard Gittenberger: Institute of Discrete Mathematics and Geometry, TU Wien, Wiedner Hauptstr. 8–10/104, 1040 Wien, Austria. Partially supported by the Austrian Science Foundation FWF, grant SFB F50-03 and the ÖAD grant Amadée F01/2015.
Vlady Ravelomanana: IRIF, University of Paris 7 (France). Partially supported by the Amadeus project 33697ZK Threshold problems and phase transitions in graph-like structures (2015–16), by the project Combinatorics in Paris (2014–17) and by the PICS project Constraint analysis through analytic combinatorics (2017–19).
Keywords. random graphs, subgraphs, analytic combinatorics, generating functions.
1 Introduction
Since the introduction of the random graph models $G(n,m)$ and $G(n,p)$ by Erdős and Rényi [8] in 1960, one of the most studied parameters is the number $G[F]$ of subgraphs isomorphic to a given graph $F$. By the asymptotic equivalence between $G(n,m)$ and $G(n,p)$, results from one model can be rigorously translated into the other one. Erdős and Rényi derived the threshold for $G[F] > 0$ when $F$ is a strictly balanced graph (see the definition below), and Bollobás [4] generalized their result to any graph $F$. Ruciński [14] proved that $G[F]$ is asymptotically normal beyond the threshold, and follows a Poisson law at the threshold if and only if $F$ is strictly balanced. Then Janson, Oleszkiewicz and Ruciński [11] developed a moment-based method for estimating . The notion of strongly balanced graphs, introduced by Ruciński and Vince in [15], plays a key role in obtaining the results mentioned above.
Recently, there has been increasing interest in the study of constrained random graphs, such as graphs with given degree sequences or regular graphs; the number of given subgraphs in such structures has also been studied. For example, Wormald [18] proved that the number of short cycles in these structures asymptotically follows a Poisson distribution; using a multi-dimensional saddle-point approach, McKay [13] studied the structure of a random graph with given degree sequence, including the probability of a given subgraph or induced subgraph.
Our goal is to revisit (part of) these results through analytic combinatorics and extensive use of generating functions (g.f.). Ours is not the first paper that approaches graph problems with these tools. Early such work was by McKay and Wormald (see, e.g., [12] for the enumeration of graphs with a specified degree sequence); an important development was the study of planar graphs by Giménez and Noy [10], followed by several papers in the same direction; see also a recent paper by Drmota, Ramos and Rué [7] about the limiting distribution of the number of copies of a subgraph in subcritical graphs.
In the rest of this section, we give formal definitions of our model and the objects we are interested in. Then we address the problem of evaluating the number of subgraphs, applying analytic combinatorics tools, i.e. generating function manipulations, in Section 2. Finally some of those results are extended to graphs and multigraphs with degree constraints in Section 3.
Model and definitions.
Most of the following definitions come from [8] and [4]. A graph $G$ is a pair $(V(G), E(G))$, where $V(G)$ denotes the set of labeled vertices, and $E(G)$ the set of edges. Each edge is an unoriented pair of distinct vertices, thus loops and multiple edges are forbidden. An $(n,m)$-graph is a graph with $n$ vertices, labeled from $1$ to $n$, and $m$ edges. A graph $F$ is a subgraph of $G$ if $V(F) \subset V(G)$ and $E(F) \subset E(G)$. We then write $F \subset G$. Two graphs $F$, $G$ are isomorphic if there exists a bijection from $V(F)$ to $V(G)$ that induces a bijection between $E(F)$ and $E(G)$. An $F$-graph is a graph isomorphic to $F$, an $F$-subgraph of $G$ is a subgraph of $G$ that is an $F$-graph, and $G[F]$ denotes the number of subgraphs of $G$ that are $F$-graphs. Given a graph family $\mathcal{F}$, an $\mathcal{F}$-graph is an $F$-graph for some $F \in \mathcal{F}$. The density $d(G)$ of a graph $G$ is the ratio between its number of edges and of vertices. A graph is strictly balanced if its density is larger than the density of its strict subgraphs. The essential density $d^{\star}(G)$ of $G$ is the highest density of its subgraphs
$$d^{\star}(G) = \max_{H \subset G} d(H).$$
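The definitions of density, essential density and strict balance can be checked by brute force on small graphs. The following is a minimal sketch, not part of the paper: the function names are illustrative, graphs are given as a vertex list plus a list of edge pairs, and the graph is assumed to have at least one edge. It relies on the observation that, for a fixed vertex subset, the induced subgraph has the most edges, so scanning induced subgraphs is enough.

```python
from fractions import Fraction
from itertools import combinations


def induced_edge_count(subset, edges):
    """Number of edges with both endpoints in `subset`."""
    chosen = set(subset)
    return sum(1 for u, v in edges if u in chosen and v in chosen)


def density(vertices, edges):
    """Ratio between the number of edges and the number of vertices."""
    return Fraction(len(edges), len(vertices))


def essential_density(vertices, edges):
    """Highest density over all subgraphs (induced subgraphs suffice)."""
    return max(
        Fraction(induced_edge_count(subset, edges), size)
        for size in range(1, len(vertices) + 1)
        for subset in combinations(vertices, size)
    )


def is_strictly_balanced(vertices, edges):
    """True if every strict subgraph is strictly less dense than the graph.

    Assumes at least one edge: subgraphs on the full vertex set with fewer
    edges are then automatically less dense, so only proper vertex subsets
    need to be scanned.
    """
    whole = density(vertices, edges)
    for size in range(1, len(vertices)):
        for subset in combinations(vertices, size):
            if Fraction(induced_edge_count(subset, edges), size) >= whole:
                return False
    return True


# A triangle, and a complete graph on 4 vertices with one pendant edge.
triangle = ([0, 1, 2], [(0, 1), (1, 2), (0, 2)])
k4_pendant = (
    [0, 1, 2, 3, 4],
    [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)],
)
```

In this sketch the triangle is strictly balanced, while the second graph is not: its densest subgraph is the complete graph on 4 vertices, which is denser than the whole graph.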
To any graph family $\mathcal{F}$, we associate the generating function
$$F(z,w) = \sum_{n,m} F_{n,m} \, w^m \frac{z^n}{n!},$$
where $F_{n,m}$ denotes the number of $(n,m)$-graphs isomorphic to a graph from $\mathcal{F}$.
2 Number of subgraphs in a random graph
Graphs with one distinguished subgraph.
Theorem 1.
The number of $(n,m)$-graphs where one $\mathcal{F}$-subgraph is distinguished is
[TABLE]
where the asymptotics holds when is an entire function, tends to infinity with while , and converges uniformly on any compact set to an analytic function.
Proof.
A graph on $n$ vertices where one $\mathcal{F}$-subgraph is distinguished is a copy of a graph from $\mathcal{F}$, a set of additional vertices, and a set of additional edges. Those edges can link any pair from the $n$ vertices, except the pairs already linked in the distinguished subgraph. The Symbolic Method (see [9]) translates this combinatorial description into the generating function expression of the theorem. The asymptotics is then extracted using a saddle-point method. ∎
Let $H$ denote a densest subgraph of $F$. Theorem 1 is now applied with $\mathcal{F}$ equal to the family of the $H$-graphs. Dividing both sides of Equation (1) by the total number of $(n,m)$-graphs, we obtain a new proof for the following classic result of [8, 4].
Corollary 1.
Denote by and the number of edges and density of a densest subgraph of , and consider a random -graph with for some fixed , then
[TABLE]
Thus, for any , almost surely.
Graphs with marked subgraphs.
Given a graph , an -patchwork is a set of distinct -graphs that might share vertices and edges, and such that the pair is a graph, denoted by . This notion is illustrated in Figure 1.
Let denote the number of -patchworks that are composed of -graphs, and such that is an -graph. Then the generating function of -patchworks is defined as
[TABLE]
Theorem 2.
The number of $(n,m)$-graphs that contain exactly $t$ $F$-subgraphs is
[TABLE]
Proof.
We introduce the generating function of -graphs, where the variable marks the total number of -subgraphs
[TABLE]
and apply the following inclusion-exclusion argument. is the generating function of -graphs where each -subgraph is either marked, or left unmarked. By definition, the set of marked -subgraphs is a patchwork, which is distinguished in the graph. Thus, we can apply Theorem 1, where is replaced by
[TABLE]
We then replace with and extract the coefficient . ∎
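The inclusion-exclusion step of this proof can be illustrated on a small case. The sketch below is an illustration under stated assumptions, not the authors' computation: it takes $F$ to be a triangle, writes $u$ for the marking variable (a symbol chosen here), and enumerates all graphs with `n` vertices and `m` edges. The list `marked[j]` counts pairs (graph, set of `j` marked triangles), that is, the coefficients of $\sum_G (1+u)^{G[F]}$; substituting $u - 1$ for $u$ then recovers the number of graphs with exactly `t` triangles.

```python
from itertools import combinations
from math import comb


def triangle_count(n, edge_set):
    """Number of triangles in a graph on vertices 0..n-1."""
    return sum(
        1
        for a, b, c in combinations(range(n), 3)
        if {(a, b), (a, c), (b, c)} <= edge_set
    )


def exact_counts(n, m):
    """Return (exact, direct, marked) for triangles in graphs with n vertices, m edges.

    marked[j]: number of graphs with j marked triangles (others unmarked).
    exact[t]:  number of graphs with exactly t triangles, by inclusion-exclusion.
    direct[t]: the same quantity, counted directly.
    """
    pairs = list(combinations(range(n), 2))
    counts = [triangle_count(n, set(chosen)) for chosen in combinations(pairs, m)]
    top = max(counts)
    marked = [sum(comb(c, j) for c in counts) for j in range(top + 1)]
    # Coefficient of u^t after the substitution u -> u - 1.
    exact = [
        sum((-1) ** (j - t) * comb(j, t) * marked[j] for j in range(t, top + 1))
        for t in range(top + 1)
    ]
    direct = [counts.count(t) for t in range(top + 1)]
    return exact, direct, marked


exact, direct, marked = exact_counts(5, 6)
```

Here `marked[1]` is the number of graphs with one distinguished triangle, which is the quantity handled by Theorem 1 for a single-graph family.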
For a general graph , we do not have an explicit expression for the generating function of -patchworks. However, partial information is enough to address some interesting problems. The following theorem was first derived by [4].
Theorem 3.
Let denote a strictly balanced graph of density , with edges and automorphisms, and assume for some positive constant . Then the number of -subgraphs in a random -graph follows a Poisson limit law of parameter , i.e. for any nonnegative integer ,
[TABLE]
Proof.
As observed by [8], since is strictly balanced, any graph containing two non-disjoint -graphs has a higher essential density than . According to Corollary 1, the random graph of the theorem almost surely contains no such -subgraph. Following this intuition, one can prove that only patchworks of disjoint -graphs have a nonnegligible contribution in Equation (2). So we replace with and obtain
[TABLE]
where the asymptotics comes from Theorem 1. Dividing by the total number of -graphs and observing finishes the proof. ∎
3 Small subgraphs in graphs with degree constraints
We consider now -graphs, which are -graphs where all vertices have their degree in the set . In the following, contains at least two integers. We restrict our study to the case where goes to infinity with in such a way that has a limit in . Since the sum of the degrees is twice the number of edges, if reaches one of those bounds, the corresponding -graphs are regular (a case already treated in the literature), while if is outside the interval, there exist no -graphs. Finally, to shorten the theorems, we assume . The generating function of the set is , and we define as the unique positive solution (see Note IV.46 of [9]) of
[TABLE]
As observed by [1, 16, 3, 6], multigraphs are easier to analyze than graphs when considering degree constraints. A multigraph is a pair where denotes the set of labeled vertices, and the set of labeled oriented edges. Each edge is thus an oriented pair of vertices, and loops and multiple edges are allowed. The definitions on graphs are extended naturally to multigraphs. Given a multigraph family , let denote the number of -multigraphs with vertices of degree , for all , that are isomorphic to some multigraph from . We associate to the family the generating function
[TABLE]
Theorem 4.
With the previous notations and conventions, given a multigraph family , the number of -multigraphs where one -subgraph is distinguished is
[TABLE]
If converges uniformly on any compact set to an analytic function, and denotes the total number of -multigraphs, then the asymptotics of Equation (3) is
[TABLE]
Proof.
This result is obtained as a combination of the proof of Theorem 1 and the work of [6]. In particular, those authors derived an expression for the number of -multigraphs
[TABLE]
For the asymptotics, the sum is rewritten as an integral
[TABLE]
and a saddle-point method is applied. ∎
A direct consequence of the previous theorem is the counterpart of Corollary 1.
Corollary 2.
Denote by and the number of edges and density of a densest subgraph of the multigraph , and consider a random -multigraph then
[TABLE]
As stated at the beginning of this section, we consider random -multigraphs with a number of edges that grows linearly with the number of vertices. In that case, as a finite positive limit. Thus, the condition of the following theorem is satisfied only when is a cycle. However, in a future extension of this work, we plan to consider the case where goes to infinity (when is infinite). In this more general setting, other subgraphs than cycles will appear, but the condition should remain as stated here.
Theorem 5.
Let denote a strictly balanced -multigraph with automorphisms. Assuming that goes to infinity with in such a way that
[TABLE]
has a positive limit, denoted by , then the number of -subgraphs in a random -multigraph follows a Poisson limit law of parameter .
Proof.
The generating function of -multigraphs is
[TABLE]
As in the proof of Theorem 2, we replace, in Equation (3), the generating function of the multigraph family with the generating function of -patchworks. For the same reason as in Theorem 3, this generating function is then approximated by . Thus, the asymptotic number of -multigraphs with exactly -subgraphs is
[TABLE]
Its limit is extracted using the second part of Theorem 4. ∎
There are $2^m m!$ ways to orient and label the edges of a graph with $m$ edges. Thus, each graph matches $2^m m!$ multigraphs. Conversely, consider a multigraph family $\mathcal{F}$, stable by multigraph automorphisms, where each multigraph has $m$ edges, and contains neither loops nor multiple edges. Then $\mathcal{F}$ can be partitioned into sets of sizes $2^m m!$, each corresponding to a graph. Thus, as proven by [6], counting graphs with degree constraints can be achieved by removing loops and double edges from multigraphs with degree constraints. The following theorem describes the small subgraphs of $(n,m,D)$-graphs, when . It has been derived in the particular case of regular graphs by [2] and [17], and of graphs with degrees $1$ or $2$ by [5].
Theorem 6.
Consider a random -graph that satisfies the conditions stated at the beginning of the section.
- Any connected graph that is neither a tree nor a unicycle is asymptotically almost surely not a subgraph of .
- Denoting by a cycle of length , and with a fixed integer , then are asymptotically independent Poisson random variables of means
[TABLE]
Appendix A Analytic combinatorics
Symbolic method.
The book of [9], available online, provides an excellent introduction to the techniques of analytic combinatorics. The main idea is to associate to any combinatorial family $\mathcal{A}$ of labeled objects a generating function
$$A(z) = \sum_{n \ge 0} a_n \frac{z^n}{n!},$$
where $a_n$ denotes the number of objects of size $n$ in $\mathcal{A}$. In the present article, to express the generating function of interesting combinatorial families, we apply the following dictionary to translate combinatorial relations between the families into analytic equations on their generating functions:
- Disjoint union. If $\mathcal{A} \cap \mathcal{B} = \emptyset$ and $\mathcal{C} = \mathcal{A} \cup \mathcal{B}$, then $C(z) = A(z) + B(z)$.
- Relabeled Cartesian product. In the relabeled Cartesian product $\mathcal{C} = \mathcal{A} \times \mathcal{B}$, we consider all relabellings of the pairs $(a, b)$, with $a \in \mathcal{A}$ and $b \in \mathcal{B}$, so that each label, from $1$ to the sum of the sizes of $a$ and $b$, appears exactly once. We then have
$$C(z) = A(z) B(z).$$
- Set. A set of objects from $\mathcal{A}$ has generating function $e^{A(z)}$.
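The dictionary above can be made concrete at the level of counting sequences. The following sketch is illustrative only (the function names are not from the paper): each family is represented by the list of its counts $a_0, a_1, \dots$, so that a product of exponential generating functions becomes a binomial convolution, and the Set construction is computed through the standard recurrence obtained by differentiating $C(z) = e^{A(z)}$, assuming $a_0 = 0$.

```python
from math import comb, factorial


def egf_union(a, b):
    """Counts for a disjoint union: C(z) = A(z) + B(z)."""
    return [x + y for x, y in zip(a, b)]


def egf_product(a, b):
    """Counts for a relabeled product: C(z) = A(z) B(z).

    Choosing which k of the n labels go to the first component gives a
    binomial convolution of the two counting sequences.
    """
    size = min(len(a), len(b))
    return [
        sum(comb(n, k) * a[k] * b[n - k] for k in range(n + 1))
        for n in range(size)
    ]


def egf_set(a):
    """Counts for sets of objects: C(z) = exp(A(z)), assuming a[0] == 0.

    From C'(z) = A'(z) C(z): c_n = sum_k binom(n-1, k-1) a_k c_{n-k}.
    """
    if a[0] != 0:
        raise ValueError("the Set construction needs no object of size 0")
    c = [1]
    for n in range(1, len(a)):
        c.append(sum(comb(n - 1, k - 1) * a[k] * c[n - k] for k in range(1, n + 1)))
    return c


# Sets of isolated vertices: one object of each size, generating function e^z.
isolated = [1] * 6
# Nonempty sets of vertices, and cyclic arrangements of n labels.
nonempty = [0] + [1] * 5
cycles = [0] + [factorial(n - 1) for n in range(1, 6)]
```

As a sanity check in this sketch, sets of cyclic arrangements give the $n!$ permutations, and sets of nonempty sets give the set partitions.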
Laplace and saddle-point methods.
We use in our proofs a simple case of the Laplace method (see e.g. the book Analytic Combinatorics in Several Variables of Pemantle and Wilson, 2013).
Lemma 1.
Consider two entire functions and , where is a positive function that reaches its unique maximum at a point , and . Then on any open interval (finite or infinite) that contains , we have
[TABLE]
whenever the integral is well defined.
The saddle-point method is a technique to compute the asymptotics of the coefficients of a generating function. The coefficient extraction is written as a Cauchy integral, on which a Laplace method is applied. There exist many variations of this technique. We will use here the following lemma, which is a particular case of Theorem VIII.8 from [9].
Lemma 2.
Consider two entire functions and , and a sequence of integers such that has a positive finite limit . Assume there exists a positive solution to the equation
[TABLE]
such that and . Then
[TABLE]
Appendix B Proof of Theorem 1
Exact expression.
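Let us first prove the theorem for a family $\mathcal{F}$ that is composed of the graphs isomorphic to some $(k,\ell)$-graph $F$, that has a number $a$ of automorphisms. Then the number of $F$-graphs is $k!/a$, and the generating function of $\mathcal{F}$ is
$$F(z,w) = \frac{k!}{a} \, w^{\ell} \frac{z^k}{k!} = \frac{1}{a} \, w^{\ell} z^{k}.$$
A graph with $n$ vertices and one $F$-subgraph distinguished can be decomposed as an $F$-graph and a set of isolated vertices, plus some edges. Applying the Symbolic Method, the generating function of an $F$-graph and a set of isolated vertices is
$$F(z,w) \, e^{z}.$$
If we assume that there are $n$ vertices, we extract the coefficient $n![z^n]$ and obtain
$$n![z^n] F(z,w) \, e^{z} = \frac{n!}{a \, (n-k)!} \, w^{\ell}.$$
Then each pair of the $n$ vertices can be linked by an edge, except the pairs already linked in the $F$-graph. Thus, the number of edges that can be added is $\binom{n}{2} - \ell$. For each of those, we decide either to add it, or to not add it. Thus, the generating function of graphs on $n$ vertices with a distinguished $F$-graph, additional vertices, and additional edges, is
$$n![z^n] F(z,w) \, e^{z} \, (1+w)^{\binom{n}{2} - \ell}.$$
Replacing $F(z,w)$ by its expression, we obtain
$$\frac{n!}{a \, (n-k)!} \, w^{\ell} (1+w)^{\binom{n}{2} - \ell} = n![z^n] F\!\left(z, \frac{w}{1+w}\right) e^{z} \, (1+w)^{\binom{n}{2}}.$$
Finally, we fix the number of edges to $m$, and obtain for the number of $(n,m)$-graphs where one $F$-graph is distinguished the formula
$$n![z^n w^m] F\!\left(z, \frac{w}{1+w}\right) e^{z} \, (1+w)^{\binom{n}{2}}.$$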
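The decomposition above gives a closed form for a single graph: choose the copy of the subgraph among the $n$ vertices, then choose the remaining edges among the pairs it does not use. The sketch below compares that closed form with a brute-force enumeration on small cases. It is an illustration under assumptions: the symbols `k`, `ell` and `aut` (vertices, edges and automorphisms of the fixed graph) and all function names are chosen here, and copies are counted as edge-preserving injective maps divided by the number of automorphisms.

```python
from itertools import combinations, permutations
from math import comb, factorial


def embeddings(k, f_edges, n, g_edges):
    """Injective maps from {0..k-1} to {0..n-1} sending edges of F to edges of G."""
    return sum(
        1
        for image in permutations(range(n), k)
        if all(frozenset((image[u], image[v])) in g_edges for u, v in f_edges)
    )


def automorphisms(k, f_edges):
    """Size of the automorphism group of F, by brute force."""
    return embeddings(k, f_edges, k, {frozenset(e) for e in f_edges})


def distinguished_brute_force(n, m, k, f_edges):
    """Number of (graph, F-subgraph) pairs over all graphs with n vertices, m edges."""
    aut = automorphisms(k, f_edges)
    pairs = [frozenset(p) for p in combinations(range(n), 2)]
    return sum(
        embeddings(k, f_edges, n, set(chosen)) // aut
        for chosen in combinations(pairs, m)
    )


def distinguished_formula(n, m, k, ell, aut):
    """Copies of F on n labeled vertices, times the choices of the other edges."""
    copies = factorial(n) // (aut * factorial(n - k))
    return copies * comb(comb(n, 2) - ell, m - ell)


path = [(0, 1), (1, 2)]            # path on 3 vertices, 2 automorphisms
triangle = [(0, 1), (1, 2), (0, 2)]  # triangle, 6 automorphisms
```

Comparing `distinguished_brute_force(5, 4, 3, path)` with `distinguished_formula(5, 4, 3, 2, 2)` exercises both sides on the same instance.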
Now if is a general graph family, its generating function is a sum, for each graph that has at least one isomorphic copy in , of the generating function of the -graphs
[TABLE]
By linearity, we obtain for the number of -graph where one -graph is distinguished the formula
[TABLE]
Asymptotics.
We now apply a bivariate saddle-point method to extract the asymptotics (see [9]). In the previous expression, we apply the changes of variables
[TABLE]
and obtain
[TABLE]
Since the function
[TABLE]
converges uniformly to an analytic function , we have . Furthermore, the function
[TABLE]
converges uniformly to as well, because . Thus, there exists a sequence of analytic functions converging uniformly to [math] such that
[TABLE]
The expression of the number of -graphs with one -subgraph distinguished becomes
[TABLE]
By an argument, there exists a sequence of analytic functions converging uniformly to [math] such that
[TABLE]
so the expression becomes
[TABLE]
According to the saddle-point method, the asymptotics is the same as the asymptotics of
[TABLE]
Appendix C Proof of Corollary 1
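Let $G$ denote a random $(n,m)$-graph. If $H$ is a subgraph of $F$, then $G$ contains $F$ only if it contains $H$, so
$$\mathbb{P}(F \subset G) \le \mathbb{P}(H \subset G).$$
Assume that $H$ is a densest subgraph of $F$, with $k$ vertices, $\ell$ edges, and a group of symmetries of size $a$. Then there exist $k!/a$ $H$-graphs, so the generating function of the $H$-graphs is
$$\frac{1}{a} \, w^{\ell} z^{k}.$$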
The expected number of -subgraphs in a random -graph is equal to the number of -graphs where one -subgraph is distinguished, divided by the total number of -graphs. Applying Theorem 1, we obtain, for with ,
[TABLE]
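The ratio described in this appendix can be evaluated exactly for a single graph and compared with a first-order estimate. The sketch below is a hedged illustration: the symbols `k`, `ell`, `aut` and the function names are chosen here, and the estimate $n^k (2m/n^2)^{\ell} / a$ is a heuristic obtained by replacing $\binom{n}{2}$ with $n^2/2$ and falling factorials with powers; it is not quoted from the paper.

```python
from fractions import Fraction
from math import comb, factorial


def expected_copies(n, m, k, ell, aut):
    """Exact expected number of copies in a uniform graph with n vertices, m edges.

    Number of graphs with one distinguished copy, divided by the total
    number of graphs with n vertices and m edges.
    """
    pairs = comb(n, 2)
    distinguished = Fraction(factorial(n), aut * factorial(n - k)) * comb(pairs - ell, m - ell)
    return distinguished / comb(pairs, m)


def first_order_estimate(n, m, k, ell, aut):
    """Heuristic estimate n^k (2m/n^2)^ell / aut (an assumption of this sketch)."""
    return n ** k * (2 * m / n ** 2) ** ell / aut


# Triangles (k = 3, ell = 3, aut = 6) in a graph with 200 vertices and 300 edges.
exact = expected_copies(200, 300, 3, 3, 6)
ratio = float(exact) / first_order_estimate(200, 300, 3, 3, 6)
```

The number of edges here grows linearly with the number of vertices, which for a triangle (density $1$) is the regime where the estimate stays bounded.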
Appendix D Properties of
Consider the random variable that takes the value with probability proportional to , for each from . Then
[TABLE]
Thus, the function is strictly increasing, and it maps to . Conversely, when has a limit in , then , defined implicitly by the relation
[TABLE]
has a positive limit.
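The monotonicity argument of this appendix suggests a simple numerical procedure. The sketch below rests on assumptions that the surviving text does not spell out: the degree set is written `degrees`, its generating function is taken to be $\Delta(x) = \sum_{d} x^d / d!$ over the allowed degrees, and the implicit relation is taken to be $x \Delta'(x) / \Delta(x) = 2m/n$, so that the left-hand side is the mean of a random degree with weights $x^d / d!$. Under those assumptions the solution can be found by bisection, since the mean increases from the smallest to the largest allowed degree.

```python
from math import factorial


def mean_degree(x, degrees):
    """Mean of d under weights x^d / d!, i.e. x * Delta'(x) / Delta(x)."""
    weights = {d: x ** d / factorial(d) for d in degrees}
    total = sum(weights.values())
    return sum(d * w for d, w in weights.items()) / total


def solve_chi(degrees, target, tol=1e-12):
    """Positive x with mean_degree(x) == target, by bisection.

    `target` plays the role of 2m/n and must lie strictly between the
    smallest and the largest allowed degree.
    """
    if not min(degrees) < target < max(degrees):
        raise ValueError("target must lie strictly between min and max degree")
    lo, hi = 1e-9, 1.0
    while mean_degree(hi, degrees) < target:  # grow the bracket
        hi *= 2
    while hi - lo > tol * max(1.0, hi):
        mid = (lo + hi) / 2
        if mean_degree(mid, degrees) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


chi = solve_chi({1, 2}, 1.5)
```

For degrees $1$ and $2$ the relation reduces to $(1+x)/(1+x/2) = 2m/n$, which can be solved by hand and gives a check on the bisection.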
Appendix E Proof of Theorem 4
First part of the theorem.
For completeness, we start by recalling a result from [6]. Each edge of a multigraph is oriented and labeled. Thus, it can be represented as a triplet $(u, v, e)$, where $u$ and $v$ denote the two linked vertices, and $e$ the label of the edge. The edge can then be cut into two labeled half-edges, the first one hanging from $u$ and wearing the label $2e-1$, the second one hanging from $v$ and wearing the label $2e$. Cutting all the edges of a multigraph into half-edges, we obtain a bijection between the $(n,m,D)$-multigraphs, and the sets of $n$ labeled vertices, each coming with a set of size in $D$ of labeled half-edges, such that the total number of half-edges is $2m$. Thus the number of $(n,m,D)$-multigraphs is
$$(2m)! \, [x^{2m}] \Big( \sum_{d \in D} \frac{x^d}{d!} \Big)^{n}.$$
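The half-edge bijection can be checked numerically on small cases. The sketch below is illustrative and makes its assumptions explicit: a multigraph with `m` labeled oriented edges on `n` labeled vertices is encoded as a sequence of `2 * m` endpoints, the degree of a vertex is the number of half-edges attached to it (a loop counts twice), and the closed form is taken to be $(2m)!$ times the coefficient of $x^{2m}$ in the $n$-th power of $\sum_{d} x^d / d!$ over the allowed degrees. Function names are chosen here.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def count_by_half_edges(n, m, degrees):
    """(2m)! [x^(2m)] (sum over allowed d of x^d / d!)^n, with exact arithmetic."""
    size = 2 * m + 1
    delta = [Fraction(1, factorial(d)) if d in degrees else Fraction(0) for d in range(size)]
    power = [Fraction(1)] + [Fraction(0)] * (size - 1)
    for _ in range(n):
        # Truncated polynomial product power * delta.
        power = [sum(power[i] * delta[j - i] for i in range(j + 1)) for j in range(size)]
    return int(factorial(2 * m) * power[2 * m])


def count_by_brute_force(n, m, degrees):
    """Enumerate all sequences of 2m endpoints and keep those with allowed degrees."""
    total = 0
    for endpoints in product(range(n), repeat=2 * m):
        degree = [0] * n
        for v in endpoints:
            degree[v] += 1
        if all(d in degrees for d in degree):
            total += 1
    return total
```

For 3 vertices, 2 edges and degrees in `{1, 2}`, the only possible degree sequence is a permutation of (2, 1, 1), which makes the count easy to verify by hand.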
We now combine this half-edges construction with the proof of Theorem 1. Let denote a multigraph from , with vertices and edges, and assume its automorphism group (both on vertices and edges) has size . Then the number of -multigraphs is , and the generating function of the -multigraphs is
[TABLE]
An -multigraph where an -subgraph is distinguished can be uniquely decomposed as an -multigraph, a set of additional vertices, and a set of labeled half-edges, each linked to a vertex, and so that the number of half-edges and edges on each vertex is an integer from the set . Hence, the number of half-edges attached to one of the additional vertices is in , while the number of half-edges attached to a vertex of degree from the -multigraph is in the set , shifted by . The generating function of multigraphs with degrees in , where one -subgraph is distinguished, is then
[TABLE]
Using the decomposition
[TABLE]
and extracting the coefficient concludes the proof of the first part of the theorem.
Asymptotics.
Using the classic formula
[TABLE]
we rewrite the sum of the previous expression as an integral
[TABLE]
where switching the sum and the integral is licit because we are working with entire analytic functions. To obtain the number of -multigraphs where one -subgraph is distinguished, we extract the coefficient from the previous expression and multiply by
[TABLE]
To simplify the saddle-point method, we apply successively the following changes of variables
[TABLE]
The expression becomes
[TABLE]
The result follows from the saddle-point and Laplace methods.
Appendix F Proof of Theorem 6
Second point.
The generating function of cycles (graphs or multigraphs) of length is
[TABLE]
Observe that cycles of length are loops, and cycles of length are double edges. Corollary 2 proves that in asymptotically almost all -multigraphs, the cycles of length at most are disjoint. Thus, the generating function of patchworks of cycles, where the variable marks the cycles of length , can be approximated as
[TABLE]
Applying Theorem 4, the asymptotic number of -multigraphs that contain exactly cycles of length for all is
[TABLE]
Choosing , we obtain -multigraphs without loops and double edges. Each -graph containing cycles of length for all , matches exactly such multigraphs. So their asymptotic number is
[TABLE]
where
[TABLE]
is equal to the number of -graphs (a result already derived by [6]). Hence, the limit probability for a random -graph to contain exactly cycles of length , for all , is
[TABLE]
First point.
If the connected multigraph is neither a tree nor a cycle, then its essential density is greater than . Corollary 2 implies that a random -multigraph asymptotically almost surely contains no copy of . We saw in the previous paragraph that the probability, for a random -multigraph, to contain no loops or double edge has a positive limit
[TABLE]
Since almost all -multigraphs contain no -sub(multi)graph, almost all -graphs contain no -subgraph as well.
References
- [1] E. A. Bender and E. Canfield. The asymptotic number of labeled graphs with given degree sequences. Journal of Combinatorial Theory, Series A, 24(3):296–307, 1978.
- [2] B. Bollobás. A probabilistic proof of an asymptotic formula for the number of labelled regular graphs. European Journal of Combinatorics, 1:311–316, 1980.
- [3] B. Bollobás. Random Graphs. Cambridge Studies in Advanced Mathematics, 1981.
- [4] B. Bollobás. Threshold functions for small subgraphs. Mathematical Proceedings of the Cambridge Philosophical Society, 90:197–206, 1981.
- [5] N. Broutin and E. de Panafieu. Limit law for number of components of fixed sizes of graphs with degree one or two. arXiv, page 12, 2014.
- [6] E. de Panafieu and L. Ramos. Graphs with degree constraints. Proceedings of the Meeting on Analytic Algorithmics and Combinatorics (ANALCO 16), 2016.
- [7] M. Drmota, L. Ramos, and J. Rué. Subgraph statistics in subcritical graph classes. Preprint arXiv:1512.08889, 2015. To appear in Random Structures and Algorithms (source: Michael’s page).
- [8] P. Erdős and A. Rényi. On the evolution of random graphs. Publication of the Mathematical Institute of the Hungarian Academy of Sciences, 5:17–61, 1960.
