Asymptotic enumeration of orientations of a graph as a function of the out-degree sequence
Mikhail Isaev, Tejas Iyer, Brendan D. McKay

TL;DR
This paper derives an asymptotic formula for counting orientations of a graph with a specified out-degree sequence, under certain degree and mixing conditions, with applications to random orientations and statistical models.
Contribution
It provides the first asymptotic enumeration formula for graph orientations with given out-degree sequences under broad conditions.
Findings
Established asymptotic counts for orientations with specified out-degree sequences.
Derived new bounds for maximum likelihood estimators in the Bradley-Terry model.
Applied enumeration results to analyze subdigraph occurrences in random orientations.
Abstract
We prove an asymptotic formula for the number of orientations with given out-degree (score) sequence for a graph $G$. The graph $G$ is assumed to have average degrees at least $n^{1/3+\varepsilon}$ for some $\varepsilon>0$, and to have strong mixing properties, while the maximum imbalance (out-degree minus in-degree) of the orientation should be not too large. Our enumeration results have applications to the study of subdigraph occurrences in random orientations with given imbalance sequence. As one step of our calculation, we obtain new bounds for the maximum likelihood estimators for the Bradley-Terry model of paired comparisons.
This research is supported by the Australian Research Council, Discovery Project DP140101519. The first author's research is also supported by Australian Research Council Discovery Early Career Researcher Award DE200101045.
Mikhail Isaev
School of Mathematics
Monash University
Clayton, VIC 3800, Australia
Moscow Institute of Physics and Technology
Dolgoprudny, 141700, Russian Federation
Tejas Iyer
Department of Mathematics
University of Birmingham
Birmingham, UK
Brendan D. McKay
Research School of Computer Science
Australian National University
Canberra, ACT 2601, Australia
1 Introduction
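Let $G$ be an undirected simple graph with vertices $\{1,\ldots,n\}$. An orientation of $G$ is an assignment of one of the two possible directions to each edge, thereby making an oriented graph $\vec{G}$. The imbalance (sometimes called excess) of a vertex $j$ is its out-degree minus its in-degree, denoted $b_j$, and the imbalance sequence of $\vec{G}$ is $\boldsymbol{b}=(b_1,\ldots,b_n)$. If $\boldsymbol{b}=\boldsymbol{0}$, then $\vec{G}$ is called an Eulerian orientation of $G$.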
Our primary aim in this paper is to find the asymptotic number of orientations of with given imbalance sequence. In solving this enumeration problem, we will apply the saddle point method to a suitable generating function, using Cauchy’s Theorem while following the general framework outlined in [12]. In the process, we will use results from the theory of paired comparisons, uncovering an interesting link between mathematical statistics and enumerative combinatorics.
In order to apply the saddle point method to enumerate orientations, we will use the standard parameters in the Bradley-Terry model of paired comparisons. This model was first studied by Zermelo in 1929 [24], and independently by Bradley and Terry [3], Ford [5], Jech [14] and many others. See, for example, Hunter [9] for a general treatment. Contestants $1,\ldots,n$ in a competition carried out by pairwise comparisons are assumed to have “merits” $r_1,\ldots,r_n>0$ such that contestant $j$ defeats contestant $k$ with probability
$$\lambda_{jk}:=\frac{r_j}{r_j+r_k}.\tag{1}$$
Note that $\lambda_{jk}+\lambda_{kj}=1$; i.e., ties are not allowed. The statistical problem is then to estimate the merits from the scores (the number of comparisons won by each contestant), after which the merits can be taken as a measure of the strength of each contestant.
Each of the above authors noted that the maximum likelihood estimate of the merits given the scores is (up to multiplication by a constant factor, since only the ratios matter) the solution of the “balance equations”
$$\sum_{k\,:\,jk\in G}\frac{r_j-r_k}{r_j+r_k}=b_j\qquad(1\leqslant j\leqslant n).\tag{2}$$
Zermelo [24] proved that (2) has a unique solution if the digraph defined by the results of each comparison is strongly connected. We generalise this in Theorem 7, using the fact, earlier noticed by Joe [15], that (2) corresponds to the point maximising a certain entropy. As a result of equation (2), the values $r_1,\ldots,r_n$ are the radii of circles whose direct product passes through the saddle point of a generating function in $n$-dimensional complex space; see Section 3.
If we orient each edge $jk\in G$ independently towards $k$ with probability $\lambda_{jk}$ and towards $j$ with probability $\lambda_{kj}$, then, as we will prove in Lemma 5, the probability of a particular orientation depends only on its imbalance sequence. Because of this, it makes sense to choose $r_1,\ldots,r_n$ so that the expected imbalances in the induced orientation equal some sequence of interest.
This gives the equations (2). Note that if $(r_1,\ldots,r_n)$ satisfies (2), then so does $(cr_1,\ldots,cr_n)$ for any constant $c>0$. In the case of Eulerian orientations, a solution is $r_1=\cdots=r_n=1$, which gives $\lambda_{jk}=\frac12$ for all $jk\in G$.
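The balance equations can be explored numerically on small graphs. The sketch below is not the method of this paper: it uses the classical fixed-point iteration from the Bradley-Terry literature, assuming the equations take the form $\sum_{k:jk\in G}(r_j-r_k)/(r_j+r_k)=b_j$, that vertices are labelled $0,\ldots,n-1$, that there are no isolated vertices, and that every target score $(d_j+b_j)/2$ is positive. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Edge = Tuple[int, int]


def solve_balance_equations(n: int, edges: Sequence[Edge], b: Sequence[int],
                            iterations: int = 2000) -> List[float]:
    """Fixed-point iteration for merits r_0..r_{n-1} (illustrative only).

    A fixed point satisfies sum_k r_j/(r_j+r_k) = (deg_j + b_j)/2, which is
    the assumed balance equation rewritten in terms of scores.
    """
    nbrs: List[List[int]] = [[] for _ in range(n)]
    for j, k in edges:
        nbrs[j].append(k)
        nbrs[k].append(j)
    # Target expected out-degree (score) of each vertex.
    s = [(len(nbrs[j]) + b[j]) / 2 for j in range(n)]
    r = [1.0] * n
    for _ in range(iterations):
        new = [s[j] / sum(1.0 / (r[j] + r[k]) for k in nbrs[j])
               for j in range(n)]
        mean = sum(new) / n  # only ratios matter, so renormalise each round
        r = [x / mean for x in new]
    return r


def balance_residuals(n: int, edges: Sequence[Edge], r: Sequence[float],
                      b: Sequence[int]) -> List[float]:
    """Left side minus right side of the assumed balance equations."""
    res = [-float(x) for x in b]
    for j, k in edges:
        t = (r[j] - r[k]) / (r[j] + r[k])
        res[j] += t
        res[k] -= t
    return res


# Complete graph on 4 vertices with target imbalances (1, 1, -1, -1).
K4 = [(j, k) for j in range(4) for k in range(j + 1, 4)]
r = solve_balance_equations(4, K4, [1, 1, -1, -1])
print(r[0] / r[2])
print(max(abs(x) for x in balance_residuals(4, K4, r, [1, 1, -1, -1])))
```

For this example the solution can be found by hand: with $r_0=r_1=a$ and $r_2=r_3=1$ the first equation reduces to $2(a-1)/(a+1)=1$, so $a=3$, and the iteration should approach that ratio.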
A special case of our problem is enumeration of tournaments with given scores. Some of the first results go back to Spencer in 1974 [23], who gave an estimate of the number of tournaments with a given imbalance sequence. More precise results were given in [16] and [18] based on the complex-analytic approach. This technique was applied in [7] to asymptotically enumerate the number of tournaments containing a given small digraph. The method was further generalised in [10, 11] to calculate the number of Eulerian orientations for a large class of dense graphs with strong mixing properties. In this paper we extend all of the aforementioned results allowing much sparser graphs and much more variation in the imbalances of vertices.
Note that counting orientations with a given imbalance sequence of a bipartite graph corresponds to counting its subgraphs with fixed degree sequence (take all edges which go into one of the parts). Equivalently, we can count 0–1 matrices with given margins where some set of entries are forced to be 0. This question goes back to Read [22] in 1958, who derived a formula for the number of 3-regular bipartite graphs. For more recent asymptotic results, see, for example, [2, 4, 8, 17] and references therein. Our formula applied to the bipartite case significantly improves known results for this enumeration problem as well.
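The Cheeger constant (or isoperimetric number) of a graph $G$, denoted by $h(G)$, is defined as follows.
$$h(G):=\min\Bigl\{\frac{\lvert\partial_G U\rvert}{\lvert U\rvert}\;:\;U\subset V(G),\ 1\leqslant\lvert U\rvert\leqslant\tfrac12\lvert V(G)\rvert\Bigr\},\tag{3}$$
where $\partial_G U$ is the set of edges of $G$ with one end in $U$ and one end in $V(G)\setminus U$. The number $h(G)$ is a discrete analogue of the Cheeger isoperimetric constant in the theory of Riemannian manifolds and it has many interesting interpretations (for more detailed information see, for example, [20] and the references therein).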
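For intuition, the Cheeger constant of a small graph can be computed by brute force. The sketch below assumes the standard definition, the minimum of $|\partial_G U|/|U|$ over vertex sets $U$ with $1\leqslant|U|\leqslant n/2$, and labels vertices $0,\ldots,n-1$; it is exponential in $n$ and intended only as an illustration.

```python
from fractions import Fraction
from itertools import combinations
from typing import Optional, Sequence, Tuple


def cheeger_constant(n: int, edges: Sequence[Tuple[int, int]]) -> Optional[Fraction]:
    """Exact isoperimetric number by exhaustive search (needs n >= 2)."""
    best: Optional[Fraction] = None
    for size in range(1, n // 2 + 1):
        for subset in combinations(range(n), size):
            inside = set(subset)
            # Boundary edges have exactly one end inside the subset.
            boundary = sum(1 for j, k in edges if (j in inside) != (k in inside))
            ratio = Fraction(boundary, size)
            if best is None or ratio < best:
                best = ratio
    return best


K4 = [(j, k) for j in range(4) for k in range(j + 1, 4)]
C6 = [(j, (j + 1) % 6) for j in range(6)]
print(cheeger_constant(4, K4))  # 2
print(cheeger_constant(6, C6))  # 2/3
```

The cycle illustrates weak mixing (three consecutive vertices are cut off by only two edges), whereas the complete graph has $h(G)$ of the same order as its maximum degree.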
Let $I$ denote the identity matrix, and let $J$ denote the matrix with every entry 1; in each case of order $n$. Define the symmetric positive-semidefinite matrix $A$ by
[TABLE]
for , and further define
[TABLE]
where $\mathbb{E}\,Z$ and $\operatorname{Var}Z$ stand for the expectation and the variance of a random variable $Z$.
In the following theorem, a pair stands for a sequence of graphs and imbalance sequences parametrised by a positive integer . Statements involving and hold if is sufficiently large and is sufficiently small. Throughout the paper, the asymptotic notations have their usual meaning.
Theorem 1.
Let $G$ be a graph with $n$ vertices and maximum degree $\varDelta$. Let $\boldsymbol{b}$ be the imbalance sequence for some orientation of $G$. Assume the following hold as $n\to\infty$.
- A1.
$\varDelta\geqslant n^{1/3+\varepsilon}$ for some constant $\varepsilon>0$.
- A2.
$h(G)\geqslant\gamma\varDelta$, for some constant $\gamma>0$.
- A3.
Equations (2) have a solution such that $\frac{r_{j}}{r_{k}}\leqslant 1+R$ for $1\leqslant j,k\leqslant n$, where $R$ satisfies and $R^{2}\frac{n}{\varDelta}\log\frac{2n}{\varDelta}=o(\log n)$.
Adopt all the definitions in (4). Then the number of orientations of $G$ with imbalance sequence $\boldsymbol{b}$ is
[TABLE]
Note that by assumption A3 so the error terms in (5) are always vanishing. In the particular case of Eulerian orientations, .
The quantities and have interesting interpretations. First, is the probability of each orientation with imbalance sequence in the Bradley-Terry model, as we indicate in Lemma 5. Second, suppose each edge of is assigned weight and each spanning tree of is assigned weight equal to the product of the weights of its edges. Define to be the sum over all weights of spanning trees in . Note that the eigenvalues of are (from the term $\frac{\varDelta}{n}J$) together with the non-zero eigenvalues of . Therefore, using the Matrix-Tree Theorem (for example, [21, Theorem 5.2]), we get
[TABLE]
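The eigenvalue remark can be checked exactly on small graphs. The sketch below makes a simplifying assumption: it uses the unweighted Laplacian $L$ in place of the weighted matrix of (4), for which adding $\frac{\varDelta}{n}J$ replaces the zero eigenvalue by $\varDelta$ and leaves the others unchanged, so that $\det\bigl(L+\frac{\varDelta}{n}J\bigr)=\varDelta\,n\,\tau(G)$ for a connected graph with $\tau(G)$ spanning trees. All function names are hypothetical.

```python
from fractions import Fraction
from typing import List, Sequence, Tuple

Matrix = List[List[Fraction]]


def exact_det(matrix: Matrix) -> Fraction:
    """Determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    size = len(m)
    result = Fraction(1)
    for col in range(size):
        pivot = next((row for row in range(col, size) if m[row][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for row in range(col + 1, size):
            factor = m[row][col] / m[col][col]
            for c in range(col, size):
                m[row][c] -= factor * m[col][c]
    return result


def laplacian(n: int, edges: Sequence[Tuple[int, int]]) -> Matrix:
    lap = [[Fraction(0)] * n for _ in range(n)]
    for j, k in edges:
        lap[j][j] += 1
        lap[k][k] += 1
        lap[j][k] -= 1
        lap[k][j] -= 1
    return lap


def spanning_trees(n: int, edges: Sequence[Tuple[int, int]]) -> Fraction:
    """Matrix-Tree Theorem: delete one row and column of the Laplacian."""
    lap = laplacian(n, edges)
    return exact_det([row[:-1] for row in lap[:-1]])


def shifted_det(n: int, edges: Sequence[Tuple[int, int]]) -> Fraction:
    """det(L + (Delta/n) J), with Delta the maximum degree."""
    lap = laplacian(n, edges)
    shift = max(lap[j][j] for j in range(n)) / n
    return exact_det([[lap[j][k] + shift for k in range(n)] for j in range(n)])


K4 = [(j, k) for j in range(4) for k in range(j + 1, 4)]
print(spanning_trees(4, K4), shifted_det(4, K4))  # 16 and 3 * 4 * 16 = 192
```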
The quantities , , , and defining can be calculated by inverting the matrix and using Isserlis’ formula; see Lemma 13. Their growth rates are given in the next lemma. Note that if and for all then , , are vanishing while can be explicitly approximated in terms of the degrees of the graph .
Lemma 2.
Let the assumptions A1, A2, A3 of Theorem 1 hold. Then,
[TABLE]
where are the degrees of .
For the case when , we solve (2) by setting . Thus, Theorem 1 and Lemma 2 immediately give an asymptotic formula for the number of Eulerian orientations. This formula was previously known only for the dense range ; see [11].
Corollary 3.
Let be a graph with even degrees , satisfying assumptions A1 and A2 of Theorem 1. Then the number of Eulerian orientations of is
[TABLE]
where is the number of (unweighted) spanning trees.
We prove Theorem 1 and Lemma 2 in Section 3.3. Applications of these results include estimating the probability for a uniform random orientation with given imbalance sequence to contain a prescribed subdigraph. For example, one might be interested in estimating the chance that a team has defeated both teams and in a tournament given the scores of all the teams. We give a simple demonstration of such an application in Section 4 (for Eulerian orientations).
In Section 2 we study equations (2). We provide necessary and sufficient conditions for the existence and the uniqueness (up to scaling) of the solution and find an explicit bound on the ratios . In particular we obtain a simple sufficient condition for assumption A3 of Theorem 1 to hold, stated below.
Theorem 4.
Adopt assumptions A1 and A2 of Theorem 1. If
[TABLE]
then assumption A3 of Theorem 1 holds with $R=O\Bigl(\frac{\lvert\boldsymbol{b}\rvert_{\infty}}{\varDelta}\log\frac{2n}{\varDelta}\Bigr)$.
Throughout the paper stands for the standard vector norm or for the corresponding induced matrix norm. The proof of Theorem 4 is given at the end of Section 2.
2 The Bradley–Terry model of orientations
In this section we explore the existence and nature of solutions to the balance equations (2). Except in the proof of Theorem 4, we do not require assumptions A1–A3 in this section. Some of the techniques used in this section follow those of Barvinok and Hartigan [2].
Consider a graph and for each edge choose numbers with and . Now independently orient each edge towards with probability and towards with probability . We call this a random orientation of with parameters . It is degenerate if some equals 0 or 1. It is conditionally uniform if, for every orientation of , all the orientations of with the same imbalances as have the same probability.
Lemma 5.
A non-degenerate random orientation of with parameters is conditionally uniform if and only if there exists such that for all , where are given by (1).
Proof.
Let be the imbalance sequence of an orientation . Then, for a random orientation with parameters , occurs with probability (whether or not (2) holds). This proves uniformity.
Conversely, suppose that the non-degenerate random orientation with parameters is conditionally uniform. Assume that is connected (otherwise, apply the following argument to each component).
Take a spanning tree , and assign a number to each vertex as follows. First, . Then, for , let be the unique path from to in . Define $r_{j}:=\prod_{t=1}^{s}\bigl((1-p_{v_{t-1}v_{t}})/p_{v_{t-1}v_{t}}\bigr)$. Then, using this to define the parameters , we can now check that for . Consider an edge and let be the unique cycle in that contains and otherwise only edges of . Let be any orientation of in which this cycle is a directed cycle. Since reversing the edges on the cycle gives the same imbalance sequence as , uniformity implies that . Then, by the definition of , we get that This implies that , and the proof is complete. ∎
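The forward direction of Lemma 5 can be verified exactly on a small graph. The sketch below assumes the parameters are $p_{jk}=r_j/(r_j+r_k)$ with an edge directed from $j$ to $k$ with probability $p_{jk}$, uses integer merits so that all probabilities are exact fractions, and groups the orientations by imbalance sequence. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from typing import Dict, Sequence, Set, Tuple


def probabilities_by_imbalance(n: int, edges: Sequence[Tuple[int, int]],
                               r: Sequence[int]) -> Dict[Tuple[int, ...], Set[Fraction]]:
    """Map each imbalance sequence to the set of probabilities of its orientations."""
    groups: Dict[Tuple[int, ...], Set[Fraction]] = {}
    for flips in product((False, True), repeat=len(edges)):
        prob = Fraction(1)
        b = [0] * n
        for (j, k), flip in zip(edges, flips):
            tail, head = (k, j) if flip else (j, k)
            prob *= Fraction(r[tail], r[tail] + r[head])  # Bradley-Terry factor
            b[tail] += 1
            b[head] -= 1
        groups.setdefault(tuple(b), set()).add(prob)
    return groups


def is_conditionally_uniform(n: int, edges: Sequence[Tuple[int, int]],
                             r: Sequence[int]) -> bool:
    """True if orientations sharing an imbalance sequence share a probability."""
    return all(len(p) == 1 for p in probabilities_by_imbalance(n, edges, r).values())


K4 = [(j, k) for j in range(4) for k in range(j + 1, 4)]
print(is_conditionally_uniform(4, K4, [1, 2, 3, 5]))  # True
```

The reason is visible in the product: each orientation has probability $\prod_j r_j^{d_j^+}\big/\prod_{jk\in G}(r_j+r_k)$, and the out-degrees $d_j^+$ are determined by the imbalances and the degrees.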
Lemma 6.
A sequence is an expected imbalance sequence of some random orientation of if and only if and
[TABLE]
In addition, is the expected imbalance sequence of some non-degenerate random orientation if and only if (6) holds and is strict for every that is not a union of connected components of .
Proof.
In order to prove the lemma, we consider an equivalent network flow problem, and apply the max-flow min-cut theorem of Ford and Fulkerson [6]. To this end, given we define an auxiliary flow network with source and sink , such that and . The capacity function is then defined such that, for , , , and all other capacities are [math]. Note that every cut in the network has the form for some . The capacity of this cut is
[TABLE]
where we have used and . By (7) and the max-flow min-cut theorem ([6], Theorem 1), there is a flow of value iff (6) holds. Such a flow saturates all the edges incident to or , so from each vertex , the net flow on the arcs between and other vertices in is , that is
[TABLE]
where is the set of neighbours of in . Now, for , define by
[TABLE]
Note that for any and, by (8), the random orientation with parameters has expected imbalance sequence . This proves the first equivalence.
For the second part, suppose that is such that (6) holds and is strict for any such that ; that is, it is not a union of connected components of . Then, there is some with such that
[TABLE]
for all , where $\boldsymbol{b}':=\frac{1}{1-2\varepsilon}\boldsymbol{b}$. By the first part of this lemma, there exists a (possibly degenerate) random orientation of with parameters and expected imbalance sequence . Now define by for , and note that we still have and . That is, are non-degenerate parameters with expected imbalance sequence .
Conversely, note that any random orientation of with parameters induces a maximum flow on the network, by setting , and assuming the flow is at maximum capacity on arcs incident to or . But, now, if equality occurs in (6) for some that , then the cut is saturated by any flow of value , so the edges crossing it must have flow in one direction and [math] in the other. In particular, this implies that the probabilities corresponding to flows on arcs across the cut must be degenerate. ∎
Theorem 7.
Let be such that and
[TABLE]
with the inequality being strict for any that is not the union of connected components of . Then there exists , unique up to uniform scaling in each connected component of , such that the random orientation of with parameters given by (1) has expected imbalance sequence .
Proof.
Consider a random orientation of with parameters . We view these parameters as a vector , and let be the set of possible directed edges in an orientation of . Then, since the edges of are oriented independently, the entropy function corresponding to this orientation is given by
[TABLE]
with the usual convention that the terms corresponding to are 0. This is a continuous function on a compact set, thus there exists a maximiser .
Next, we show by contradiction that is non-degenerate. Assume otherwise. Note that by Lemma 6, there exists a non-degenerate for which the expected imbalance sequence is . Let be the set of directed edges such that . Then, for ,
[TABLE]
Using the strict concavity of the function on , we get
[TABLE]
Using the fact that , this yields the lower bound
[TABLE]
Now, for sufficiently small, the bracketed term on the right can be made negative, which implies , a contradiction.
Denoting Lagrange multipliers by , define
[TABLE]
and consider this as a function of the variables for , where one of and is arbitrarily chosen and the other is determined by . The partial derivatives satisfy
[TABLE]
By setting these partial derivatives to 0, we find that the maximiser satisfies
[TABLE]
so that if we set for , the corresponding random orientation has parameters as defined by (1). Moreover, by the strict concavity of the entropy function, on the convex, compact set corresponding to the equality constraints, the maximiser is unique. This implies by (9) that for the ratios are unique, so that the are unique up to uniform scaling in every connected component of . ∎
Lemma 8.
Let be a connected graph of maximum degree . Let and be such that and
[TABLE]
Then, for , the solution of the system (2) is such that, for all and ,
[TABLE]
We defer the proof of Lemma 8 until Section 5.1.
Proof of Theorem 4.
Since , we have for any that
[TABLE]
By assumptions, we can bound
[TABLE]
Applying Lemma 8 with $\delta=1-\frac{\lvert\boldsymbol{b}\rvert_{\infty}}{h(G)}$, we find that
[TABLE]
Thus, we get that $\frac{r_{j}}{r_{k}}=1+o(1)$ and so
[TABLE]
This completes the proof that assumption A3 holds. ∎
3 Enumeration
The Laplacian matrix of $G$ is the symmetric matrix given by the diagonal matrix of degrees minus the adjacency matrix of $G$. Since the row sums of this matrix are zero, it has a zero eigenvalue corresponding to an eigenvector with all components equal. The next smallest eigenvalue, $\lambda_2(G)$, is called the algebraic connectivity of $G$ and is closely related to the Cheeger constant.
Lemma 9 ([20]).
For any graph , we have
[TABLE]
Lemma 10.
Under assumptions A1–A3, the following are true.
- (a)
The minimum degree of $G$ is at least $\gamma\varDelta$.
- (b)
$\lambda_{2}(G)\geqslant\bigl(1-(1-\gamma^{2})^{1/2}\bigr)\varDelta\geqslant\frac{1}{2}\gamma^{2}\varDelta$.
- (c)
For $jk\in G$, $\frac{1+R}{(2+R)^{2}}\leqslant\lambda_{jk}\lambda_{kj}\leqslant\frac{1}{4}$ and $\lvert\lambda_{jk}-\lambda_{kj}\rvert\leqslant\frac{R}{2+R}=O(R)$.
Proof.
Part (a) follows from the trivial fact that $h(G)$ cannot be larger than the minimum degree. Part (b) follows from Lemma 9. Part (c) is a simple consequence of A3. ∎
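Parts (b) and (c) can be unpacked in a few lines. The derivation below is a sketch under two assumptions: that Lemma 9 includes the upper bound $h(G)\leqslant\sqrt{\lambda_2(G)\,(2\varDelta-\lambda_2(G))}$ (its displayed statement is not reproduced here), and that assumption A2 reads $h(G)\geqslant\gamma\varDelta$. The symbol $\rho$ is introduced only for this calculation.

```latex
% Part (b): write \lambda = \lambda_2(G). Since h(G) \le \varDelta we have \gamma \le 1.
\[
  \gamma^{2}\varDelta^{2} \le h(G)^{2} \le \lambda\,(2\varDelta-\lambda)
  \;\Longrightarrow\;
  (\varDelta-\lambda)^{2} \le (1-\gamma^{2})\,\varDelta^{2}
  \;\Longrightarrow\;
  \lambda \ge \bigl(1-(1-\gamma^{2})^{1/2}\bigr)\varDelta
          \ge \tfrac12\gamma^{2}\varDelta ,
\]
% using (1-x)^{1/2} \le 1 - x/2 for 0 \le x \le 1 in the last step.

% Part (c): put \rho = r_j/r_k, so that (1+R)^{-1} \le \rho \le 1+R by A3.
\[
  \lambda_{jk}\lambda_{kj} = \frac{\rho}{(1+\rho)^{2}},
  \qquad
  \lvert\lambda_{jk}-\lambda_{kj}\rvert = \frac{\lvert\rho-1\rvert}{\rho+1}.
\]
% The first expression is invariant under \rho \mapsto 1/\rho, equals 1/4 at
% \rho = 1 and decreases for \rho \ge 1, so it lies between
% (1+R)/(2+R)^2 and 1/4. The second is increasing in \max(\rho, 1/\rho),
% so it is at most R/(2+R).
```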
Let be the number of orientations of with imbalance sequence . By Cauchy’s integral formula, using the generating function $\prod_{jk\in G}\Bigl(\frac{x_{j}}{x_{k}}+\frac{x_{k}}{x_{j}}\Bigr)$, we have
[TABLE]
where the contours circle the origin once anticlockwise. We choose the circles as contours, so that
[TABLE]
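The coefficient-extraction step can be sanity-checked on small graphs without any complex analysis. The sketch below expands the Laurent polynomial $\prod_{jk\in G}(x_j/x_k+x_k/x_j)$ term by term, storing exponent vectors in a dictionary, and compares the coefficients with a direct enumeration of orientations. Vertex labels $0,\ldots,n-1$ and the function names are assumptions of this illustration.

```python
from collections import Counter
from itertools import product
from typing import Sequence, Tuple

Edge = Tuple[int, int]


def counts_by_enumeration(n: int, edges: Sequence[Edge]) -> Counter:
    """Number of orientations for each imbalance sequence, by brute force."""
    counts: Counter = Counter()
    for flips in product((False, True), repeat=len(edges)):
        b = [0] * n
        for (j, k), flip in zip(edges, flips):
            tail, head = (k, j) if flip else (j, k)
            b[tail] += 1
            b[head] -= 1
        counts[tuple(b)] += 1
    return counts


def counts_by_generating_function(n: int, edges: Sequence[Edge]) -> Counter:
    """Coefficients of prod_{jk} (x_j/x_k + x_k/x_j), keyed by exponent vector."""
    poly: Counter = Counter({(0,) * n: 1})
    for j, k in edges:
        nxt: Counter = Counter()
        for exps, coeff in poly.items():
            for up, down in ((j, k), (k, j)):
                e = list(exps)
                e[up] += 1     # multiply by x_up
                e[down] -= 1   # divide by x_down
                nxt[tuple(e)] += coeff
        poly = nxt
    return poly


K5 = [(j, k) for j in range(5) for k in range(j + 1, 5)]
gf = counts_by_generating_function(5, K5)
print(gf == counts_by_enumeration(5, K5))  # True
print(gf[(0,) * 5])  # 24 Eulerian orientations of K_5 (regular tournaments)
```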
Given , define
[TABLE]
It is easily seen that is a seminorm on that induces a norm on , the real numbers modulo . An interval of of length is a set of the form
[TABLE]
We will also write as when it is not ambiguous.
Next, note that any individual value can be replaced by without changing , since in every orientation the imbalance of a vertex has the same parity as its degree in . This means we can write
[TABLE]
We will approach (12) by splitting the region of integration in several parts. Let
[TABLE]
In other words, the region consists of those such that all components can be covered by an interval of of length at most . It will turn out that will dominate , and that in the complement of even the integral of is negligible.
3.1 The integral inside
We are going to apply the techniques developed in [12]. For any , define . The assumptions of Theorem 1 hold throughout this section.
First note that, since , we can uniformly translate each without changing . Also,
[TABLE]
Therefore, if we define , we have an -dimensional integral:
[TABLE]
for some region with .
Next we lift the integral back to full dimension using [12, Lemma 4.6], which we quote for convenience as Lemma 31. Let be the matrix with 1 in the last column and 0 elsewhere. Define:
[TABLE]
One can easily check that , and also that , has dimension 1 and . We also have , , , , and . Now applying [12, Lemma 4.6], and the fact that is invariant under translating each coordinate, we have
[TABLE]
where is a region such that and
[TABLE]
Lemma 11.
For , we have
[TABLE]
Proof.
Note that the definitions of in (11) and in (1) imply that
[TABLE]
By Taylor’s Theorem and Lemma 10, for , we have
[TABLE]
Summing over , and subtracting , we find that the linear term cancels because of (2) and the error term is as stated because of Lemma 10(c). ∎
Lemma 12.
Consider the symmetric positive-definite matrix defined in (4). Then the following are true.
- (a)
$\|A^{-1}\|_{\infty}=O\bigl(\varDelta^{-1}\log\frac{2n}{\varDelta}\bigr)$.
- (b)
If , then and $a_{jk}=O\bigl(\varDelta^{-2}\log\frac{2n}{\varDelta}\bigr)$ uniformly for .
- (c)
There exists a symmetric positive-definite matrix such that . Moreover, and .
Proof.
Part (a) follows from assumption A2 and Lemmas 10 and 29. To prove Part (b), let be the diagonal of . We have , so the maximum absolute value of an entry of is bounded by times the maximum absolute value of an entry of . The claim thus follows from Part (a). Both bounds in Part (c) come from Corollary 28 when we take and note that $\bigl|\binom{-1/2}{k}\bigr|<k^{-1/2}$ and $\bigl|\binom{1/2}{k}\bigr|<k^{-3/2}$ for . ∎
We will also use the following simple applications of Isserlis’ formula [13].
Lemma 13.
Let and be normal random variables with zero mean. For integer , let be the number of ways to divide things into pairs (i.e., 0 for odd and for even ). Then, for integers ,
- (a)
.
- (b)
.∎
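The pairing count that appears in Lemma 13 is easy to tabulate. The sketch below assumes the standard closed form, namely that the number of ways to divide $k$ things into pairs is $(k-1)!!=(k-1)(k-3)\cdots 1$ for even $k$ and 0 for odd $k$, and checks it against a direct recursive enumeration. By Isserlis' formula this is also the $k$-th moment of a standard normal variable. The function names are hypothetical.

```python
from typing import Tuple


def pairings(k: int) -> int:
    """Closed form: 0 for odd k, (k-1)!! for even k (1 when k = 0)."""
    if k % 2:
        return 0
    result = 1
    for odd in range(1, k, 2):
        result *= odd
    return result


def pairings_brute(items: Tuple[int, ...]) -> int:
    """Count perfect pairings by matching the first item with each other item."""
    if not items:
        return 1
    if len(items) == 1:
        return 0
    rest = items[1:]
    return sum(pairings_brute(rest[:i] + rest[i + 1:]) for i in range(len(rest)))


for k in range(0, 9):
    print(k, pairings(k), pairings_brute(tuple(range(k))))
```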
Let be a random vector with normal density . The covariance matrix of is . For , define . Then the vector also has a normal density with zero mean; let denote its covariance matrix.
Lemma 14.
We have the following.
- (a)
For ,
[TABLE]
- (b)
$\|\varSigma\|_{\infty}=O\bigl(\log\frac{2n}{\varDelta}\bigr)$.
- (c)
For integers and ,
[TABLE]
- (d)
For integers and ,
[TABLE]
Proof.
Part (a) follows from Lemma 12(b). For (b), note that and that there are at most choices of for each . The other terms are similar, so the result follows on applying Lemma 12(a).
Part (c) follows from Part (a) and Lemma 13(a). We use Lemma 13(b) for Part (d): bound all variances and covariances except by (on account of Part (a)) and then use Part (b) to bound the sum of these terms over . ∎
Define , , and .
Lemma 15.
We have
[TABLE]
Proof.
We will apply [12, Theorem 4.4] which, for convenience, we quote in Section 5.4 as Theorem 32.
By Lemma 12(c), there are constants such that , where and .
Next, note that . Under this condition we calculate that, uniformly over ,
[TABLE]
and conclude that Theorem 32(b) holds for (note that here we incorporate powers of into the terms).
Now take . For Theorem 32(c) we have . The required derivative bounds are
[TABLE]
so Theorem 32(c)(ii) is satisfied by .
The appearance in the error term of Theorem 32 is the main reason cannot easily be made larger. Since the coefficients of and are , we have $\operatorname{Var}f_{\mathrm{im}}(\boldsymbol{X})=O\bigl(R^{2}\varDelta^{-1}n\log\frac{2n}{\varDelta}\bigr)=o(\log n)$ by Lemma 14(d) and assumption A3. Therefore, .
The bound $\zeta(\boldsymbol{X})=O\bigl(R\varDelta^{-5/2+19\varepsilon/24}n+\varDelta^{-3+\varepsilon/2}n\bigr)$ follows from (14). Putting everything together, the error term given by Theorem 32 has magnitude
[TABLE]
We can now see that some contributions to and are negligible. By Lemma 14, $\operatorname{Cov}(f_{3}(\boldsymbol{X}),f_{5}(\boldsymbol{X}))=O\bigl(R^{2}\varDelta^{-2}n\log\frac{2n}{\varDelta}\bigr)$, which is less than the geometric mean of the first two terms of (15) and so is bounded by the larger of them. Similarly, $\operatorname{Cov}(f_{4}(\boldsymbol{X}),f_{6}(\boldsymbol{X}))=O\bigl(\varDelta^{-3}n\log\frac{2n}{\varDelta}\bigr)$, and can thus be incorporated into the third term of (15). The contributions of and are even smaller.
Next, we can remove the middle term of (15) since . Finally, assumption A3 implies that . This completes the evaluation of the integral . ∎
We will also need the following bound.
Lemma 16.
We have
[TABLE]
Proof.
Revisiting the proof of Lemma 15, note that the difference between the integrals of and came only from and amounted to a factor of . This implies the first equality.
Observe that all of the eigenvalues of are bounded below by and bounded above by . Using Lemma 12(a), we find that . The remaining factors in the expression for in Lemma 15 are also . The bounds
[TABLE]
follow by assumption A3, applying Lemma 14. Thus, we get the second equality from the first. ∎
3.2 The integral outside
The conditions of Theorem 1 are assumed throughout this section. We begin with a few lemmas.
Lemma 17.
For , is a decreasing function of with and
[TABLE]
In addition, for any , we have
[TABLE]
Proof.
The first part of (16) follows from the definition of and implies that for all . Therefore we can assume that , which implies that and . Also, recall from Lemma 10(c) that for some constant . Note that, by the concavity of on , we have on this range, which in turn implies (by symmetry about the line ) that
[TABLE]
This in turn implies that for , and combining this with the inequality for all , we have $\lvert f_{jk}(x)\rvert^{2}\leqslant\exp\bigl(-\Omega(x^{2})\bigr)$.
Inequality (17) is trivial if , so assume that . In that case, and, since is a decreasing function of for fixed on this range
[TABLE]
Finally, by (18), we have
[TABLE]
for , which completes the proof of (17). ∎
Lemma 18.
Let be disjoint subsets of . Suppose such that whenever , for some . Then
[TABLE]
Proof.
Consider any of the paths provided by Lemma 30. By assumption, . Since and is a seminorm, we find that
[TABLE]
Multiplying the bound (16) over all the edges of all the paths given by Lemma 30 completes the proof. ∎
Define
[TABLE]
First, we bound the integral of in the region
[TABLE]
Lemma 19.
Suppose and . Let be a multisubset of such that no interval of length contains or more elements of . Then there is some interval , , such that both and contain at least elements of .
Proof.
Since the conditions and conclusion are invariant under translation, we can assume without loss of generality that is an interval with the greatest number of elements of out of all intervals of length . Since has at least elements of by assumption, satisfies the requirements of the lemma unless it contains less than elements of .
Therefore, assume that all intervals of length have less than elements of . For , let be the number of elements of that lie in . Note that is a non-decreasing step function with steps of size less than , also that and . Therefore, there is some such that . It can now be checked that satisfies the lemma. ∎
Lemma 20.
We have
[TABLE]
Proof.
If , the definition of implies that every interval of of length has fewer than components of . Applying Lemma 19 with , , and tells us that there exist and such that both and contain at least components of . For such , Lemma 18, with and corresponding to the indices of the elements of belonging to and respectively, tells us that $\lvert F(\boldsymbol{\theta})\rvert\leqslant\exp\bigl(-\Omega(1)\varDelta t^{2}n\log^{-2}n\bigr)=e^{-\Omega(n\log^{2}n)}$. Using as a bound on the volume of , the result follows from Lemma 16. ∎
Next, we bound the integral of in the region
[TABLE]
Lemma 21.
We have
[TABLE]
Proof.
The volume of is only , so the bound is adequate in conjunction with Lemma 16. ∎
For disjoint define by the set of for which there exists some and with such that the following hold:
- (i)
for at least components .
- (ii)
if and only if .
- (iii)
if and only if .
Lemma 22.
We have
[TABLE]
where the union is over all disjoint with and .
Proof.
Any is such that at least of its components lie in some interval . Suppose it is not covered by any . For , take and let correspond to the components not in . Since (iii) cannot hold, we get
[TABLE]
Recalling that , we can apply this ratio repeatedly starting with to find that
[TABLE]
This implies that , which completes the proof. ∎
Lemma 23.
For any disjoint with and , we have
[TABLE]
Proof.
Let and define the map as follows. By the definition of , for any there is some interval of length at most that contains . Let be the unique shortest such interval. We can ignore parts of that lie in , which means that we can assume .
Identifying with , define
[TABLE]
For , and maps the complementary interval linearly onto (reversing and contracting with fixed). For , and .
Thus for all . From Lemma 17, we find that
[TABLE]
Moreover, for and , we get that . Observing also that and using (17), we find that
[TABLE]
By Assumption A2 of Theorem 1, this bound applies to at least pairs , thus
[TABLE]
Note that the map is injective, since can be determined from . Also, is analytic except at places where the map from to is non-analytic, which happens only when two distinct components for lie at the same endpoint of . Thus, the points of non-analyticity of lie on a finite number of hyperplanes, which contribute nothing to the integral. To complete the calculation, we need to bound the Jacobian of the transformation in the interior of a domain of analyticity.
We have
[TABLE]
Although we have not specified all the entries of the matrix, these entries show that the matrix is triangular, and hence the determinant has absolute value $\bigl(\frac{\xi}{\pi-\xi}\bigr)^{|U|+|W|}$, which is because . ∎
3.3 Proofs of Theorem 1 and Lemma 2
Proof of Theorem 1.
The number of orientations in terms of the integral appears in (12). That integral restricted to the region is , evaluated in Lemma 15. This gives the expression in Theorem 1 so it remains to show that the other parts of the integral fit into the error terms given there.
The integral in is bounded in Lemmas 20 and 21. The remaining parts of are bounded by the sum of Lemma 23 over disjoint with and . The number of choices of for given is less than , so the total contribution here is
[TABLE]
which is easily small enough. ∎
Proof of Lemma 2.
From Lemma 10(c), we know that . Then, applying Lemma 14, we find that $\operatorname{Var}f_{3}(\boldsymbol{X})=O\bigl(R^{2}\varDelta^{-1}n\log\frac{2n}{\varDelta}\bigr)$, and $\operatorname{Var}f_{4}(\boldsymbol{X})=O\bigl(\varDelta^{-2}n\log\frac{2n}{\varDelta}\bigr)$.
It remains to estimate , which Lemma 13 shows is equal to
[TABLE]
where . Let be the diagonal matrix where are diagonal elements of . Using Lemma 10(c), we get
[TABLE]
Then . Note that the entries of are uniformly , so the entries of are uniformly $\|A^{-1}\|_{\infty}\,O(\varDelta^{-1})=O\bigl(\varDelta^{-2}\log\frac{2n}{\varDelta}\bigr)$, using Lemma 12(a). Therefore, for ,
[TABLE]
where the last equality follows from Lemma 10(a). Now it only remains to assemble these parts to obtain the lemma. ∎
4 Probability of subdigraph occurrence
Let be a spanning subgraph of , and let be an orientation of with imbalance sequence . Then
[TABLE]
is the probability that a uniform random orientation of with imbalances contains as a subdigraph. Consequently, Theorem 1 gives this probability asymptotically provided both the numerator and the denominator satisfy the conditions of that theorem. We will not explore this issue further in this paper except for the case that ; i.e., both orientations are Eulerian.
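For very small graphs the probability above can be checked directly from its definition by exhaustive enumeration. The following is a minimal sketch, not the method of the paper: the function names `orientations_with_imbalance` and `containment_probability` are hypothetical, and the running time is exponential in the number of edges.

```python
from itertools import combinations, product

def orientations_with_imbalance(edges, imbalance):
    """Yield each orientation of `edges` (as a frozenset of arcs) whose
    out-degree minus in-degree at every vertex equals `imbalance[v]`."""
    for flips in product((False, True), repeat=len(edges)):
        arcs = [(v, u) if flip else (u, v) for (u, v), flip in zip(edges, flips)]
        balance = {v: 0 for v in imbalance}
        for tail, head in arcs:
            balance[tail] += 1
            balance[head] -= 1
        if balance == imbalance:
            yield frozenset(arcs)

def containment_probability(edges, imbalance, sub_arcs):
    """Fraction of orientations with the given imbalances that contain
    every arc of `sub_arcs`; None if no such orientation exists."""
    wanted = frozenset(sub_arcs)
    total = hits = 0
    for arcs in orientations_with_imbalance(edges, imbalance):
        total += 1
        if wanted <= arcs:
            hits += 1
    return hits / total if total else None

# Illustration on K_5 with all imbalances zero (Eulerian orientations).
k5_edges = list(combinations(range(5), 2))
zero = {v: 0 for v in range(5)}
p_single_arc = containment_probability(k5_edges, zero, [(0, 1)])
```

In the Eulerian case, reversing every arc is a bijection on the set of orientations, so a single prescribed arc is contained with probability exactly 1/2. This gives a quick sanity check on the enumeration.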
Theorem 24.
Let be a graph with even degrees and let be a spanning subgraph of with even degrees . Define , and assume that $\varDelta^{-2}(n+m)\log\frac{2n}{\varDelta}=o(1)$, where is the maximum degree of . Also assume that there is a constant such that . Then, for any fixed Eulerian orientation of , the probability that a random Eulerian orientation of includes is
[TABLE]
Proof.
We will evaluate (19) using Corollary 3. Note that implies , so assumption A2 is satisfied by both numerator and denominator. Furthermore, implies that for .
First, we have
[TABLE]
Next we consider the ratio , which equals the ratio , where is defined as in (4) and is the corresponding matrix for . As in the proof of Lemma 2, we have , where and with $x_{jk}=O\bigl(\varDelta^{-2}\log\frac{2n}{\varDelta}\bigr)$ for all . Also , where and with for and otherwise. We have
[TABLE]
The Frobenius norm of is defined by . By subadditivity,
[TABLE]
We have , and
[TABLE]
where the last equality follows from the theorem assumptions. Thus, . Schur’s Inequality [25, p. 50] says that , where are the eigenvalues of , so
[TABLE]
By the definition of and the above bound on the entries of , $\operatorname{tr}U=O\bigl(\varDelta^{-2}m\log\frac{2n}{\varDelta}\bigr)$. Thus,
[TABLE]
which completes the proof. ∎
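The proof above relies on Schur's inequality, which in its standard form states that the sum of the squared moduli of the eigenvalues of a square matrix is at most its squared Frobenius norm, with equality for normal matrices. The following sketch checks this numerically for 2×2 matrices only. The helper names are hypothetical, and the eigenvalues are obtained from the characteristic polynomial rather than from a linear algebra library.

```python
import cmath

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] via the characteristic polynomial."""
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2

def frobenius_sq(a, b, c, d):
    """Squared Frobenius norm: the sum of squared moduli of the entries."""
    return sum(abs(x) ** 2 for x in (a, b, c, d))

def schur_gap(a, b, c, d):
    """Squared Frobenius norm minus the sum of |eigenvalue|^2.
    Schur's inequality says this is nonnegative."""
    eigs = eigenvalues_2x2(a, b, c, d)
    return frobenius_sq(a, b, c, d) - sum(abs(z) ** 2 for z in eigs)

symmetric_gap = schur_gap(2, 1, 1, 2)   # symmetric, hence normal
triangular_gap = schur_gap(1, 2, 0, 3)  # upper triangular, not normal
```

Both matrices have eigenvalues 1 and 3. The symmetric one attains equality, while the triangular one has a strictly positive gap coming from its off-diagonal entry.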
Corollary 25.
Under the conditions of the theorem, if has hamiltonian cycles, then the expected number of directed hamiltonian cycles in a random Eulerian orientation of is
[TABLE]
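For tiny cases the expectation in Corollary 25 can be computed exactly by brute force, which is useful as a sanity check on the asymptotic formula. The sketch below is restricted to the complete graph on an odd number of vertices, where Eulerian orientations are the regular tournaments. That restriction and the function names are assumptions made for illustration and are not part of the paper.

```python
from itertools import combinations, permutations, product

def eulerian_orientations_of_complete_graph(n):
    """Yield each Eulerian orientation of K_n (n odd) as a set of arcs.
    In K_n every vertex has degree n - 1, so an orientation is Eulerian
    exactly when every out-degree equals (n - 1) / 2."""
    edges = list(combinations(range(n), 2))
    for flips in product((False, True), repeat=len(edges)):
        arcs = {(v, u) if f else (u, v) for (u, v), f in zip(edges, flips)}
        out = [0] * n
        for tail, _head in arcs:
            out[tail] += 1
        if all(2 * d == n - 1 for d in out):
            yield arcs

def directed_hamiltonian_cycles(arcs, n):
    """Count directed Hamiltonian cycles; fixing vertex 0 as the start
    ensures each cycle is counted once."""
    count = 0
    for rest in permutations(range(1, n)):
        cycle = (0,) + rest + (0,)
        if all((cycle[i], cycle[i + 1]) in arcs for i in range(n)):
            count += 1
    return count

def expected_hamiltonian_cycles(n):
    """Average number of directed Hamiltonian cycles over all Eulerian
    orientations of K_n."""
    counts = [directed_hamiltonian_cycles(a, n)
              for a in eulerian_orientations_of_complete_graph(n)]
    return sum(counts) / len(counts)
```

For n = 3 each Eulerian orientation is a directed triangle, so the expectation is 1. For n = 5 every Eulerian orientation is isomorphic to the rotational tournament, which has exactly two directed Hamiltonian cycles, so the expectation is 2.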
5 Appendix
Here we will collect some technical lemmas that are used in the proof. This section is self-contained and does not rely on assumptions other than those stated.
5.1 Weighted graphs and proof of Lemma 8
Lemma 26.
Let be a connected graph of maximum degree . Suppose each edge is assigned a weight and
[TABLE]
Then, for any , there exists a set of edges such that
- (i)
for all ;
- (ii)
the intervals of real numbers cover ;
- (iii)
$\displaystyle|\mathcal{S}|\leqslant 4+\frac{2\log\Bigl(\frac{n(1+\eta)}{2\eta h(G)}\Bigr)}{\log\Bigl(1+\frac{\eta h(G)}{(1+\eta)\varDelta}\Bigr)}$.
Proof.
Consider the spanning subgraph of constructed as follows: each edge is present in if and only if . Note that, for any , we have
[TABLE]
Observing also , we get
[TABLE]
Now we will construct . By applying equation (20) for , we can start with , where and . From here we proceed recursively. Suppose we have edges covering (in the sense of (ii)), where . Applying (20) to and recalling that all vertices have degree at most , there must be at least vertices in that in have neighbours in . So there is some $k\geqslant\ell\bigl(1+\frac{\eta h(G)}{(1+\eta)\varDelta}\bigr)$ such that for some . Adding this edge to means that we have covered . Continuing in this manner, we will have covered while has at most
[TABLE]
edges from . Finally, repeat the process starting at vertex to find a similar set of edges that cover . This completes the proof. ∎
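To get a feel for the size of the edge set guaranteed by Lemma 26, the right-hand side of (iii) can be evaluated numerically. The sketch below is only an evaluation of that expression. The function name `cover_size_bound` and the sample parameter values are hypothetical, and the argument `h` stands for $h(G)$ as defined earlier in the paper.

```python
import math

def cover_size_bound(n, max_degree, eta, h):
    """Right-hand side of Lemma 26(iii):
    4 + 2 log(n(1+eta) / (2 eta h)) / log(1 + eta h / ((1+eta) Delta))."""
    growth = 1 + eta * h / ((1 + eta) * max_degree)
    target = n * (1 + eta) / (2 * eta * h)
    return 4 + 2 * math.log(target) / math.log(growth)

# Illustrative values only: n = 1000 vertices, maximum degree 100, eta = 1.
weak_expansion = cover_size_bound(1000, 100, 1.0, 50)
strong_expansion = cover_size_bound(1000, 100, 1.0, 100)
```

The expression mirrors the recursive construction in the proof: the denominator is the logarithm of the factor by which the covered prefix grows at each step, and the numerator is the logarithm of the total growth required, doubled because the process is run from both ends. A larger value of `h` therefore gives a smaller bound.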
Proof of Lemma 8.
Without loss of generality we may assume . We employ Lemma 26, where for any we take and define by
[TABLE]
Note that . Thus, by assumptions, we get . Take and consider the set constructed in Lemma 26. For , we have
[TABLE]
Also, observe that
[TABLE]
By [20, Thm. 2.2], for we have and also . Now we can calculate
[TABLE]
where
[TABLE]
In each case the bounds on the right-hand side follow from the fact that the supremum occurs as and has the greatest allowed value.
Then, from property (ii) of Lemma 26 and (21), we find that
[TABLE]
where in the sum is ordered as . The result follows on applying the above numerical bounds. ∎
5.2 Matrices and norms
Lemma 27.
Let be a symmetric matrix with nonpositive off-diagonal elements and zero row sums. Suppose the eigenvalues of are . For any real , define the matrix by , where is the decomposition of as a sum of eigenvectors of (numbered consistently with the eigenvalues). Then
[TABLE]
Proof.
Let . The eigenvalues of are , where for each . Since for , we have
[TABLE]
where we have used the fact that . We will now find two different bounds on . First note that so . Second, the maximum eigenvalue of is , so . Combining these two bounds completes the proof. ∎
Corollary 28.
For , consider the positive-definite matrix , where satisfies the conditions of Lemma 27 with . Then, for any real , the positive-definite power satisfies
[TABLE]
where .
Proof.
Since has the same eigenvectors as , and the same eigenvalues except that [math] has been replaced by , we have
[TABLE]
Now we can apply Lemma 27 in the obvious way, using for and $\bigl|\binom{\alpha}{k}\bigr|\leqslant 1$ for and . ∎
In some cases we can improve on Corollary 28. We will only use a bound on .
Lemma 29.
Let be a connected graph of maximum degree . Let be a symmetric matrix with zero row sums such that, for , if and if , for some . Define for . Then, if ,
[TABLE]
Proof.
As in Corollary 28, we have , where is defined in Lemma 27. Moreover,
[TABLE]
where the maximum is taken over such that . Permuting if necessary, we can assume that the maximum occurs for with . Let , and for and , put . Observe that, for ,
[TABLE]
from which it follows that for ,
[TABLE]
taking in the sum. Since we have , so by the definition of we have
[TABLE]
Thus, defining as in Lemma 26, we have . Since , we have . Taking the set of edges guaranteed by Lemma 26 with , we find that
[TABLE]
To complete the numerical bound, continue as in the proof of Lemma 8; we omit the uninteresting details. ∎
5.3 Short paths
Lemma 30.
Let be a graph of maximum degree . Assume also that for some . For any two disjoint sets of vertices , denote
[TABLE]
Then there exist at least $\gamma\varDelta\,\frac{\min\{|U_{1}|,|U_{2}|\}}{2\ell(U_{1},U_{2})}$ pairwise edge-disjoint paths in with one end in and the other end in of lengths bounded above by .
Proof.
Let be the number of vertices of . Denote . Without loss of generality we may assume that because we can always remove some vertices from the larger set. We call a path short if it has length at most . For a subgraph denote
[TABLE]
Starting from , we construct the required set of short paths by repeating the following procedure.
- (1)
If then do (2), otherwise STOP.
- (2)
Find a path in of length at most
[TABLE]
Add to the set of constructed paths. Delete the edges of from and repeat from (1).
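The procedure above is a greedy loop: find a short path, record it, delete its edges, and repeat. A minimal sketch of such a loop follows, using breadth-first search to find each path. It makes two assumptions that are not taken from the paper: the stopping rule is simply that no path of length at most `max_len` remains (the paper's test in step (1) is stated in terms of a quantity defined for the current subgraph), and `max_len` is passed in as a parameter rather than computed from the displayed bound. All function names are hypothetical.

```python
from collections import deque

def short_path(adj, sources, targets, max_len):
    """Breadth-first search from `sources`; return a shortest path to
    `targets` with at most `max_len` edges, or None if there is none."""
    parent = {s: None for s in sources}
    depth = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        v = queue.popleft()
        if v in targets:
            path = []
            while v is not None:
                path.append(v)
                v = parent[v]
            return path[::-1]
        if depth[v] == max_len:
            continue  # do not extend paths beyond the length limit
        for w in sorted(adj[v]):
            if w not in parent:
                parent[w] = v
                depth[w] = depth[v] + 1
                queue.append(w)
    return None

def greedy_edge_disjoint_paths(edges, sources, targets, max_len):
    """Repeatedly extract a short path and delete its edges."""
    if set(sources) & set(targets):
        raise ValueError("sources and targets must be disjoint")
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    for v in list(sources) + list(targets):
        adj.setdefault(v, set())
    paths = []
    while True:
        path = short_path(adj, sorted(sources), set(targets), max_len)
        if path is None:
            return paths
        paths.append(path)
        for u, v in zip(path, path[1:]):
            adj[u].discard(v)
            adj[v].discard(u)

# A 4-cycle contains two edge-disjoint paths of length 2 from vertex 0 to vertex 3.
cycle_paths = greedy_edge_disjoint_paths([(0, 1), (1, 3), (0, 2), (2, 3)], [0], [3], 2)
```

Because each recorded path has its edges removed before the next search, the returned paths are edge-disjoint by construction. The lemma's contribution is the lower bound on how many paths such a loop must produce before it stops.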
Suppose we found fewer than paths by the procedure above, so that, in particular, we deleted fewer than edges. Therefore, for any such that ,
[TABLE]
Thus, .
Now, we explain why (1) implies the existence of a short path from to . Indeed, for , we have
[TABLE]
where denotes the neighbourhood of in . Since the number of edges from any vertex of to is bounded by , we get that
[TABLE]
Therefore, we can reach more than vertices starting from (or from ) by paths of length at most $\log_{1+\gamma/2}\bigl(\frac{n}{2u}\bigr)$. Alternatively, since , we can reach more than vertices starting from by paths of length at most $\log_{1+\gamma/2}\bigl(\frac{n}{\gamma\varDelta}\bigr)$ (and the same holds for ). Therefore, we can find a vertex which is not too distant from both and and construct the required short path.
Our procedure will eventually stop since is finite. As shown above, this can only happen after we have found at least edge-disjoint short paths from to . This completes the proof. ∎
5.4 Integration theorem
For the reader’s convenience, we quote [12, Lemma 4.6] and [12, Theorem 4.4] with very minor changes to match the notation of this paper.
If is a linear operator, let .
Lemma 31.
Let be linear operators such that and . Let denote the dimension of . Suppose and . For any , define
[TABLE]
Then, if the integrals exist,
[TABLE]
where
[TABLE]
Moreover, if for some then
[TABLE]
for any linear operators such that is equal to the identity operator on .
For a domain and a twice continuously-differentiable function , define
[TABLE]
For a complex number we denote by and the real and imaginary parts, respectively.
Theorem 32.
Let be nonnegative real constants with . Let be an positive-definite symmetric real matrix and let be a real matrix such that .
Let be a measurable set such that , and let , and be twice continuously-differentiable functions. We make the following assumptions.
- (a)
.
- (b)
For ,
for and
.
- (c)
For , . For , either
(i) for , or
(ii) for and
.
- (d)
for .
Let be a random variable with the normal density . Then, provided and are finite and is bounded in ,
[TABLE]
where, for some constant depending only on ,
[TABLE]
In particular, if and , we can take .
- [1] N. Alon and V. D. Milman, $\lambda_{1}$, isoperimetric inequalities for graphs, and superconcentrators, J. Combin. Theory Ser. B, 38 (1985) 73–88.
- [2] A. Barvinok and J. A. Hartigan, An asymptotic formula for the number of non-negative integer matrices with prescribed row and column sums, Trans. Amer. Math. Soc., 364 (2012) 4323–4368.
- [3] R. A. Bradley and M. E. Terry, Rank analysis of incomplete block designs: I. The method of paired comparisons, Biometrika, 39 (1952) 324–345.
- [4] E. R. Canfield, C. Greenhill, and B. D. McKay, Asymptotic enumeration of dense 0-1 matrices with specified line sums, J. Combin. Theory Ser. A, 115 (2008) 32–66.
- [5] L. R. Ford, Jr., Solution of a ranking problem from binary comparisons, Amer. Math. Monthly, 64, part 2 (1957) 28–33.
- [6] L. R. Ford, Jr. and D. R. Fulkerson, Maximum flow through a network, Canad. J. Math., 8 (1956) 399–404.
- [7] Z. Gao, B. D. McKay and X. Wang, Asymptotic enumeration of tournaments with a given score sequence containing a specified digraph, Random Structures Algorithms, 16 (2000) 47–57.
- [8] C. S. Greenhill and B. D. McKay, Random dense bipartite graphs and directed graphs with specified degrees, Random Structures Algorithms, 35 (2009) 222–249.
