A randomly weighted minimum spanning tree with a random cost constraint
Alan Frieze, Tomasz Tkocz

TL;DR
This paper analyzes the minimum spanning tree problem on a complete graph with edges having random weights and costs, establishing asymptotic optimal values under a cost constraint for various parameter ranges.
Contribution
It introduces a novel analysis of the minimum spanning tree problem with random weights and costs under a constraint, deriving asymptotic solutions using duality methods.
Findings
Asymptotic optimal weight values are derived for different parameters.
The study extends classical MST analysis to incorporate random costs and constraints.
Dual problem considerations are key to the analysis.
Abstract
We study the minimum spanning tree problem on the complete graph $K_n$ where an edge $e$ has a weight $W_e$ and a cost $C_e$, each of which is an independent copy of the random variable $U^\gamma$ where $\gamma\le 1$ and $U$ is the uniform $[0,1]$ random variable. There is also a constraint that the spanning tree $T$ must satisfy $C(T)\le c_0$. We establish, for a range of values for $c_0,\gamma$, the asymptotic value of the optimum weight via the consideration of a dual problem.
Department of Mathematical Sciences
Carnegie Mellon University
Pittsburgh PA15217
U.S.A.
Research supported in part by NSF grant DMS1661063.
2010 Mathematics Subject Classification. 05C80, 90C27.
Key words. Random Minimum Spanning Tree, Cost Constraint.
1 Introduction
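Let $U$ denote the uniform $[0,1]$ random variable and let $0<\gamma\le 1$. We consider the minimum spanning tree problem in the context of the complete graph $K_n$ where each edge $e$ has an independent copy of $U^\gamma$ for weight $W_e$ and an independent copy of $U^\gamma$ for cost $C_e$. Let $\mathcal{T}$ denote the set of spanning trees of $K_n$. The weight of a spanning tree $T$ is given by $W(T)=\sum_{e\in T}W_e$ and its cost is given by $C(T)=\sum_{e\in T}C_e$. The problem we study is
\[
\text{Minimise } W(T) \quad \text{subject to} \quad T\in\mathcal{T},\ C(T)\le c_0, \tag{1}
\]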
where may depend on . We let denote the optimum value to (1).
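For very small instances, problem (1) can be solved by exhaustive enumeration, which is a handy reference point when experimenting. The following is a minimal sketch, assuming uniform $[0,1]$ weights and costs (the case treated first in the paper); the function names `spanning_trees`, `constrained_mst_brute_force` and `random_instance` are hypothetical and chosen only for illustration.

```python
import itertools
import random


def spanning_trees(n):
    """Yield every spanning tree of K_n as a tuple of edges (i, j), i < j."""
    edges = list(itertools.combinations(range(n), 2))
    for subset in itertools.combinations(edges, n - 1):
        parent = list(range(n))

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        acyclic = True
        for i, j in subset:
            ri, rj = find(i), find(j)
            if ri == rj:
                acyclic = False
                break
            parent[ri] = rj
        if acyclic:  # n - 1 acyclic edges on n vertices form a spanning tree
            yield subset


def constrained_mst_brute_force(weight, cost, n, c0):
    """Return (optimal weight, tree) over trees with cost <= c0, or None."""
    best = None
    for tree in spanning_trees(n):
        if sum(cost[e] for e in tree) <= c0:
            w = sum(weight[e] for e in tree)
            if best is None or w < best[0]:
                best = (w, tree)
    return best


def random_instance(n, seed=0):
    """Independent uniform [0, 1] weight and cost on every edge of K_n."""
    rng = random.Random(seed)
    edges = list(itertools.combinations(range(n), 2))
    weight = {e: rng.random() for e in edges}
    cost = {e: rng.random() for e in edges}
    return weight, cost
```

The enumeration visits all $\binom{\binom{n}{2}}{n-1}$ edge subsets, so it is only practical for $n$ up to about 7; it returns `None` when no tree meets the cost bound.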
The unconstrained case of this question ($c_0=\infty$) has been well studied and is well understood; see Frieze [6], Steele [16], Janson [12], Penrose [15], Frieze and McDiarmid [7], Frieze, Ruszinkó and Thoma [8], Beveridge, Frieze and McDiarmid [2], Li and Zhang [14] and Cooper, Frieze, Ince, Janson and Spencer [5]. For example, [5] proves that if denotes the expected minimum weight of a spanning tree then
[TABLE]
for explicitly defined . Here and throughout, is the zeta function.
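The leading term in the unconstrained uniform case is the classical limit $\zeta(3)\approx 1.202$ of Frieze [6]. A quick simulation makes this visible at moderate $n$. This is a sketch only: the helper names `mst_weight` and `average_mst_weight` are hypothetical, Kruskal's algorithm is used as a convenient MST routine, and the lower-order correction terms are not modelled.

```python
import itertools
import random


def mst_weight(n, weight):
    """Kruskal's algorithm on K_n; `weight` maps (i, j), i < j, to a float."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, used = 0.0, 0
    for (i, j) in sorted(weight, key=weight.get):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            total += weight[(i, j)]
            used += 1
            if used == n - 1:
                break
    return total


def average_mst_weight(n, trials, seed=0):
    """Average MST weight over independent uniform [0, 1] edge weights."""
    rng = random.Random(seed)
    edges = list(itertools.combinations(range(n), 2))
    runs = [mst_weight(n, {e: rng.random() for e in edges}) for _ in range(trials)]
    return sum(runs) / trials


# Partial sum of the series defining zeta(3); the tail is below 1e-10.
ZETA_3 = sum(1.0 / k**3 for k in range(1, 100000))
```

Calling `average_mst_weight(150, 10)` should give a value close to `ZETA_3`, with fluctuations of order $n^{-1/2}$ per trial.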
Equation (1) defines a natural problem that has been considered in the literature, in the worst-case rather than the average case. See for example Aggarwal, Aneja and Nair [1] and Guignard and Rosenwein [11] (for a directed version) and Goemans and Ravi [10].
We first consider the simpler case where . We need to make the following definitions:
[TABLE]
[TABLE]
where
[TABLE]
and is the gamma function.
Theorem 1.
The following hold w.h.p.:
(1) If
[TABLE]
then
[TABLE]
(2) Suppose now that where is a positive constant.
  (i) If then
  [TABLE]
  (ii) If and if is the solution to
  [TABLE]
  then
  [TABLE]
(3) Suppose now that where is a positive constant.
  (i) If then there is no feasible solution to (1).
  (ii) If and if is the solution to
  [TABLE]
  then
  [TABLE]
For the case we will prove the following.
Theorem 2.
Suppose that
[TABLE]
Then the following holds w.h.p.
[TABLE]
where .
Note that and this implies that the expression in (12) is consistent with the expression in (6).
We will first concentrate on the case . After this, we will continue with the proof of Theorem 2. We note that a preliminary version containing the results for the case appeared in [9]. The weights and costs will therefore be uniform until we reach the more general case in Section 5. We will then prove Theorem 2 as stated and then show how to extend this result to a wider class of distributions via a simple coupling argument from Janson [13].
2 Outline Proof for
We tackle (1) by considering the dual problem:
[TABLE]
We note that
[TABLE]
We will show that w.h.p.
[TABLE]
Here is an abbreviation for as , assuming that .
We use a standard integral formula to compute in Section 3.1. This is straightforward, but lengthy. We then prove concentration around the mean in Section 3.2. We then use a result of Goemans and Ravi [10] to show in Section 4 that in the cases discussed, the duality gap is negligible w.h.p.
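The dual approach outlined in this section is easy to experiment with numerically: for a fixed multiplier $\lambda\ge 0$ the dual function $\phi(\lambda)=\min_{T}\{W(T)+\lambda C(T)\}-\lambda c_0$ is an ordinary minimum spanning tree computation with edge weights $W_e+\lambda C_e$, and every such value is a lower bound on the constrained optimum (weak duality). The sketch below assumes uniform $[0,1]$ weights and costs in its usage; the names `kruskal`, `phi` and `best_dual_bound`, and the choice of a plain grid over $\lambda$, are illustrative assumptions rather than the paper's method.

```python
import itertools
import random


def kruskal(n, key):
    """Return the edges of a minimum spanning tree of K_n under edge keys `key`."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for (i, j) in sorted(key, key=key.get):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
            if len(tree) == n - 1:
                break
    return tree


def phi(lam, weight, cost, n, c0):
    """Dual function: min over trees of W(T) + lam * C(T), minus lam * c0."""
    combined = {e: weight[e] + lam * cost[e] for e in weight}
    tree = kruskal(n, combined)
    return sum(combined[e] for e in tree) - lam * c0


def best_dual_bound(weight, cost, n, c0, lam_grid):
    """Largest dual value over a finite grid of non-negative multipliers."""
    return max(phi(lam, weight, cost, n, c0) for lam in lam_grid)
```

Any feasible tree $T$ satisfies $\phi(\lambda)\le W(T)+\lambda(C(T)-c_0)\le W(T)$ for $\lambda\ge 0$, which is the inequality the proof starts from; the work in the paper is in showing that the gap is negligible w.h.p.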
2.1 Consistency in Theorem 1
Before continuing, we will check that the claims in Cases (2) and (3) are intuitively reasonable. First consider Case (2). If and if is the tree minimising then w.h.p. and .
We observe next that . This follows directly from
[TABLE]
It is shown in an appendix that
[TABLE]
As such has a Lipschitz continuous inverse. By inspection we see that .
Note also that (use L’Hôpital’s rule) and
[TABLE]
and so (7) and (8) are consistent with (i) when .
If then from the above properties of we see that (7) has a unique positive solution. We derive expression (8) below.
Now consider Case (3). If then w.h.p. there is no tree with . If , then , and
[TABLE]
This implies that (9) has a unique positive solution. We derive expression (10) below.
3 Evaluation of the dual problem
3.1 Expectation
Lemma 3.
Let and let be the total weight of a minimum spanning tree in the complete graph on vertices with each edge having weight , where and are i.i.d. random variables uniform on . We have
a. If , then
[TABLE]
b. If , then
[TABLE]
c. If , then
[TABLE]
The implied terms in the above expressions can be taken to be independent of . Also, we have not optimised all constants.
Proof.
Let be a minimum spanning tree. The starting point is Janson’s formula [12],
[TABLE]
where is the number of components in the random graph on vertices with the edge set . Since the are i.i.d., this is the random graph , with . Since , for , so the last integral can be taken from $0$ to and after a change of variables , we get
[TABLE]
where
[TABLE]
where in the last expression denotes Lebesgue measure. An elementary computation (given in an appendix) yields
[TABLE]
For convenience, we also include an expression for the inverse function (we need this later when we change variables in integration).
[TABLE]
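Janson's formula, the starting point of this proof, is the expectation of a pathwise identity: for any fixed non-negative edge weights, the weight of the minimum spanning tree equals $\int_0^\infty(\kappa(G_p)-1)\,\mathrm{d}p$, where $G_p$ keeps the edges of weight at most $p$ and $\kappa$ counts components. The reason is that the tree has exactly $\kappa(G_p)-1$ edges of weight greater than $p$. The sketch below checks the identity on one sample; the function name is hypothetical and the integral is evaluated exactly, since $\kappa(G_p)$ is a step function of $p$.

```python
import itertools
import random


def mst_weight_and_component_integral(n, weight):
    """Return (W(MST), integral over p >= 0 of kappa(G_p) - 1).

    G_p is the graph on n vertices containing the edges of weight <= p and
    kappa counts its connected components. Weights must be non-negative.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    mst_total, integral, previous = 0.0, 0.0, 0.0
    for (i, j) in sorted(weight, key=weight.get):
        p = weight[(i, j)]
        # kappa(G_t) is constant for t in [previous, p)
        integral += (components - 1) * (p - previous)
        previous = p
        ri, rj = find(i), find(j)
        if ri != rj:  # Kruskal step: this edge joins two components
            parent[ri] = rj
            components -= 1
            mst_total += p
    return mst_total, integral
```

On the triangle with weights 1, 2, 3 both quantities equal 3: the integrand is 2 on $[0,1)$, 1 on $[1,2)$ and 0 afterwards.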
Now we can proceed with evaluating given by (22). First observe that if then we have
[TABLE]
This is because
[TABLE]
Therefore we can distinguish the following cases depending on the value of .
Case 1. . Note that then
[TABLE]
so by (25), the integration over the second and third range from (23) gives the contribution in (22). Consequently,
[TABLE]
By the same reason, we also have
[TABLE]
Thus
[TABLE]
Changing the variables yields
[TABLE]
It remains to deal with the integral $\int_{0}^{1}\mathbf{E}\big(\kappa(G_{n,q})-1\big)\frac{\mathrm{d}q}{\sqrt{q}}$. As before, thanks to (25), we have
[TABLE]
Decompose
[TABLE]
where is the number of components which are vertex trees, is the number of non-tree components on vertices and is the number of components on at least vertices. Here we set .
For the tree components, we have
[TABLE]
For and , we have and , hence
[TABLE]
Thus
[TABLE]
Setting gives
[TABLE]
Using as , for and , we have . Therefore
[TABLE]
If the integral were from $0$ to $\infty$, we could express it using the gamma function. Since, crudely on the domain of integration,
[TABLE]
and for on the right hand side we get , whereas for we get
[TABLE]
We can conclude that
[TABLE]
It remains to compute the sum over . We have
[TABLE]
Since for , , the series converges and we have
[TABLE]
where
[TABLE]
To bound the contribution from non-tree components, note that
[TABLE]
Thus
[TABLE]
so
[TABLE]
Finally, for the large components, since
[TABLE]
we get , so we have
[TABLE]
Combining (33), (36), (38) with (29) and plugging into (28), we obtain
[TABLE]
In view of (27) this gives (18).
Case 2. . Then plainly and . Since , for , in view of (25), the third range in (23), that is , gives the contribution in (22). For the remaining two ranges, changing the variables in (22) gives
[TABLE]
By (25), for the second integral we get
[TABLE]
so
[TABLE]
We again decompose as in (29). Here we set . First we show that the and have small contribution in the integrals above. By (35),
[TABLE]
and similarly
[TABLE]
By (37),
[TABLE]
Putting the last three estimates together with (39) yields
[TABLE]
Using (30) and repeating verbatim the arguments following it to bound , to change the variables and to replace with , we obtain
[TABLE]
As in Case 1, , so we can replace the integral with. Moreover, crude estimates show that
[TABLE]
Thus finally
[TABLE]
Note that in the first integral, we have , hence the main term (the sum over ) is lower-bounded by and consequently, the term can be incorporated into the term, which gives (19).
Case 3. . Then plainly and . Changing the variables in (22) yields
[TABLE]
Since , in view of (25), the third integral gives
[TABLE]
Similarly, for the second integral we have
[TABLE]
Thus we can write (we incorporate the term in )
[TABLE]
The expression in the bracket is exactly (39) with being replaced by . Therefore, from (19), we obtain (20). ∎
Lemma 4.
With the notation of Lemma 3, if , we have
[TABLE]
and with probability ,
[TABLE]
where and is the minimum spanning tree with weights .
Also in Case 3 we have
[TABLE]
where .
Proof.
The claims concerning follow directly from (18), (19), (20).
To justify (42), fix and let be the number of edges on the minimum spanning tree having weights above . By Janson’s formula from [12], with given by (23). By the first moment, . By (26), choosing such that gives , equivalently , with probability . It remains to bound . In Case 1, we see from (23) that , so . In Case 2 we see that we have to use the second formula in (23) and . Similarly in Case 3, , hence .
For (43), we note that . Putting we see that with the required probability, the random graph is connected. This implies that with the same probability there is a spanning tree with . It follows that a spanning tree that minimises will have . (Applying the greedy algorithm will finish before needing an edge with .) So and consequently . ∎
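The greedy-algorithm remark in the proof above rests on a general fact: the largest edge weight used by Kruskal's algorithm equals the smallest threshold at which the graph of edges below the threshold becomes connected. A minimal sketch of that fact follows, for arbitrary edge weights (in the proof they are the combined weights); the names `mst_bottleneck` and `is_connected` are hypothetical.

```python
import itertools
import random


def mst_bottleneck(n, weight):
    """Largest edge weight used by Kruskal's greedy algorithm on K_n."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    largest, used = 0.0, 0
    for (i, j) in sorted(weight, key=weight.get):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            largest = weight[(i, j)]  # edges arrive in increasing order
            used += 1
            if used == n - 1:
                break
    return largest


def is_connected(n, weight, threshold):
    """True if the edges of weight <= threshold connect all n vertices."""
    adjacency = {v: [] for v in range(n)}
    for (i, j), w in weight.items():
        if w <= threshold:
            adjacency[i].append(j)
            adjacency[j].append(i)
    seen, stack = {0}, [0]
    while stack:
        v = stack.pop()
        for u in adjacency[v]:
            if u not in seen:
                seen.add(u)
                stack.append(u)
    return len(seen) == n
```

So a connectivity threshold for the random graph of light edges immediately bounds every edge of the minimum spanning tree, which is how the argument is used here.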
3.2 Concentration
The goal of this section is to prove the following lemma.
Lemma 5.
For a fixed and ,
[TABLE]
Proof.
Recall that $\phi(\lambda)=\min\{W(T)+\lambda C(T): T\in\mathcal{T}\}-\lambda c_{0}=L_{n}(\lambda)-\lambda c_{0}$ (as defined in (13)).
In our analysis we consider separately the contribution of long and short edges. Let and let denote the total cost of the edges used on the minimum spanning tree with . Let and note that is a function of i.i.d. random variables .
We will show is concentrated using a variant of the Symmetric Logarithmic Sobolev Inequality from [3]. Let denote the same quantity as , but with the variable replaced by an independent copy . Then a simplified form of the Symmetric Logarithmic Sobolev Inequality [3, Corollary 3] says that if
[TABLE]
then for all ,
[TABLE]
and if
[TABLE]
then for all ,
[TABLE]
Changing the value of one edge can change the value of by at most , so . Let denote the indices of the edges which contribute to . If then implies . So
[TABLE]
Now where . Then, since there are fewer than terms in the first sum and fewer than terms in the second sum, we have
[TABLE]
If then we also have that implies . So we also have
[TABLE]
Therefore,
[TABLE]
where we have used , see Lemma 4 and are universal constants.
Let denote the total cost of the edges used with edge cost at least . We have from Lemma 4 that for some , with probability ,
[TABLE]
And so with probability . ∎
3.3 Optimising over
The first thing to observe is that is a concave function of , see for example Boyd and Vandenberghe [4]. This is because it is the minimum of a collection of linear functions. Ignoring the factor, it will be differentiable. It follows then that we can maximise by setting its (asymptotic) derivative to zero. On the other hand, by concentration is close to . We first maximize .
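Concavity of the dual function can be checked directly on a sample. If $T_\lambda$ is a tree attaining the minimum at $\lambda$, then $C(T_\lambda)-c_0$ is a supergradient: $\phi(\mu)\le\phi(\lambda)+(\mu-\lambda)\,(C(T_\lambda)-c_0)$ for all $\mu$, because $T_\lambda$ is one of the linear functions in the minimum. The sketch below returns both the value and this slope; the names `kruskal` and `dual_value_and_slope` are hypothetical, and uniform $[0,1]$ data is assumed in the usage.

```python
import itertools
import random


def kruskal(n, key):
    """Return the edges of a minimum spanning tree of K_n under edge keys `key`."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for (i, j) in sorted(key, key=key.get):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
            if len(tree) == n - 1:
                break
    return tree


def dual_value_and_slope(lam, weight, cost, n, c0):
    """Return (phi(lam), C(T_lam) - c0) for a minimising tree T_lam."""
    combined = {e: weight[e] + lam * cost[e] for e in weight}
    tree = kruskal(n, combined)
    tree_weight = sum(weight[e] for e in tree)
    tree_cost = sum(cost[e] for e in tree)
    slope = tree_cost - c0  # a supergradient of phi at lam
    return tree_weight + lam * slope, slope
```

A maximiser of $\phi$ is therefore a multiplier at which the slope changes sign, that is, where the minimising tree has cost close to $c_0$; this matches the heuristic of setting the asymptotic derivative to zero.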
Lemma 6.
In cases (1), (2), (3) of Theorem 1, we respectively have
[TABLE]
[TABLE]
[TABLE]
Moreover, the maximizer in each case satisfies .
Proof.
For , we have
[TABLE]
Differentiating (ignoring the term) and setting it to zero we see that is maximised at
[TABLE]
and that . Note that for as in (1). This gives (50).
Now let where . We proceed as before. Putting and into the expression in (19) we get
[TABLE]
Differentiating w.r.t. we get
[TABLE]
and hence the solution to asymptotically satisfies . Clearly which implies that and so as claimed. Then (51) follows.
Finally, let where . In this case we put and proceed as before. Putting into the expression in (20) we get
[TABLE]
Differentiating w.r.t. we get
[TABLE]
and hence the solution to asymptotically satisfies . Clearly which implies that . Then (52) follows. ∎
To finish, we divide the interval (with being an appropriate universal constant) into sub-intervals of equal length less than . Suppose that the th interval is . We observe that for any spanning tree we have that for ,
[TABLE]
and so
[TABLE]
So, maximising over makes an error in maximising over of at most .
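The discretisation step can be illustrated as follows. Since each tree contributes a linear function of $\lambda$ with slope $C(T)-c_0$, and costs in $[0,1]$ give $-c_0\le C(T)-c_0\le n-1-c_0$, the dual function is Lipschitz and a grid of spacing $h$ loses at most $h\cdot\max(c_0,\,n-1-c_0)$. This is a sketch under those assumptions (uniform $[0,1]$ costs, an arbitrary search interval $[0,\lambda_{\max}]$); the function names are hypothetical and the constant is not the one used in the paper.

```python
import itertools
import random


def dual_value(lam, weight, cost, n, c0):
    """phi(lam) = min over trees of W(T) + lam * C(T), minus lam * c0 (Kruskal)."""
    combined = {e: weight[e] + lam * cost[e] for e in weight}
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, used = 0.0, 0
    for (i, j) in sorted(combined, key=combined.get):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            total += combined[(i, j)]
            used += 1
            if used == n - 1:
                break
    return total - lam * c0


def grid_maximum(weight, cost, n, c0, lam_max, pieces):
    """Maximise phi over pieces + 1 equally spaced points of [0, lam_max]."""
    step = lam_max / pieces
    best = max(dual_value(k * step, weight, cost, n, c0) for k in range(pieces + 1))
    return best, step


def lipschitz_constant(n, c0):
    """Bound on |C(T) - c0| when every edge cost lies in [0, 1]."""
    return max(c0, n - 1 - c0)
```

Refining the grid can only increase the maximum found, and by no more than the coarse spacing times `lipschitz_constant(n, c0)`.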
Using the concentration result (44) of Section 3.2, we see that for a fixed , there is with such that we have
[TABLE]
We see therefore that w.h.p. the expression for in (56) holds simultaneously for all . Therefore, by Lemma 6, we obtain in Case (1), (2), (3) of Theorem 1, respectively that
[TABLE]
[TABLE]
where is the unique solution to (see (7), (8)) and
[TABLE]
One final point. Our expressions for are only valid within a certain range. But because is concave and we have a vanishing derivative, we know that the values outside the range cannot be maximal.
4 Proof of Theorem 1
We will use Theorem 3.1 from Goemans and Ravi [10]:
Theorem 7 ([10]).
There exists a spanning tree such that and , where is the maximum cost of an edge of .
For Cases a and b from Lemma 3 we let where where is a suitable hidden constant for (42) and is the RHS of (42). Suppose now that we replace by and let denote the minimum weight of a tree with cost at most . Applying Theorem 7 we obtain a spanning tree such that and . It only remains to show that w.h.p. . This follows from our expressions for in Section 3.3 and the fact that , which we verify now.
In Case a we have from (53) that,
[TABLE]
In Case b we have , , and so .
For Case c we let and proceed as above. We find that once again because of the expression (59) for in Section 3.3 and the fact that . We then use Theorem 7 and (43) to show that
[TABLE]
This completes the proof of Theorem 1.
5 More general distributions
We now consider the case where we have distributed as independent copies of , . We follow the same ideas as for , but there are technical difficulties. Let us first though explain the need for the lower bound on in Theorem 2, up to a logarithmic factor.
Lemma 8.
Let be independent copies of and let . Then
[TABLE]
Proof.
[TABLE]
∎
It follows from (60) that the expected weight of a minimum spanning tree is . To see this, orient the edges of the minimum weight spanning tree away from vertex 1. Associate each edge with its tail (closest to vertex 1). Then each edge has expected weight at least that given in Lemma 8.
We can use the argument of Section 3.2 with to show concentration around the mean. Because , the right-hand sides of (46) and (47) become . Consequently (48) becomes
[TABLE]
Now if then . So, with probability , the edges of weight at most induce a connected graph and we have that . Plugging this into (61) we see that
[TABLE]
We have and so with probability . In conclusion, w.h.p.
We now turn to estimating the dual value, the equivalent of Lemma 3.
5.1 Expectation
In this section, we estimate the expected weight of the minimum spanning tree with edge weights for independent copies of .
Lemma 9.
Let , and let be the total weight of a minimum spanning tree in the complete graph on vertices with each edge having weight , where and are i.i.d. copies of . Assuming
[TABLE]
we have
[TABLE]
where
[TABLE]
The implied terms in the above expressions can be taken to be independent of . Also, we have not optimised all constants.
Proof.
We follow closely the proof of Lemma 3 which concerns . Janson’s formula (21) gives
[TABLE]
where
[TABLE]
Case 1, :
[TABLE]
If then
[TABLE]
Let
[TABLE]
(our assumption on is chosen such that this is possible, i.e. this value of is less than one). Then, thanks to (25),
[TABLE]
It remains to handle the last integral. Repeating verbatim all the computations of Lemma 3 from (29) to (38) (the only difference being that is replaced by in the integrand), we get
[TABLE]
where the error terms come from appropriate changes in (31) (36), (38). The constant comes from (32) and equals
[TABLE]
Plugging this back into (68), we conclude that
[TABLE]
with
[TABLE]
Case 2, : We set in (65) which yields
[TABLE]
and now (because and are assumed to have the same distribution), so using the previously analysed case for , we get
[TABLE]
This completes the proof of the lemma. ∎
5.2 Concentration
We follow the argument of Section 3.2.
Lemma 10.
Let . Then,
[TABLE]
Proof.
Let . We argue that where , giving (46) and (47) as before. It then follows that
[TABLE]
Plugging (70) into the RHS of (71) and noting that , we obtain
[TABLE]
Now because , where is as in (67), we see that with probability . ∎
We divide the interval into sub-intervals as before and optimise the by maximising
[TABLE]
Solving we get
[TABLE]
Observe that our assumptions on imply that satisfies (62).
After this, we can follow the proof of the case . We only need to check now that the argument of Section 4 is still valid. We know that with probability that for all edges of the minimum spanning tree. Here is as defined in (67) and we note that . This follows from
[TABLE]
We may therefore proceed as in Section 4 with and this completes the proof of Theorem 2.
6 Conclusion
We have determined the asymptotic optimum value to Problem (1) w.h.p. The proof is constructive in that we can w.h.p. get an asymptotically optimal solution to (1) by computing of the previous section. When weights and costs are uniform , our theorem covers almost all of the possibilities for , although there are some small gaps between the 3 cases. Our results for more general distributions have a more limited range for , and further research is needed to extend this part of the paper.
The present result assumes that cost and weight are independent. It would be more reasonable to assume some positive correlation. This could be the subject of future research. One could also consider more than one constraint, but then we might lose Theorem 7.
Appendix A Proof of (17)
We want to show that is strictly decreasing on , where
[TABLE]
We have
[TABLE]
Call the right hand side . We want to show that it is positive for every . We have , so it is enough to show that is positive for every . We have
[TABLE]
and want to show that the sum on the right hand side is positive for every . Note that for , we have for every , so the sum is positive in this case. Let . Separating the first two terms, we rewrite the condition that the sum is positive as
[TABLE]
Equivalently, multiplying by , we want to show that for every ,
[TABLE]
Let . Estimating crudely , using and then bounding for , we get
[TABLE]
Moreover, we have
[TABLE]
(shown below) which finishes the proof in this case.
Let . Estimating crudely , using and then bounding $\big(\beta e^{1-\beta}\big)^{k-2}<\beta e^{1-\beta}$ for , we get
[TABLE]
where it can be checked numerically that . Moreover, we have
[TABLE]
(shown below) which finishes the proof in this case.
It remains to prove (73) and (74).
Showing (73) is equivalent to showing that the function
[TABLE]
is positive on . We numerically check that and it suffices to show that is decreasing on . We find that
[TABLE]
Call the right hand side . We have and for ,
[TABLE]
which shows that decreases, hence is negative, hence is negative, hence decreases.
Showing (74) is equivalent to showing that the function
[TABLE]
is positive on . For , we have
[TABLE]
(we used that decreases on ). This shows that increases on , hence for .
Appendix B Proof of (23)
We need to compute the area of the subset $\{(x,y)\in[0,1]^2 : x+\lambda y\le p\}$ of the unit square $[0,1]^2$. The line $x+\lambda y=p$ intersects the $x$ and $y$ axes respectively at $p$ and $p/\lambda$. Thus when both $p$ and $p/\lambda$ are less than $1$, the subset is a right triangle whose area is $\frac{p^2}{2\lambda}$. This gives the formula in the first case of (23). When exactly one of $p$ and $p/\lambda$ is less than $1$ and the other one is greater than $1$, the subset is a trapezoid and computing its area gives the formula in the second case of (23). Finally, if both $p$ and $p/\lambda$ are greater than $1$, the subset is the complement of a right triangle and the formula in the third case of (23) follows from the first one by changing $p$ to $1+\lambda-p$ and taking the complement.
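The three-case area computation can be written out and checked numerically. The sketch below evaluates $\mathbf{P}(X+\lambda Y\le p)$ for independent uniform $[0,1]$ variables $X,Y$ by the triangle, trapezoid and complement-of-triangle cases, and compares it with a midpoint-rule integration of $\mathbf{P}(X\le p-\lambda y)$ over $y$. The symbols $X$, $Y$, $p$ and the function names are assumptions made for this illustration and need not match the notation of (23).

```python
def cdf_weighted_sum(p, lam):
    """P(X + lam * Y <= p) for independent X, Y uniform on [0, 1], lam > 0."""
    lo, hi = min(1.0, lam), max(1.0, lam)
    if p <= 0.0:
        return 0.0
    if p >= 1.0 + lam:
        return 1.0
    if p <= lo:
        # right triangle with legs p and p / lam
        return p * p / (2.0 * lam)
    if p <= hi:
        # trapezoid; which pair of sides is parallel depends on lam
        return p - lam / 2.0 if lam <= 1.0 else (p - 0.5) / lam
    # complement of the right triangle obtained by replacing p with 1 + lam - p
    q = 1.0 + lam - p
    return 1.0 - q * q / (2.0 * lam)


def cdf_numeric(p, lam, m=20000):
    """Midpoint rule for the integral over y in [0, 1] of P(X <= p - lam * y)."""
    h = 1.0 / m
    total = 0.0
    for k in range(m):
        y = (k + 0.5) * h
        total += min(1.0, max(0.0, p - lam * y))
    return total * h
```

The closed form is continuous at the two breakpoints $\min(1,\lambda)$ and $\max(1,\lambda)$, which is a useful sanity check on the case boundaries.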
References
[1] V. Aggarwal, Y. Aneja and K. Nair, Minimal spanning tree subject to a side constraint, Computers and Operations Research 9 (1982) 287–296.
[2] A. Beveridge, A. M. Frieze and C. J. H. McDiarmid, Minimum length spanning trees in regular graphs, Combinatorica 18 (1998) 311–333.
[3] S. Boucheron, G. Lugosi and P. Massart, Concentration inequalities using the entropy method, Annals of Probability 31 (2003) 1583–1614.
[4] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, 2004.
[5] C. Cooper, A. M. Frieze, N. Ince, S. Janson and J. Spencer, On the length of a random minimum spanning tree, Combinatorics, Probability and Computing 25 (2016) 89–107.
[6] A. M. Frieze, On the value of a random minimum spanning tree problem, Discrete Applied Mathematics 10 (1985) 47–56.
[7] A. M. Frieze and C. J. H. McDiarmid, On random minimum length spanning trees, Combinatorica 9 (1989) 363–374.
[8] A. M. Frieze, M. Ruszinkó and L. Thoma, A note on random minimum length spanning trees, Electronic Journal of Combinatorics 7 (2000) R41.
