Dynamic Averaging Load Balancing on Arbitrary Graphs
Petra Berenbrink, Lukas Hintze, Hamed Hosseinpour, Dominik Kaaser, Malin Rau

TL;DR
This paper analyzes dynamic load balancing on arbitrary graphs using averaging over matchings, providing bounds on load discrepancy for both discrete and continuous loads, and introduces a novel drift analysis technique.
Contribution
It presents the first analysis of discrete and dynamic averaging load balancing on general graphs, employing a new drift technique linked to electrical network resistance.
Findings
Bounds discrepancy for various matching models
Applies to broad class of graphs
Introduces drift analysis method
| Graph | Section B.5 | Section C.1 | Section D.1 |
|---|---|---|---|
| d-regular graph (const. d) | | | |
| cycle | | | |
| 2-D torus | | | |
| torus of constant dimension | | | |
| hypercube | | | |

| Graph | Corollary C.2 |
|---|---|
| d-regular graph (const. d) | |
| cycle | |
| 2-D torus | |
| torus of constant dimension | |
| hypercube | |
Petra Berenbrink¹, Lukas Hintze¹, Hamed Hosseinpour¹, Dominik Kaaser², Malin Rau¹
¹ Universität Hamburg, Germany
² TU Hamburg, Germany
Abstract
In this paper we study dynamic averaging load balancing on general graphs. We consider infinite time and dynamic processes, where in every step new load items are assigned to randomly chosen nodes. A matching is chosen, and the load is averaged over the edges of that matching. We analyze the discrete case where load items are indivisible, moreover our results also carry over to the continuous case where load items can be split arbitrarily. For the choice of the matchings we consider three different models, random matchings of linear size, random matchings containing only single edges, and deterministic sequences of matchings covering the whole graph. We bound the discrepancy, which is defined as the difference between the maximum and the minimum load. Our results cover a broad range of graph classes and, to the best of our knowledge, our analysis is the first result for discrete and dynamic averaging load balancing processes. As our main technical contribution we develop a drift result that allows us to apply techniques based on the effective resistance in an electrical network to the setting of dynamic load balancing.
Petra Berenbrink, Hamed Hosseinpour, Malin Rau: Supported by DFG Research Group ADYN (FOR 2975) under grant DFG 411362735.
Petra Berenbrink, Hamed Hosseinpour: Supported by the DFG under grant 427756233.
1 Introduction
Parallel and distributed computing is ubiquitous in science, technology, and beyond. Key to the performance of a distributed system is the efficient utilization of resources: to obtain a substantial speed-up, all processors should handle the same amount of work. Unfortunately, many practical applications such as finite element simulations are highly “irregular”, and the amount of load generated on some processors is much larger than the amount of load generated on others. We therefore investigate load balancing to redistribute the load. Efficient load balancing schemes have a plenitude of applications, including high performance computing [45], cloud computing [39], numerical simulations [37], and finite element simulations [41].
In this paper we consider neighborhood load balancing on arbitrary graphs with nodes, where the nodes balance their load in each step only with their direct neighbors. We assume discrete load items as opposed to continuous (or idealized) load items which can be broken into arbitrarily small pieces. We study infinite and dynamic processes where new load items are generated in every step. We consider two different settings. In the synchronous setting load items are generated on randomly chosen nodes. Then a matching is chosen and the load of the nodes is balanced (via weighted averaging) over the edges of that matching. Here we further distinguish between two matching models. We consider the random matching model where linear-size matchings are randomly chosen, and the balancing circuit model where the graph is divided deterministically into many matchings. Here is the maximum degree of any node. In the asynchronous model exactly one load item is generated on a randomly chosen node. In turn, the node chooses one of its edges at random and balances its load with the corresponding neighbor. This model can be regarded as a variant of the synchronous model where the randomly chosen matching has size one. It was introduced by [4] where the authors show results for cycles assuming continuous load. Our goal is to bound the so-called discrepancy, which is defined as the maximal load of any node minus the minimal load of any node.
Results in a Nutshell
In this paper we present, for the three models introduced above, bounds on the expected discrepancy and bounds that hold with high probability. Our bounds for the synchronous model with balancing circuits hold for arbitrary graphs , the bounds for the asynchronous model and the synchronous model with random matchings hold for regular graphs only. For the asynchronous model and the model with random matchings our bounds on the discrepancy are expressed in terms of hitting times of a standard random walk on , as well as in terms of the spectral gap of the Laplacian of . For the synchronous model with balancing circuits we express our bounds in terms of the global divergence. This can be thought of as a measure of the convergence speed of the Markov chains modeling a random walk on . However, it does not directly measure the speed of convergence of the chain. It accounts for the time period in which the chain keeps a given distance from the stationary (and uniform) distribution. In physics terminology, it is a measure of total absement, which is the time-integral of displacement.
For all three infinite processes our bounds on the discrepancy hold at an arbitrary point of time as long as the system is initially empty. Otherwise, the bounds hold after an initial time period, its length is a function of the initial discrepancy. In the following we give some exemplary results assuming that the system is initially empty and . For the synchronous model with random matchings and the asynchronous model we can bound the discrepancy by for any regular graph . Our results show a polylogarithmic bound on the discrepancy for all regular graphs with a hitting time at most (e.g., the two-dimensional torus or the hypercube). In all models we can bound the discrepancy by for arbitrary constant-degree regular graphs. For the full results we refer the reader to Theorem 3.1, Theorem 4.1, and Theorem 5.1. We give a detailed overview on the results on specific graph classes in Table 1 in Section 7.
All bounds presented in this paper also hold for the corresponding continuous processes without rounding. The authors of [4] consider the asynchronous process on cycles in the continuous setting where the load items can be divided into arbitrarily small pieces. They bound the expected discrepancy and show that for a cycle with nodes. In contrast, we improve that bound for the cycle to . Note that our result not only bounds the expected discrepancy but also holds with high probability.
Our main analytical vehicle is a drift theorem that bounds the tail of the sum of a non-increasing sequence of random variables. Our drift theorem adapts known drift results from the literature, similarly to the Variable Drift Theorem in [31].
1.1 Related Work
There is a vast body of literature on iterative load balancing schemes on graphs where nodes are allowed to balance (or average) their load with neighbors only. One distinguishes between diffusion load balancing, where the nodes balance their load with all neighbors at the same time, and the matching (or dimension exchange) model, where the edges used for balancing form a matching. In the latter model every resource is involved in only one balancing action per step, which greatly facilitates the analysis.
In this overview we only consider theoretical results and, as it is beyond the scope of this work to provide a complete survey, we focus on results for discrete load balancing. For results about continuous load balancing see, for example, [18, 29]. There are also many results in the context of balancing schemes where not the resources try to balance their load but the tokens (acting as selfish players) try to find a resource with minimum load. See [22] for a comprehensive survey about selfish load balancing and [2, 27, 12] for some recent results. Another related topic is token distribution where nodes do not balance their entire load with neighbors but send only single tokens over to neighboring nodes with a smaller load. See [24, 7, 42] for the static setting and [6] for the dynamic setting.
Discrete Models
The authors of [40] give the first rigorous result for discrete load balancing in the diffusion model. They assume that the number of tokens sent along each edge is obtained by rounding down the amount of load that would be sent in the continuous case. Using this approach they establish that the discrepancy is at most after steps, where is the initial discrepancy. Similar results for the matching model are shown in [25]. While always rounding down may lead to quick stabilization, the discrepancy tends to be quite large, a function of the diameter of the graph. Therefore, the authors of [43] suggest to use randomized rounding in order to get a better approximation of the continuous case. They show results for a wide class of diffusion and matching load balancing protocols and introduce the so-called local divergence, which aggregates the sum of load differences over all edges in all rounds. The authors prove that the local divergence gives an upper bound on the maximum deviation between the continuous and discrete case of a protocol. In [23] the authors show several results for a randomized protocol with rounding in the matching model. For complete graphs their results show a discrepancy of after steps. Later, [8] extended some of these results to the diffusion model. In [44] the authors show that the number of rounds needed to reach constant discrepancy is w.h.p. bounded by a function of the spectral gap of the relevant mixing matrix and the initial discrepancy. In [9] the authors propose a very simple potential function technique to analyze discrete diffusion load balancing schemes, both for discrete and continuous settings. In [10] the authors investigate a load balancing process on complete graphs. In each round a pair of nodes is selected uniformly at random and completely balance their loads up to a rounding error of .
The authors of [15] study load balancing via matchings assuming random placement of the load items. The initial load distribution is sampled from exponentially concentrated distributions (including the uniform, binomial, geometric, and Poisson distributions). The authors show that in this setting the convergence time is smaller than in the worst case setting. Regardless of the graph’s topology, the discrepancy decreases by a factor of within synchronous rounds. Their approach of using concentration inequalities to bound the discrepancy (in terms of the squared -norm of the columns of the matrices underlying the mixing process) strongly influenced our approach.
Dynamic Models
There are far fewer results for the dynamic setting where new load enters the system over time. In [4] the authors study a model similar to our asynchronous model. In each step one load item is allocated to a chosen node. In the same step the chosen node picks a random neighbor, and the two nodes balance their loads by averaging them (continuous model). The authors show that the expected discrepancy is bounded by , as well as a lower bound on the square of the discrepancy of . The authors of [5] consider load balancing via matchings in a dynamic model where the load is, in every step, distributed by an adversary. They show that the system is stable for sufficiently limited adversaries. They also give some upper bounds on the maximum load for a somewhat more restricted adversary. The authors of [11] consider discrete dynamic diffusion load balancing on arbitrary graphs. In each step up to load items are generated on arbitrary nodes (the allocation is determined by an adversary). Then the nodes balance their load with each neighbor and finally one load item is deleted from every non-empty node. The authors show that the system is stable, which means that the total load remains bounded over time (as a function of alone and independently of the time ).
2 Balancing Models and Notation
We consider the following class of dynamic load balancing processes on -regular graphs with nodes . Each process is modeled by a Markov chain , where the load vector is the state of the process at the end of step , and is the load of node at time . We measure a load vector’s imbalance by the discrepancy , which is the difference between the maximum load and the minimum load .
We consider two balancing processes, the synchronous process SBal and the asynchronous process ABal. Both processes are parameterized by a balancing parameter determining the balancing speed and a matching distribution . For SBal, is a distribution over linear-sized matchings of . For ABal, is a distribution over edges of . SBal is additionally parameterized by the number of load items allocated in each round. ABal allocates only one new load item per step.
Synchronous Processes
The synchronous process works as follows. The process first allocates items to randomly chosen nodes. Then it uses the matching distribution to determine the matching which is applied. Finally it balances the load over the edges of the matching (see Process described below). The parameter controls the fraction of the load difference that is sent over an edge in a step.
For the synchronous process SBal we consider two families of matching distributions, random matchings () and balancing circuits (). is generated according to the following method described in [25]. First an edge set is formed by including each edge with probability , independently from all other edges. Then a linear-sized matching is computed locally. We will use capital for randomly chosen matchings. The analysis for the random matching model can be found in Section 3. In the balancing circuit model we assume is covered by fixed matchings . deterministically chooses matchings in periodic manner such that in step the matching is chosen. We will use small for deterministically chosen matchings. The analysis for the balancing circuit model can be found in Section 4.
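The two-stage construction of a random matching described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact procedure of [25]: the inclusion probability `p` is left as a free parameter, and the "local" stage is modeled by one simple rule (keep a sampled edge only if no other sampled edge shares an endpoint). The names `sample_random_matching` and `cycle_edges` are hypothetical helpers.

```python
import random

def sample_random_matching(edges, p, rng=random):
    """Sample a matching in two stages (illustrative local rule, an assumption)."""
    # Stage 1: include each edge independently with probability p.
    sampled = [e for e in edges if rng.random() < p]
    # Count how many sampled edges touch each node.
    touch = {}
    for u, v in sampled:
        touch[u] = touch.get(u, 0) + 1
        touch[v] = touch.get(v, 0) + 1
    # Stage 2: keep an edge only if it is the sole sampled edge at both endpoints.
    return [(u, v) for u, v in sampled if touch[u] == 1 and touch[v] == 1]

def cycle_edges(n):
    """Edge list of a cycle on nodes 0..n-1 (example graph, n >= 3)."""
    return [(i, (i + 1) % n) for i in range(n)]

if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_random_matching(cycle_edges(12), p=0.25, rng=rng))
```

Because every kept edge is the only sampled edge at both of its endpoints, the result is always a valid matching, and each node can decide whether its edge survives by looking only at its own neighborhood.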
Asynchronous Process
The asynchronous process works as follows. The process first uses to generate a matching, this time containing one edge only. The distribution we consider, , first chooses a node uniformly at random and then it chooses one of that node’s edges uniformly at random. Finally one new token is assigned to either node or and then the edge is used for balancing (see ). Note that for the load allocation heavily depends on the edges which are used for balancing. This makes the analysis for this model quite challenging. In contrast, in the load allocation and the balancing are independent. Note that in the case of -regular graphs is equivalent to the uniform distribution over all edges or to choosing a random matching of size one. We analyze the asynchronous model in Section 5.
: In each round :
Allocate discrete, unit-sized load items to the nodes uniformly and independently at random. Define as the number of tokens assigned to node .
Sample a matching according to .
Balance with applied to , .
: In each round :
Select an edge according to .
Allocate a single unit-size load item to either node or with a probability of .
I.e., with prob. set and for all , otherwise set and for all .
Balance with applied to , where includes just the edge .
: For each edge $\{i,j\}$ in the matching balance loads of $i$ and $j$:
Assume w.l.o.g. that $X_{i}(t)\geq X_{j}(t)$.
Let $p=\frac{\beta\cdot(X_{i}(t)-X_{j}(t))}{2}-\left\lfloor\frac{\beta\cdot(X_{i}(t)-X_{j}(t))}{2}\right\rfloor$.
Then node $i$ sends $L_{i,j}$ load items to node $j$, where
$$L_{i,j}\coloneqq\begin{cases}\left\lceil\frac{\beta\cdot(X_{i}(t)-X_{j}(t))}{2}\right\rceil,&\text{with probability }p,\\[5pt] \left\lfloor\frac{\beta\cdot(X_{i}(t)-X_{j}(t))}{2}\right\rfloor,&\text{with probability }1-p.\end{cases}$$
In the idealized setting, where the load is continuously divisible, a load of $\frac{\beta\cdot(X_{i}(t)-X_{j}(t))}{2}$ is sent from node $i$ to node $j$.
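The balancing operation with randomized rounding, and one step of the asynchronous process built on it, can be sketched as below. This is a hedged illustration: the function names `balance_edge` and `abal_step` are hypothetical, the probability 1/2 for placing the new item on either endpoint is an assumption, and the edge is drawn uniformly from an edge list, which matches the node-then-edge rule only on regular graphs.

```python
import math
import random
from fractions import Fraction

def balance_edge(load, i, j, beta, rng=random):
    """Balance the integer loads of nodes i and j with randomized rounding."""
    if load[i] < load[j]:
        i, j = j, i  # w.l.o.g. node i holds the larger load
    ideal = Fraction(beta) * (load[i] - load[j]) / 2  # idealized transfer
    low = math.floor(ideal)
    p = ideal - low  # fractional part = probability of rounding up
    sent = low + 1 if rng.random() < p else low
    load[i] -= sent
    load[j] += sent
    return sent

def abal_step(load, edges, beta, rng=random):
    """One asynchronous step: pick an edge, add one item, then balance."""
    i, j = rng.choice(edges)  # uniform edge (assumes a regular graph)
    # Assumption: the new item lands on either endpoint with probability 1/2.
    target = i if rng.random() < 0.5 else j
    load[target] += 1
    balance_edge(load, i, j, beta, rng)

if __name__ == "__main__":
    rng = random.Random(0)
    edges = [(k, (k + 1) % 8) for k in range(8)]  # cycle on 8 nodes
    load = [0] * 8
    for _ in range(1000):
        abal_step(load, edges, 1, rng)
    print(load, max(load) - min(load))
```

Using `Fraction` keeps the rounding probability exact, so the expected number of items sent equals the idealized transfer.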
2.1 Notation
We are given an arbitrary graph with nodes. We mainly assume that is regular and write for the node degree. Recall that the process is modeled by a Markov chain , where is the load vector at the end of step , and is the load of node at time . We write for the number of load items allocated to node in step and define . We will use upper case letters such as and to denote random variables and random matrices and lower case letters (like , ) for fixed outcomes. If clear from the context we will omit from a random variable.
We model the idealized balancing step in round by multiplication with a matrix given by
[TABLE]
We will omit the parameter if it is clear from context. With slight abuse of notation we use the same symbol for the matching itself and the associated balancing matrix and refer to both as just “matchings”. Furthermore, we write for their edges. For the product of all matching matrices from time to time we write
[TABLE]
where for we consider this to be the identity matrix. We generally refer to these matrices as mixing matrices. Moreover, we write for the sequence of matching matrices and analogously for a fixed sequence of matching matrices . We will write for the vector forming the th row of the matrix (which we often treat as a column vector despite it being a row).
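A small sketch of matching matrices and their products may help fix ideas. It rests on assumptions: the matrix entries are chosen to agree with the idealized transfer of $\beta/2$ times the load difference (diagonal $1-\beta/2$ for matched nodes, $\beta/2$ between partners, $1$ for unmatched nodes), and later matchings are multiplied from the left. All function names are hypothetical.

```python
from fractions import Fraction

def matching_matrix(n, matching, beta):
    """Balancing matrix of a matching (assumed form, see the text above)."""
    b = Fraction(beta) / 2
    m = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]
    for i, j in matching:
        m[i][i] = m[j][j] = 1 - b
        m[i][j] = m[j][i] = b
    return m

def mat_mul(a, b):
    """Plain matrix product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def mixing_matrix(n, matchings, beta):
    """Product of matching matrices, with `matchings` listed in time order."""
    prod = matching_matrix(n, [], beta)  # identity for an empty sequence
    for matching in matchings:
        prod = mat_mul(matching_matrix(n, matching, beta), prod)
    return prod

def is_doubly_stochastic(m):
    """Check that all row sums and all column sums equal one."""
    n = len(m)
    rows = all(sum(m[r]) == 1 for r in range(n))
    cols = all(sum(m[r][c] for r in range(n)) == 1 for c in range(n))
    return rows and cols

if __name__ == "__main__":
    # Two perfect matchings covering the 4-cycle, full averaging (beta = 1).
    m = mixing_matrix(4, [[(0, 1), (2, 3)], [(1, 2), (3, 0)]], 1)
    print(m, is_doubly_stochastic(m))
```

On the 4-cycle with full averaging, one period of the two matchings already yields the uniform matrix, which illustrates why products of matching matrices are referred to as mixing matrices.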
In the balancing circuit model we define the round matrix as the product of the matching matrices forming a complete period of the balancing circuit. Note that has no relation to the minimum or maximum degree, although we may assume w.l.o.g. that each edge is covered by at least one of the matchings. We write for the spectral gap of the round matrix , i.e., for the difference between the largest two eigenvalues of .
We write for the vector of additive rounding errors in round . Then is the difference between the load at node after step and the load at node after step in an idealized scheme where loads are arbitrarily divisible.
Putting all of this together we can express the load vector at the end of step as
[TABLE]
We write for the hitting time of , which is the maximum expected time it takes for a standard random walk on (i.e., the walk moves to a neighbor chosen uniformly at random in each step) to reach a given node from a given node , with the maximum taken over all such pairs of nodes. We write for the edge hitting time of , which is defined like the hitting time, except that the maximum is taken over adjacent nodes only. We write for the normalized Laplacian matrix of a graph . For regular graphs it may be defined as , where is the adjacency matrix of . Writing for the real eigenvalues of , we let be the spectral gap of the Laplacian of .
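For small graphs, the hitting time and the edge hitting time defined above can be computed exactly by solving the standard linear system for expected hitting times. The sketch below is only an illustration: the adjacency-list input format and the names `hitting_times_to` and `hitting_time_stats` are assumptions, and the graph is assumed to be connected.

```python
from fractions import Fraction

def hitting_times_to(adj, target):
    """Expected hitting times h[u] of `target` for a simple random walk."""
    nodes = [u for u in range(len(adj)) if u != target]
    idx = {u: k for k, u in enumerate(nodes)}
    m = len(nodes)
    # Linear system: h(u) - (1/deg u) * sum over neighbors w != target of h(w) = 1.
    a = [[Fraction(0)] * (m + 1) for _ in range(m)]
    for u in nodes:
        r = idx[u]
        a[r][r] += 1
        a[r][m] = Fraction(1)
        for w in adj[u]:
            if w != target:
                a[r][idx[w]] -= Fraction(1, len(adj[u]))
    # Gauss-Jordan elimination over the rationals.
    for c in range(m):
        piv = next(r for r in range(c, m) if a[r][c] != 0)
        a[c], a[piv] = a[piv], a[c]
        a[c] = [x / a[c][c] for x in a[c]]
        for r in range(m):
            if r != c and a[r][c] != 0:
                f = a[r][c]
                a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    h = {target: Fraction(0)}
    for u in nodes:
        h[u] = a[idx[u]][m]
    return h

def hitting_time_stats(adj):
    """Return (max hitting time over all pairs, max over adjacent pairs)."""
    t_hit = t_edge = Fraction(0)
    for v in range(len(adj)):
        h = hitting_times_to(adj, v)
        t_hit = max(t_hit, max(h.values()))
        t_edge = max(t_edge, max(h[u] for u in adj[v]))
    return t_hit, t_edge

if __name__ == "__main__":
    cycle6 = [[(u - 1) % 6, (u + 1) % 6] for u in range(6)]
    print(hitting_time_stats(cycle6))
```

On a cycle with $n$ nodes, the expected time to hit a node at distance $k$ is $k(n-k)$, so for $n=6$ the two quantities are 9 and 5.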
3 Random Matching Model
In this section we analyze the process for -regular graphs , where the matching distribution is generated by the algorithm given in [25]. Note that the result (as well as the results for the two other models) holds at any point of time if the system is initially empty. Furthermore, we can show the same results in the idealized setting where load items can be divided into arbitrarily small pieces (see [4]). For more details we refer the reader to the paragraph directly after Eq. 3.
Theorem 3.1.
Let $G$ be a $d$-regular graph and define $T(G)\coloneqq\min\Big\{\frac{t_{\mathrm{hit}}(G)}{n}\cdot\log(n),\sqrt{\frac{d}{\lambda(\mathbf{L}(G))}},\frac{1}{\lambda(\mathbf{L}(G))}\Big\}$. Let be the state of process at time with . There exists a constant such that for all it holds w.h.p. (footnote 1: the expression with high probability (w.h.p.) denotes a probability of at least ) and in expectation
[TABLE]
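To get a feeling for the quantity $T(G)$ in the theorem, the following sketch evaluates the minimum of its three terms for given graph parameters. It assumes the natural logarithm (the base only affects constants), and the function name `t_of_g` and its argument names are hypothetical.

```python
import math

def t_of_g(t_hit, n, d, spectral_gap):
    """Minimum of the three terms in the definition of T(G)."""
    return min(
        t_hit / n * math.log(n),       # hitting-time term
        math.sqrt(d / spectral_gap),   # square-root spectral term
        1 / spectral_gap,              # inverse spectral gap
    )

if __name__ == "__main__":
    # Example values for a 6-cycle: t_hit = 9, d = 2, and an assumed spectral
    # gap of 1 - cos(2*pi/6) = 0.5 for the normalized Laplacian.
    print(t_of_g(9, 6, 2, 0.5))
```

Which of the three terms is smallest depends on the graph class, which is why the bound is stated as a minimum.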
Proof.
We first expand the recurrence of Eq. 1 (cf. [43]). After one step we get
[TABLE]
We repeatedly expand this form up to the beginning of the process and get
[TABLE]
We write , , and for the three terms as indicated. Note that in general these terms are vectors of real numbers. The sum can be regarded as the contribution of an idealized process, where is the contribution of the initial load and is the contribution of the dynamically allocated load. Thus, is the deviation between the idealized process without rounding and the discrete process described in Section 2.
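As a worked illustration of the expansion, one way to write the recurrence and its unrolled form that is consistent with the process description in Section 2 (allocate, balance, round) is given below. The notation is an assumption chosen to match the prose: $\vec{\ell}(t)$ denotes the allocation vector of step $t$, $\vec{R}(t)$ the rounding error vector, and $\mathbf{M}^{[t_1,t_2]}=\mathbf{M}(t_2)\cdots\mathbf{M}(t_1)$ with the identity for $t_1>t_2$.

```latex
\begin{align*}
\vec{X}(t) &= \mathbf{M}(t)\cdot\bigl(\vec{X}(t-1)+\vec{\ell}(t)\bigr)+\vec{R}(t),\\
\vec{X}(t) &= \underbrace{\mathbf{M}^{[1,t]}\cdot\vec{X}(0)}_{\text{initial load}}
  +\underbrace{\sum_{\tau=1}^{t}\mathbf{M}^{[\tau,t]}\cdot\vec{\ell}(\tau)}_{\text{dynamically allocated load}}
  +\underbrace{\sum_{\tau=1}^{t}\mathbf{M}^{[\tau+1,t]}\cdot\vec{R}(\tau)}_{\text{rounding errors}}.
\end{align*}
```

For $t=1$ the second line reduces to the first, since $\mathbf{M}^{[1,1]}=\mathbf{M}(1)$ and $\mathbf{M}^{[2,1]}$ is the identity; the general case follows by induction on $t$.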
To bound the discrepancy of the load vector at time we use the fact that the discrepancy is sub-additive such that (see B.1 in Appendix B). Hence, to bound we individually bound the discrepancies of the three terms in Eq. 2 and get
[TABLE]
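The sub-additivity used here is elementary. As a short sketch, writing $\operatorname{disc}(\vec{x})=\max_i x_i-\min_i x_i$ (the symbol $\operatorname{disc}$ is notation assumed for this illustration), it follows from bounding the maximum and the minimum of a sum coordinate-wise:

```latex
\begin{align*}
\operatorname{disc}(\vec{x}+\vec{y})
  &= \max_i (x_i+y_i) - \min_i (x_i+y_i)\\
  &\le \bigl(\max_i x_i + \max_i y_i\bigr) - \bigl(\min_i x_i + \min_i y_i\bigr)
   = \operatorname{disc}(\vec{x}) + \operatorname{disc}(\vec{y}).
\end{align*}
```

Applying this twice gives the bound for a sum of three vectors.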
If the system is initially empty, then . Moreover, in the idealized setting without rounding . Techniques to bound the first term and the last term are well-established. We state the corresponding results in Lemma 3.2 and Lemma 3.3 directly below the proof of our theorem. The main part of the proof is to bound , which will be done in Section 3.1.
Let now . First, it follows from Lemma 3.2 that for all we have with probability at least . Second, it follows from Lemma 3.4 that with probability at least . Third, it follows from Lemma 3.3 that
[TABLE]
with probability at least . The statement of the theorem therefore follows from a union bound over the statements of Lemma 3.2, Lemma 3.3, and Lemma 3.4. The bound on the expectation follows analogously from the linearity of expectation and the bounds on the expected discrepancies in the aforementioned lemmas. ∎
Intuitively, Lemma 3.2 states that the contribution of the initial load to the discrepancy is insignificant if is large enough. We generalize the analysis of Theorem 1 in [43] (or Theorem 2.9 in [44]) to establish a bound on the discrepancy of the initial load as a function of . For the sake of completeness the proof of Lemma 3.2 is given in Section B.1.
Lemma 3.2 (Memorylessness Property).
Let $G$ be a $d$-regular graph. Let . Then there exists a constant such that for all and with $t\geq t_{0}(\gamma)\coloneqq c\cdot\max\{\gamma\log(n),\log(K\cdot n)\}\cdot\frac{1}{\lambda(\mathbf{L}(G))\cdot\beta}$ we get with probability at least and in expectation
[TABLE]
The next lemma bounds , the discrepancy contribution of cumulative rounding errors. Note that this result holds not just for the random matching model but for all three models that we consider in this paper. In the proof of the lemma we extend the results of Theorem 3.6 in [44] (which is based on work in [8]) to establish a bound as a function of . The proof is given in Section B.2.
Lemma 3.3 (Insignificance of Rounding Errors).
Let $G$ be an arbitrary graph. Then for all , , and we get with probability at least and in expectation
[TABLE]
To bound , the discrepancy contribution of dynamically allocated load items, we apply the next lemma. It is in fact the core of our work. We prove it in Section 3.1.
Lemma 3.4 (Contribution of Dynamically Allocated Load).
Let $G$ be a $d$-regular graph. Define $T(G)\coloneqq\min\{t_{\mathrm{hit}}(G)\cdot\log n/n,\sqrt{d/\lambda(\mathbf{L}(G))},1/\lambda(\mathbf{L}(G))\}$. Then for all and we get with probability at least and in expectation
[TABLE]
3.1 Bounding the Contribution of Dynamically Allocated Load
In this section we prove Lemma 3.4. Some of the proofs are omitted and can be found in Section B.3. As a first step, we bound using the global divergence , which is defined over a sequence of matching matrices as
[TABLE]
The global divergence can be regarded as a measure of the convergence speed of a random walk that uses the matching matrices as transition probabilities. In [23, 44, 8] the authors use a related notion which they call the local -divergence, also defined on a sequence of matchings . The difference lies in the fact that the global divergence, essentially, measures differences between nodes’ values and a global average, while the local divergence measures differences between neighboring nodes. To show Lemma 3.4 we first observe the following.
Observation 3.5.
It holds that .
Next we consider a fixed node and show a concentration inequality on in terms of , where is the sequence of matchings applied by our process (Lemma 3.6). Note that in the lemma we assume the matchings are fixed and the randomness is due to the random load placement only. Hence, the lemma directly applies to . Afterwards, we bound the global divergence of the random sequence of matchings, in terms of a notion of “goodness” of the used matching distribution , for the random sequence of matchings (Lemma 3.9), and then bound the “goodness” of the distribution used in the random matching model (Lemma 3.10). We start with a bound on the deviation of from the average load in terms of .
Lemma 3.6 (Load Concentration).
Let be an arbitrary sequence of matchings. Then for all , , and we get with probability at most
[TABLE]
Proof.
Our goal is to decompose into a sum of independent random variables. Recall that we assume that the matching matrices are fixed and all randomness is due to the random choices of the load items. This will enable us to apply a concentration inequality to this sum. For the decomposition observe that where is the random load vector corresponding to the load items allocated at time . So the th coordinate of is We define the indicator random variable for and as
[TABLE]
Note that for fixed and we have , $\mathbb{P}[B(\tau,j,w)=1]=1/n$ and . Observe that , the load allocated to node at step , can be expressed as . Merging this with the value of gives
[TABLE]
For a fixed and we define . This random variable measures the contribution of the -th load item of round to . Note that the load items are allocated independently from each other. Since are fixed matrices, and are independent for all and and . To apply the concentration inequality from Theorem A.14 we need to show that and compute an upper bound on . Showing the first condition is easy since exactly one of the indicator random variables is one and has a value between zero and one.
It remains to consider the variance of . First note that by linearity of expectation
[TABLE]
where the last equality follows from the fact that is doubly stochastic. Now we get
[TABLE]
where we used that for each and each exactly one of the is one and all others are zero, and each of the possible cases has uniform probability.
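The variance computation can be illustrated for a single load item as follows. The notation is assumed for this sketch: $a_w$ denotes the entry of the relevant mixing matrix in the row of the fixed node and column $w$, and $Y$ is the contribution of one item placed on a uniformly random node, so that $Y=a_w$ with probability $1/n$ for each $w$. Since the row sums to one, $\mathbb{E}[Y]=1/n$, and

```latex
\begin{align*}
\operatorname{Var}[Y]
  &= \mathbb{E}[Y^2]-\mathbb{E}[Y]^2
   = \frac{1}{n}\sum_{w=1}^{n} a_w^2 - \frac{1}{n^2}\\
  &= \frac{1}{n}\sum_{w=1}^{n}\Bigl(a_w-\frac{1}{n}\Bigr)^{2},
\end{align*}
```

where the last step expands the square and uses $\sum_w a_w = 1$. The variance of one item is therefore $1/n$ times the squared Euclidean distance of the row from the uniform vector.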
Recall that and are independent for all and . Hence we get
[TABLE]
where the final equality uses the definition of the global divergence . Applying Theorem A.14 with and with results in
[TABLE]
The lower bound can be established using Theorem A.15 (with and ) instead of Theorem A.14. Via a union bound we get
[TABLE]
To bound the global divergence of the matching sequence used by the process we use two potential functions. The quadratic node potential is given by
[TABLE]
For a set of edges on the nodes and a vector , the quadratic edge potential is
[TABLE]
We may also write whenever is a graph, and whenever is a matching matrix. The following observation relates the drop of node potential to the edge potential in terms of .
Observation 3.7.
Let be a matching matrix with parameter . Then for any we have .
We now define a notion of a matching distribution being good. In Lemma 3.9 below we show that the notion is sufficient for showing that matching sequences generated from such distributions have bounded global divergence. Note that the “goodness” of a distribution does not depend on but on graph properties and the random choices with which the matchings are chosen. Hence, we assume .
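The relation between the drop of the node potential and the edge potential can be checked numerically. The sketch below assumes the standard forms of both potentials (sum of squared deviations from the average, and sum of squared differences over edges) and the idealized balancing step; under these assumptions elementary algebra gives a drop of $\beta(1-\beta/2)$ times the edge potential of the matching. The constant is derived here for the assumed definitions rather than quoted from the observation, and all function names are hypothetical.

```python
from fractions import Fraction

def node_potential(x):
    """Sum of squared deviations from the average (assumed form)."""
    avg = Fraction(sum(x), len(x))
    return sum((xi - avg) ** 2 for xi in x)

def edge_potential(x, edges):
    """Sum of squared differences over the given edges (assumed form)."""
    return sum((x[i] - x[j]) ** 2 for i, j in edges)

def apply_matching(x, matching, beta):
    """Idealized balancing: move beta/2 of each difference across matched edges."""
    y = list(x)
    for i, j in matching:
        flow = Fraction(beta) * (x[i] - x[j]) / 2
        y[i] -= flow
        y[j] += flow
    return y

def potential_drop_identity_holds(x, matching, beta):
    """Check drop == beta * (1 - beta/2) * edge potential of the matching."""
    b = Fraction(beta)
    drop = node_potential(x) - node_potential(apply_matching(x, matching, beta))
    return drop == b * (1 - b / 2) * edge_potential(x, matching)

if __name__ == "__main__":
    x = [5, 1, 7, 3]
    print(potential_drop_identity_holds(x, [(0, 1), (2, 3)], 1))
```

For the example vector with full averaging, the node potential falls from 20 to 4 while the edge potential of the matching is 32, matching the factor of one half.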
Definition 3.8.
Assume is an arbitrary -regular graph. Let be an increasing function and let . Then a matching distribution is -good if the following conditions hold for and all stochastic vectors .
1.
2. $\operatorname{Var}[\Phi(\mathbf{M}^{1}\cdot\vec{x})]\leq(\sigma^{2}-1)\cdot\left(\Phi(\vec{x})-\mathbb{E}[\Phi(\mathbf{M}^{1}\cdot\vec{x})]\right)^{2}$.
It remains to show two results. First, assuming a matching distribution is -good, the global divergence of a matching sequence generated by that distribution can be bounded in terms of and (Lemma 3.9). Second, we have to calculate a function and the values of for which the matching distribution is -good (see Lemma 3.10).
Lemma 3.9 (Global Divergence).
Assume is an arbitrary graph. Let be an increasing function, , and . Let be an i.i.d. sequence of matching matrices generated by and assume is a -good matching distribution. Then for all and we get with probability at least
[TABLE]
Lemma 3.10.
Assume is an arbitrary -regular graph. Let
[TABLE]
Then is -good.
Proof.
First, note that the function is increasing in . Applying the first part of Lemma 3.11 (see below) we get that for any vector it holds that
[TABLE]
From the first two statements of Lemma 3.12 (stated below) we see that for and all stochastic vectors
[TABLE]
Hence,
[TABLE]
and as a consequence, by the definition of .
It remains to check the second condition of Definition 3.8 with our claimed value . Inserting its value as stated in the lemma, the condition requires that
[TABLE]
which is given in the second part of Lemma 3.11 (see below). ∎
In Lemma 3.11 we first relate the drop of to the quadratic edge potential . In the second part we bound the variance of the potential drop as a function of the edge hitting time.
Lemma 3.11** (label=prop:node_potential_change_statistics,restate=restateLemNodePotentialChangeStatistics).**
Let be a -regular graph, let , and let , then
1. $\Phi(\vec{x})-\mathbb{E}\left[\Phi(\mathbf{M}^{1}\cdot\vec{x})\right]\geq\frac{1}{16d}\cdot\Psi_{G}(\vec{x}).$
2. $\operatorname{Var}\left[\Phi(\mathbf{M}^{1}\cdot\vec{x})\right]\leq\left(32\cdot(t^{*}_{\mathrm{hit}}(G)/n)+4\right)\cdot\left(\Phi(\vec{x})-\mathbb{E}\left[\Phi(\mathbf{M}^{1}\cdot\vec{x})\right]\right)^{2}.$
In Lemma 3.12 we relate the size of the quadratic edge potential to the second-largest eigenvalue of , the effective resistance of , and the node potential. To state it, we need some additional definitions. For any two nodes and of the graph is the effective resistance (or resistive distance) between and in (for a detailed definition see Section A.1). Furthermore, we write for the resistive diameter of , i.e., the largest resistive distance between any pair of nodes in , and write for the maximum effective resistance between any pair of nodes adjacent in . I.e., and . The first part of the following lemma was previously shown in [25, 44].
Lemma 3.12**.**
Let , and let be a connected -regular graph.
1. .
2. If is stochastic, then $\Psi_{G}(\vec{x})\geq\max\left\{\frac{1}{\mathrm{Res}(G)}\cdot\Phi(\vec{x})^{2},\frac{4}{27}\cdot\Phi(\vec{x})^{3}\right\}$.
3.
Proof of Lemma 3.4
Proof.
Define $g_{G}(x)=\frac{1}{16d}\cdot\max\left\{d\cdot\lambda(\mathbf{L}(G))\cdot x,\;x^{2}/\mathrm{Res}(G),\;4x^{3}/27\right\}$ and let . Then by Lemma 3.10 the matching distribution is -good. By Lemma 3.9 we have for all ,
[TABLE]
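The function $g_G$ defined at the start of this proof is the maximum of a linear, a quadratic, and a cubic term, and which of them dominates depends on the size of the potential. The following minimal sketch evaluates it and reports the dominant term. The function name `g` and the argument names `d`, `lam` (standing for $\lambda(\mathbf{L}(G))$) and `res` (standing for $\mathrm{Res}(G)$) are illustrative assumptions, and the numbers in the usage lines are arbitrary rather than taken from a specific graph.

```python
def g(x, d, lam, res):
    """Evaluate g_G(x) = (1 / (16 d)) * max(d*lam*x, x^2 / res, 4 x^3 / 27).

    Returns the value together with the name of the dominating term.
    All parameter names are illustrative, not the paper's notation.
    """
    terms = {
        "linear": d * lam * x,          # spectral-gap term
        "quadratic": x * x / res,       # effective-resistance term
        "cubic": 4.0 * x ** 3 / 27.0,   # graph-independent term
    }
    dominant = max(terms, key=terms.get)
    return terms[dominant] / (16.0 * d), dominant


# Arbitrary example parameters: small potentials are governed by the
# linear term, large potentials by the cubic term.
print(g(1.0, d=2, lam=0.5, res=4.0))
print(g(27.0, d=1, lam=0.1, res=100.0))
```

Each of the three terms is increasing in `x` for non-negative arguments, so their maximum is increasing as well, which is the property required of the function in the definition of a good matching distribution.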
To bound we use the following two claims (see Section B.4 for the proof).
Claim 3.13*.*
It holds that .
Claim 3.14*.*
For any -regular graph it holds that .
From Claims 3.13 and 3.14 together we get that with probability at least
[TABLE]
Since (Proposition 10.16 in [32]), , and ,
[TABLE]
Now Lemma 3.6 states that for any fixed sequence of matching matrices , with probability at least it holds that
[TABLE]
Applying a union bound over all , Eq. 4 and Eq. 5 hold for all with probability at least . Hence, for all
[TABLE]
The high-probability bound now follows from 3.5. The corresponding bound on follows readily; see Lemma A.7 in Section A.2 for the details. ∎
4 Balancing Circuit Model
Here we assume . Recall that we assume is covered by fixed matchings . The matching distribution then deterministically chooses the matching in step . The round matrix is defined as and the mixing matrices are fixed in this model. Thus, for a sequence of matchings the global divergence is $\Upsilon(\mathbf{m}^{[t]})\coloneqq\max_{k\in[n]}\sqrt{\sum_{\tau=1}^{t}\left\lVert\mathbf{m}^{[\tau,t]}_{k,\cdot}-1/n\right\rVert_{2}^{2}}$. The next theorem provides an upper bound on the discrepancy for this model. Note that the following theorem holds for arbitrary graphs, while Theorem 3.1 only holds for -regular graphs.
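Since the matchings are fixed in this model, the global divergence is a deterministic quantity that can be computed directly from its definition. The sketch below does this in plain Python under two assumptions that are not spelled out in this section: a matching matrix has entries $1/2$ at the four positions belonging to a matched pair and $1$ on the diagonal for unmatched nodes, and $\mathbf{m}^{[\tau,t]}$ is the product of the matching matrices of steps $\tau,\dots,t$ with later steps multiplied from the left. The function names are hypothetical.

```python
import math


def matching_matrix(n, edges):
    """Averaging matrix of a matching: matched pairs share their load equally."""
    m = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for u, v in edges:
        m[u][u] = m[v][v] = m[u][v] = m[v][u] = 0.5
    return m


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def global_divergence(matchings):
    """Upsilon for the fixed sequence matchings[0..t-1] (steps 1..t)."""
    n = len(matchings[0])
    prod = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    sums = [0.0] * n
    # Build m^{[tau,t]} for tau = t, t-1, ..., 1 by multiplying the
    # earlier matching matrix from the right (assumed convention).
    for m in reversed(matchings):
        prod = matmul(prod, m)
        for k in range(n):
            sums[k] += sum((prod[k][j] - 1.0 / n) ** 2 for j in range(n))
    return math.sqrt(max(sums))


# A cycle on 4 nodes covered by two perfect matchings, applied alternately.
even = matching_matrix(4, [(0, 1), (2, 3)])
odd = matching_matrix(4, [(1, 2), (3, 0)])
print(global_divergence([even, odd, even, odd]))
```

On this small example the product of the two matchings is already the uniform matrix, so only the last step contributes to the sum. On graphs that mix more slowly, more terms of the sum remain non-zero.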
Theorem 4.1**.**
Let be an arbitrary graph and be the state of process at time with . For all with $t\geq\frac{\zeta}{\lambda(\mathbf{R})}\cdot\ln(K\cdot n)$ it holds w.h.p. and in expectation
[TABLE]
Proof.
The proof follows the same lines as the proof of Theorem 3.1, which is proved via LABEL:lem:initial:load:vanishes, Lemma 3.4, and LABEL:lem:rounding:errors:are:small bounding , and , respectively. LABEL:lem:initial:load:vanishes is replaced by Lemma 4.2 below. LABEL:lem:initial:load:vanishes can also be applied to the balancing circuit model since it only requires that the subgraph used for balancing is a matching.
It remains to replace LABEL:lem:rounding:errors:are:small. Since the matching matrices are fixed this time, the proof is much simpler. The proof of Lemma 3.6 carries over to this model, giving us a bound on for with probability at least . Applying the union bound over all nodes , together with 3.5 (stating that ), gives a bound on which holds with probability at least . ∎
Lemma 4.2** (Memorylessness Property).**
For all with $t\geq\frac{\zeta}{\lambda(\mathbf{R})}\cdot\ln(K\cdot n)$ it holds that .
Proof.
Since it follows from Lemma 2 in [26] that
[TABLE]
Setting $t\geq(\zeta/\lambda(\mathbf{R}))\cdot\ln(Kn)$ gives $\Phi\left(\mathbf{m}^{[1,t]}\cdot\vec{x}\right)\leq 1$, which implies that . ∎
Note that a similar statement was shown in [43, 44, 8].
The next theorem provides a lower bound on the discrepancy for this model. The proof can be found in Appendix C.
Theorem 4.3**.**
Let be an arbitrary graph and be the state of process at time . Then for all and it holds with constant probability
[TABLE]
5 Asynchronous Model
The following is our main theorem for the asynchronous model. The bounds provided by Theorem 5.1 for the asynchronous model differ from those in Theorem 3.1 for the random matching model in two details. First, the lower bound on the balancing time is larger by a factor of . This is due to the fact that the asynchronous model balances across just one edge per round, in contrast to edges in the random matching model. Second, the upper bound on is much simpler. Note, however, that setting in Theorem 3.1 and further simplifying the result by using (see also Claim 3.14 in the proof of Lemma 3.4) results in the same asymptotic bound as in Theorem 5.1.
Theorem 5.1**.**
Let be a -regular graph and define $T(G)\coloneqq\min\left\{\frac{t_{\mathrm{hit}}(G)}{n}\cdot\log(n),\sqrt{\frac{d}{\lambda(\mathbf{L}(G))}},\frac{1}{\lambda(\mathbf{L}(G))}\right\}$. Let be the state of process at time with . There exists a constant such that for all it holds w.h.p. and in expectation
[TABLE]
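The quantity $T(G)$ in the theorem is the smallest of three graph parameters, so the resulting bound automatically adapts to whichever of them is most favourable for the given graph. As a minimal sketch, the helper below evaluates the definition for given values of $t_{\mathrm{hit}}(G)$, $n$, $d$ and $\lambda(\mathbf{L}(G))$. The function and argument names are illustrative assumptions, the logarithm is assumed to be the natural one, and the example values are arbitrary.

```python
import math


def t_of_g(t_hit, n, d, lam):
    """T(G) = min{ (t_hit / n) * log n, sqrt(d / lam), 1 / lam }.

    t_hit stands for t_hit(G) and lam for lambda(L(G)); both are
    supplied by the caller and not computed from a graph here.
    """
    return min(t_hit / n * math.log(n), math.sqrt(d / lam), 1.0 / lam)


# Arbitrary example values: here the square-root term is the smallest.
print(t_of_g(t_hit=100.0, n=10, d=2, lam=0.25))
```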
Proof Sketch of Theorem 5.1.
The proof of the theorem follows along the same lines as the proof of Theorem 3.1. However, there are some major differences. Most importantly, the proof of Lemma 3.6 (giving a concentration bound on in terms of the global divergence of the sequence of matching matrices) cannot be applied to ABal. The proof heavily relies on the fact that the load allocation and the matching edges are chosen independently from each other, which is certainly not the case for ABal. Our new lemma (Lemma D.1 in Appendix D) carefully analyses the dependency, and it uses a stronger concentration inequality. In addition, we also have to re-calculate the function and to show that the matching distribution used by is -good (see Lemma D.2 in Appendix D). ∎
6 Drift Result
In our analysis we use the following tail bound for the sum of a non-increasing sequence of random variables with variable negative drift. The proof uses established methods from drift analysis. In particular, it relies on techniques found in the proof of the Variable Drift Theorem in [31]. The full technical proof can be found in Appendix E.
Theorem 6.1**.**
Let be a non-increasing sequence of discrete random variables with for all with fixed . Assume there exists an increasing function and a constant such that the following holds. For all and all with
2. $\operatorname{Var}[X(t+1)\mid X(t)=x]\leq\sigma\cdot\left(\mathbb{E}[X(t+1)\mid X(t)=x]-x\right)^{2}.$
Then the following statements hold.
1. For all and any arbitrary but fixed
[TABLE]
2. For all and we define $t_{0}\coloneqq\frac{2(\sigma+1)}{\delta^{2}}\left(-\log(p)+\log\left(\frac{2(\sigma+1)}{\delta^{2}}\right)\right)$. Then
[TABLE]
7 Conclusions and Open Problems
In this paper we analyze discrete load balancing processes on graphs. As our main contribution we bound the discrepancy that arises in dynamic load balancing in three models, the random matching model, the balancing circuit model, and the asynchronous model. Our results for the random matching model and the asynchronous model hold for -regular graphs, while our analysis for the balancing circuit model applies to arbitrary graphs.
To the best of our knowledge our results constitute the first bounds for discrete, dynamic balancing processes on graphs. Furthermore, our results improve the work by Alistarh et al. [4] who prove that the expected discrepancy is bounded by in the (arguably simpler) continuous asynchronous process . We improve their bound to and additionally show that it holds with high probability. We conjecture that our results are tight up to polylogarithmic factors. However, showing tight upper and lower bounds remains an open problem.
Results for Specific Graph Classes
We show an overview of our bounds on the discrepancy for specific graph classes in Table 1. The corresponding results are formally derived in Section B.5 for the random matching model, Section C.1 for the balancing circuit model, and Section D.1 for the asynchronous model.
Open Problems
We are confident that our results carry over to arbitrary graphs (as opposed to regular graphs), provided that there exists a lower bound on the probability with which an edge is used for balancing. However, to show bounds on the discrepancy one has to overcome fundamental problems such as the bias introduced by high-degree nodes. Another interesting open question is whether the results carry over to a model where the amount of load that may be transmitted over an edge in each step is bounded by a constant. If only a single load item can be transferred per edge and step, the problem is similar to the token distribution problem (see, for example, [7]).
Finally, we believe that one can also adapt our analysis to a variant of a graphical balls-into-bins process. The process works as follows. In each step an edge is sampled uniformly at random. W.l.o.g. assume that the load of is smaller than the load of by an additive term . Then a biased coin is tossed showing heads with probability and tails otherwise, where is a suitably chosen and non-constant parameter. If the coin shows heads one item is allocated to and otherwise to . A formal analysis of this allocation process (as well as of other, related balls-into-bins processes) is beyond the scope of our paper and remains an open problem.
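The allocation process described in the last paragraph can be simulated in a few lines. The following is only a sketch under explicit assumptions: the heads probability is passed in as a caller-supplied function `heads_probability` of the load difference (the concrete bias is not reproduced here, so any choice is hypothetical), and heads is assumed to allocate the item to the lighter endpoint of the sampled edge. All names are illustrative.

```python
import random


def graphical_allocation(edges, n, steps, heads_probability, rng=None):
    """Simulate the edge-based allocation process for a number of steps.

    heads_probability(delta) is a hypothetical, caller-supplied bias:
    the probability of heads when the endpoint loads differ by delta.
    """
    rng = rng or random.Random(0)
    load = [0] * n
    for _ in range(steps):
        u, v = rng.choice(edges)          # edge sampled uniformly at random
        if load[u] > load[v]:
            u, v = v, u                   # now load[u] <= load[v]
        delta = load[v] - load[u]
        if rng.random() < heads_probability(delta):
            load[u] += 1                  # heads: lighter endpoint (assumption)
        else:
            load[v] += 1                  # tails: heavier endpoint
    return load


# Cycle on 8 nodes with an arbitrary, purely illustrative bias function.
cycle = [(i, (i + 1) % 8) for i in range(8)]
loads = graphical_allocation(cycle, 8, 1000,
                             lambda delta: min(1.0, 0.5 + 0.1 * delta))
print(max(loads) - min(loads))
```

Such a simulation can only suggest how the discrepancy behaves for a given bias. It does not replace the formal analysis, which is left open above.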
Appendix A Auxiliary Results
A.1 Random Walks, Hitting Times, and Effective Resistance
In this appendix we present for completeness fundamental definitions and relations concerning random walks, hitting times, and the effective resistance. We start with a definition of the effective resistance of a network in Definition A.1. For a motivation of the definition see [32, Chapter 9]. Further details and properties can also be found in [19] and [34, Section 4].
Definition A.1** (Harmonic Functions and Effective Resistance).**
Let be a graph and let be nodes of the graph. Then a harmonic function on with the poles and (for unit edge weights) is a function such that for all we have , where is the set of ’s neighbors in .
Given a harmonic function on with the poles and (with arbitrary boundary values ), the effective resistance (or resistive distance) between and in is given by
[TABLE]
Note that the value is not dependent on the boundary values of the harmonic function.
Note that for boundary values and the harmonic function is unique [32, Proposition 9.1].
The following is a well-known property of effective resistances; it is a direct consequence of, e.g., Corollary 9.13 in [32].
Lemma A.2**.**
Let be a graph, and write for the (standard) distance between and in . Then .
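Effective resistances of small graphs can be computed exactly, which is convenient for checking statements such as Lemma A.2 on examples. The sketch below uses the standard electrical formulation, which we assume to be equivalent to the definition via harmonic functions above: one node is grounded, a unit current is injected at the other, and the resulting potential at the injection node equals the effective resistance. It works with unit edge weights and exact rational arithmetic. The function name is hypothetical.

```python
from fractions import Fraction


def effective_resistance(n, edges, u, v):
    """Effective resistance between u and v in a connected graph (unit weights)."""
    if u == v:
        return Fraction(0)
    # Reduced Laplacian: drop the row and column of the grounded node v.
    idx = [w for w in range(n) if w != v]
    pos = {w: i for i, w in enumerate(idx)}
    k = n - 1
    a = [[Fraction(0)] * (k + 1) for _ in range(k)]   # augmented matrix
    for x, y in edges:
        for s, t in ((x, y), (y, x)):
            if s != v:
                a[pos[s]][pos[s]] += 1
                if t != v:
                    a[pos[s]][pos[t]] -= 1
    a[pos[u]][k] = Fraction(1)                        # unit current at u
    # Gauss-Jordan elimination; for a connected graph the reduced
    # Laplacian is non-singular, so a non-zero pivot always exists.
    for c in range(k):
        p = next(r for r in range(c, k) if a[r][c] != 0)
        a[c], a[p] = a[p], a[c]
        piv = a[c][c]
        a[c] = [e / piv for e in a[c]]
        for r in range(k):
            if r != c and a[r][c] != 0:
                f = a[r][c]
                a[r] = [er - f * ec for er, ec in zip(a[r], a[c])]
    return a[pos[u]][k]                               # potential at u


cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(effective_resistance(4, cycle4, 0, 1))   # adjacent nodes, distance 1
print(effective_resistance(4, cycle4, 0, 2))   # opposite nodes, distance 2
```

On the cycle with four nodes, both computed values are at most the corresponding graph distance, as Lemma A.2 requires. On a path the two quantities coincide.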
For a graph , and nodes , let be the hitting time from to , i.e., the expected time for a random walk on starting at to reach for the first time.
Theorem A.3** (Theorem 4.1 (i) in [34]).**
Let be a graph. Then for any ,
[TABLE]
Corollary A.4**.**
Let be a graph. Then for any ,
[TABLE]
Proof.
For the first inequality, since one of and is at least the maximum of the two, we have, by Theorem A.3:
[TABLE]
And for the second inequality, since both and are at most the maximum of the two, we have, again by Theorem A.3
[TABLE]
as claimed. ∎
Theorem A.5** (Dirichlet’s principle, see Exercise 2.13 in [35]; or Exercise 9.9 in [32], referencing Theorem 6.1 in [33]).**
Let be distinct nodes of a graph . Then
[TABLE]
Theorem A.6** (Corollary 3.3 in [34], applied to -regular graphs).**
Let be an arbitrary graph on nodes. Then
[TABLE]
A.2 Tail Bounds
The following lemma allows us to turn a high-probability bound into a bound on the expected value. We consider this result folklore. For completeness we give a formal proof below.
Lemma A.7**.**
Let be a non-negative real random variable, and let . Then if there are such that for all ,
[TABLE]
then
[TABLE]
Proof.
Observe that when we have , so that for all we have
[TABLE]
Thus,
[TABLE]
as claimed. ∎
Theorem A.8** (Bhatia-Davis inequality [14]).**
Let be a real random variable with . Then
Theorem A.9** (Azuma–Hoeffding inequality, Theorem 13.6 in [38]).**
Let be a martingale associated with the filter , where there exist non-negative sequences , and such that for all ,
[TABLE]
Then for all ,
[TABLE]
Theorem A.10** (Adapted from Theorem 6.6 in [17]).**
Let be a martingale associated with the filter , where there exist and such that for all ,
1. ;
2. .
Then for all ,
[TABLE]
Theorem A.11** (Adapted from Theorem 2.1 and combined with Remark 2.1 and Equation 18 in [21]).**
Let be a supermartingale associated with the filter , where for all . Let be the quadratic characteristic of , i.e., let
[TABLE]
Then, for any and ,
[TABLE]
Corollary A.12**.**
Let be a martingale associated with the filter , where for all . Then with as in Theorem A.11, for any and ,
[TABLE]
Proof.
As is a martingale, it is also a supermartingale, and it fulfills the conditions of Theorem A.11 by the assumptions of the claim. So we may use Theorem A.11 to see that
[TABLE]
As this implies that
[TABLE]
The claim follows from applying the same argument to the supermartingale and a union bound. ∎
Theorem A.13** (Berry-Esseen Theorem [13, 20] for Non-identical Random Variables).**
Let be independently distributed with , and . If is the distribution of and is the standard normal distribution, then
[TABLE]
where $\psi_{0}=\frac{\sum_{i=1}^{k}\rho_{i}}{\left(\sum_{i=1}^{k}\sigma_{i}^{2}\right)^{3/2}}$ and is a constant.
Theorem A.14** (Theorem 3.4 of [17], [36]).**
Let () be independent random variables satisfying , for . We consider the sum with expectation and variance . Then we have
[TABLE]
Theorem A.15** (Theorem 4.1 of [17]).**
Let denote independent random variables satisfying for . For we have
[TABLE]
Appendix B Omitted Proofs from Section 3
In this appendix we present the omitted proofs from Section 3. We first formally prove that the discrepancy is sub-additive.
Observation B.1**.**
For two vectors ,
[TABLE]
Proof.
For any ,
[TABLE]
and thus
[TABLE]
as claimed. ∎
B.1 Proof of LABEL:lem:initial:load:vanishes
Proof.
To bound , we use the following claim:
Claim*.*
If , then and if , then
First, note that by definition of . Hence, . By the claim, if , then with probability at least , and hence . Also by the claim, if , then , and then by Jensen’s inequality,
[TABLE]
Proof of the claim.
We aim to use the first statement of Theorem 6.1 on and therefore need to check its preconditions. By the definition of , for all ,
[TABLE]
Entirely analogous to the calculations in the proof of Lemma 3.9 (Eqs. 9 and 10), we have, writing (so that ),
[TABLE]
and from the latter it immediately follows that for all
[TABLE]
Combining the first statement of Lemma 3.12 and the first statement of Lemma 3.11 gives us, for all ,
[TABLE]
so that, for all ,
[TABLE]
By the second statement of Lemma 3.11, for all :
[TABLE]
And so,
[TABLE]
So we can now apply Theorem 6.1 with
[TABLE]
With these values and , the first statement of Theorem 6.1 gives us
[TABLE]
The integral evaluates to
[TABLE]
This is at least if and only if
[TABLE]
which follows after rearranging the initial inequality and exponentiation. So
[TABLE]
Now, let . Then in particular, , so that . Furthermore, it is the case that (by Theorem A.6) and that .
Therefore, there is a sufficiently large constant such that if , then
[TABLE]
as well as
[TABLE]
From , it follows that
[TABLE]
From , it follows that
[TABLE]
And so, for , Eq. 7 entails
[TABLE]
which is the remaining claim for the high-probability statement.
For the remaining claim (i.e., the statement concerning the expectation), note that for the calculations above and Eq. 7 entail that
[TABLE]
Hence, as for all , we have, for all ,
[TABLE]
as claimed. ∎
This concludes the proof of the lemma. ∎
B.2 Proof of LABEL:lem:rounding:errors:are:small
The proof is similar to the proof of [44, Theorem 3.4].
Proof.
We show the concentration bound on by proving concentration bounds on the absolute values for each and then applying a union bound over all . We show that the concentration bound on holds for any fixed sequence of matchings ; this implies a concentration bound for a random sequence of matchings by the law of total probability.
So we fix . Recall that
[TABLE]
where is the vector of additive rounding errors incurred in round : it is the difference between the load vector after step , and what the load vector would be after step if the balancing in this step were idealized. This additive rounding error stems from the constraint that only whole items can be transferred across the edges of the matching at time . From the description of the protocol, it is immediate that the rounding errors at matched nodes sum to [math], so that for all edges matched in round . Thus,
[TABLE]
We will derive the claimed tail bound on by applying the Azuma-Hoeffding inequality (Theorem A.9) to a sequence of partial sums as follows. We sequence the rounding actions with increasing and arbitrarily within rounds. If is the representative node of the th edge in round (with and ), for let us write
[TABLE]
and let if there are fewer than edges in the matching in round . The sequence of partial sums is then , which we consider with respect to the filtration in which completely determines the state right before the rounding action corresponding to the term . Note that . To apply Theorem A.9, it is enough to show that the conditional expectation of the difference between successive terms is zero, and that we can bound the differences between terms.
To check these preconditions, let us write for the fractional value of the load at node before the rounding action (i.e., the fractional value of the load if balancing were idealized and no rounding was necessary). Then the load will be rounded up with probability , resulting in a positive rounding error of , or rounded down with probability , resulting in a negative rounding error of . Hence,
[TABLE]
so that, as required,
[TABLE]
From this description, it is also clear that writing , the term is bounded from above by , and from below by , so that .
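The martingale argument rests on the fact that a single rounding action has expected error zero. The following sketch illustrates this for one rounding action, assuming the standard choice in which a fractional value is rounded up with probability equal to its fractional part. The function names are hypothetical, and the exact expectation is computed with rational arithmetic.

```python
import math
import random
from fractions import Fraction


def round_randomly(value, rng):
    """Round up with probability equal to the fractional part of value."""
    low = math.floor(value)
    frac = value - low
    return low + 1 if rng.random() < frac else low


def expected_rounding_error(value):
    """Exact expected additive error of round_randomly(value, ...)."""
    low = math.floor(value)
    frac = value - low
    up_error = (low + 1) - value     # equals 1 - frac, taken with prob. frac
    down_error = low - value         # equals -frac, taken with prob. 1 - frac
    return frac * up_error + (1 - frac) * down_error


rng = random.Random(0)
print(round_randomly(Fraction(7, 2), rng))
print(expected_rounding_error(Fraction(10, 3)))
```

The two possible errors differ by exactly one, which is why each term of the martingale difference sequence is confined to an interval whose length is governed only by the corresponding matrix entry.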
So we may apply Theorem A.9; to use it we require (an upper bound on) the value of the sum , which we bound by applying LABEL:obs:node_potential_change_exact and collapsing the ensuing telescoping sum (analogously to the proof of Theorem 3.2 in [44]):
[TABLE]
where follows from the fact that and therefore, . So by Theorem A.9 (with and ) we have
[TABLE]
Since , applying a union bound over all nodes we see that
[TABLE]
which is the claimed concentration bound.
To show the bound on , we apply Lemma A.7 with , and to see that,
[TABLE]
B.3 Omitted Proofs from Section 3.1
\restateObsPotentialRelation
Proof.
We assume w.l.o.g. that the entries of sum to [math], meaning that , so that . As loads only change at matched nodes, let us investigate the potential change at two matched nodes and , where w.l.o.g. . The amount of load transferred from to under idealized balancing (without rounding) is . So with
[TABLE]
the loads before balancing are and , and the loads after idealized balancing are and . So the change of the potential contributions at and is
[TABLE]
where we used . Now,
[TABLE]
Summing this over all edges in the matching gives, as claimed,
[TABLE]
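The computation above can be checked numerically on a small example. The sketch assumes that the node potential is the quadratic potential $\Phi(\vec{x})=\sum_i (x_i-\bar{x})^2$ with $\bar{x}$ the average load, which is not restated in this appendix, and it verifies that idealized balancing across a matching lowers this potential by $\tfrac{1}{2}\sum (x_u-x_v)^2$ over the matched edges. All function names are illustrative.

```python
from fractions import Fraction


def potential(x):
    """Quadratic node potential: sum of squared deviations from the mean."""
    mean = Fraction(sum(x), len(x))
    return sum((xi - mean) ** 2 for xi in x)


def balance(x, matching):
    """Idealized balancing: both endpoints of a matched edge get the average."""
    y = list(x)
    for u, v in matching:
        y[u] = y[v] = Fraction(x[u] + x[v], 2)
    return y


def predicted_drop(x, matching):
    """Half the sum of squared load differences over the matched edges."""
    return sum(Fraction((x[u] - x[v]) ** 2, 2) for u, v in matching)


x = [5, 1, 0, 2]
matching = [(0, 1), (2, 3)]
y = balance(x, matching)
print(potential(x) - potential(y), predicted_drop(x, matching))
```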
Lemma 3.9** (Global Divergence, restated).**
Proof.
First recall that
[TABLE]
As the mixing matrices are doubly stochastic, each row is a stochastic vector . By definition of the node potential we know
[TABLE]
and hence
[TABLE]
To bound this sum we will apply the second statement of Theorem 6.1 to the sequence of values for . Since the matching matrices are symmetric we get
[TABLE]
By LABEL:obs:node_potential_change_exact with defined as the edges of we get
[TABLE]
This shows that for all . Expressing Eq. 8 with Balancing Parameter and, for the ease of presentation, setting gives us
[TABLE]
Since for we get
[TABLE]
As is -good, for any stochastic vector we have Combining this with Eq. 9 gives
[TABLE]
And thus,
[TABLE]
Similarly, as is -good, for any stochastic vector we have
$\operatorname{Var}[\Phi(\mathbf{M}^{1}\cdot\vec{v})]\leq(\sigma^{2}-1)\cdot\left(\Phi(\vec{v})-\mathbb{E}[\Phi(\mathbf{M}^{1}\cdot\vec{v})]\right)^{2}$. Combining this with Eq. 10 gives us
[TABLE]
and thus
[TABLE]
We apply the second statement of Theorem 6.1 with , , and , which is an increasing function as is increasing by the definition of -good, and get
[TABLE]
where . From this it follows that with probability at least
[TABLE]
where follows from the fact that for the -th row of any stochastic matrix . The lemma follows by applying the definition of . ∎
Lemma 3.11** (restated).**
Proof.
By LABEL:obs:node_potential_change_exact, we have
[TABLE]
Rearranging this lower bound into
[TABLE]
and expanding the definition of we have by linearity of expectation
[TABLE]
where the inequality used that, for and all edges , it holds that [25, Lemma 2]. This finishes the proof of the first statement.
For the second statement observe that by LABEL:obs:node_potential_change_exact we have
[TABLE]
Then, as is constant for a given ,
[TABLE]
Recall that the matching distribution is obtained as follows. First, generate a random edge set as follows. For each , with probability , independently of all other edges. Then, some edges of are deleted to create a proper matching, resulting in . Hence
[TABLE]
and
[TABLE]
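The two-stage construction of the random matching recalled above (independent edge sampling followed by the deletion of conflicting edges) can be sketched as follows. The sampling probability `p` is a free parameter here, since the value used in the analysis is not reproduced in this sketch. The deletion rule, which drops every sampled edge that shares an endpoint with another sampled edge, is an assumption about how a proper matching is obtained. The function name is hypothetical.

```python
import random


def random_matching(edges, p, rng):
    """Sample each edge independently with probability p, then keep only
    the sampled edges that do not touch any other sampled edge."""
    selected = [e for e in edges if rng.random() < p]
    hits = {}
    for u, v in selected:
        hits[u] = hits.get(u, 0) + 1
        hits[v] = hits.get(v, 0) + 1
    # An edge survives only if both endpoints were hit exactly once.
    return [(u, v) for u, v in selected if hits[u] == 1 and hits[v] == 1]


rng = random.Random(1)
cycle = [(i, (i + 1) % 8) for i in range(8)]
print(random_matching(cycle, 0.3, rng))
```

Under this deletion rule an edge survives exactly when it is sampled and none of its adjacent edges is, so the survival probability of an edge is determined by `p` and by the number of edges adjacent to it.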
Observe that can be expressed as with . Thus,
[TABLE]
By using Lemma 3.12(3) and then Lemma B.2(1) we get that
[TABLE]
Hence,
[TABLE]
Applying the first statement of this lemma we get
[TABLE]
Putting everything together the second statement follows from
[TABLE]
Lemma 3.12** (restated).**
Proof.
First note that for all , , and ,
[TABLE]
The proof of the first part is similar to that of Theorem 2.6 in [44]. First, see that
[TABLE]
As by Eq. 15, we may assume w.l.o.g. that by subtracting from every coordinate of . For such a vector we have , and
[TABLE]
where the final equality is due to the min-max theorem and the fact that the smallest eigenvalue of is [math], with its associated eigenvector being .
For the second part, let be two distinct nodes of the graph with . Then
[TABLE]
where the first equality uses Eq. 15, the central inequality holds because the argument of is a vector with and , and the final equality is by Dirichlet’s principle (Theorem A.5). Note that the bound also holds when .
Given Eq. 16, we now show that is larger than the first, resp. second, term inside the maximum of the second part’s statement. For the first term, we choose and such that , and recall that for all . Then, Eq. 16 states that and it remains to bound from below by . To that end, as the vector is stochastic by assumption, the sum over all its entries is 1, and there is at least one with . Hence, , and so
[TABLE]
as needed to complete the bound for the first term.
For the second term, we choose and such that , with the distance between and being minimal. As , each of the entries of for the non-terminal nodes on a shortest path between and is at least . As is stochastic by assumption, the sum of all loads is at most , and we have
[TABLE]
which implies Since is bounded by the standard distance between and (see Lemma A.2), and we thus have, by Eq. 16,
[TABLE]
where the final inequality uses as shown above.
For the third statement we first rearrange Eq. 16 to see that, for all ,
[TABLE]
Taking the maximum over all on both sides gives us
[TABLE]
as claimed, where the final equality is by definition of . ∎
The following lemma is well known; we state it for completeness. It relates the hitting time of a graph to its resistive diameter and the edge hitting time of to the .
Lemma B.2**.**
For any graph
1. , and
2. .
Proof.
Recall that
[TABLE]
and that
[TABLE]
For the first inequality, let be adjacent nodes for which . Then, by Corollary A.4,
[TABLE]
which becomes the first inequality after dividing by 2 on both sides. For the second inequality, let be adjacent nodes for which . Then, again by Corollary A.4,
[TABLE]
The second statement is entirely analogous, except that the are no longer required to be adjacent, and that they are chosen such that for the first inequality, or, for the second inequality, that . ∎
B.4 Omitted Details from the Proof of Lemma 3.4
Proof of 3.13.
First, expanding the definition of , pulling out constant factors, and simplifying fractions results in
[TABLE]
and we write , , and for the first, second, and third argument of the minimum. For , the indefinite integrals of these functions are
[TABLE]
First, we show that : As , we bound the integral in question as
[TABLE]
Next, we show that : Let be the such that . If , then
[TABLE]
But if , the same bound also holds: we showed above that the integral in question is bounded by , so that if , we have an upper bound of
[TABLE]
Last, we show that : Let be the such that . If , then
[TABLE]
where the penultimate bound uses the fact that (Lemma B.2), and the final bound uses the fact that the inverse spectral gap of the normalized Laplacian is bounded from above by (cf. [3]), and that , so that the argument of the logarithm is polynomial in .
Otherwise, if , the same bound also holds: we showed above that the integral is bounded by , so that if we have an upper bound of
[TABLE]
Combining the three bounds, we have, as claimed,
[TABLE]
Proof of 3.14.
By the first inequality of Corollary 3.3 in [34] it holds for any nodes that
[TABLE]
As is regular we have and , and since the statement holds in particular for any pair of nodes that is adjacent this entails
[TABLE]
and the claim follows. ∎
B.5 Bounds for Specific Graph Classes
In this appendix we show bounds on the discrepancy for specific graph classes. Note that we assume that initially the system is empty.
Corollary B.3**.**
Let be the state of process where . For an arbitrary it holds w.h.p. and in expectation
- * for any regular graph.*
- * for cycles and constant-degree regular graphs.*
- * for the two-dimensional torus graphs.*
- * for torus graphs with dimensions, the hypercube, and all -regular graphs with .*
To show the above corollary we require bounds on (Lemma B.4) and bounds on (Lemma B.6). Then the corollary immediately follows from Theorem 3.1.
In the following lemma we provide some bounds on for several specific graph classes.
Lemma B.4**.**
Assume is a graph with nodes.
- For constant-degree regular graphs we have .
- For a two-dimensional toroidal mesh we have .
- For a -dimensional toroidal mesh (with ) we have .
- For a -dimensional hypercube we have .
- For a -regular graph with we have .
- For an arbitrary -regular graph we have .
Proof.
Recall that $T(G)=\min\left\{1/\lambda(\mathbf{L}(G)),\sqrt{d/\lambda(\mathbf{L}(G))},(t_{\mathrm{hit}}(G)/n)\cdot\log(n)\right\}$, and that (Lemma B.2), so that .
For -regular graphs with being constant, by [30], where is the diameter of . As and is constant, , so that .
For the two-dimensional toroidal mesh, and by [16, Theorem 6.1], so that .
For a -dimensional toroidal mesh with , as well as the -dimensional hypercube, and by [16, Theorem 6.1], so that .
For a -regular graph with , by [16, Theorem 3.3], so that .
For general -regular graphs , by [32, Proposition 10.16], so that . ∎
To bound for many specific graph classes we use the following.
Theorem B.5** (Theorem 2.10 of [34], citing [28]).**
Let be a graph and be one of its nodes. Then if is chosen uniformly at random from the neighbors of in , , where is the degree of in .
This gives us the following bounds.
Lemma B.6**.**
Assume is a graph with nodes.
- For being a toroidal mesh (including cycles and hypercubes), or being a -regular graph with , we have
- For an arbitrary -regular graph we have .
Proof.
Recall that . Toroidal meshes are symmetric or arc-transitive graphs: for every two ordered pairs of adjacent nodes and there is a graph automorphism such that and . Hence, for every such two ordered pairs, , and thus for any pair of adjacent nodes . So applying Theorem B.5 shows that . As for -regular graphs, as claimed.
For dense graphs we bound as (see Lemma B.2). As by [16, Theorem 3.3], we get since that .
For arbitrary -regular graphs, by the first statement of Lemma B.2. As for a -regular graph, and as (by definition of and Lemma A.2), we thus have . ∎
Appendix C Balancing Circuit Model
In this appendix we prove Theorem 4.3. The proof is similar to Theorem 1.2 in [15].
Proof of Theorem 4.3.
First we show a lower bound on . The idea is to decompose into a sum of independent random variables which have expected value zero. It then remains to show that $\sum_{\ell}\mathbb{E}\left[\left|Y_{\ell}^{3}\right|\right]$ is properly bounded. This allows us to apply a concentration inequality to the sum. To do so, we define several intermediate random variables similar to the proof of Lemma 3.6.
Fix round and consider node such that . Recall that,
[TABLE]
We define indicator random variables for , and as follows.
[TABLE]
Note that for fixed and , and $\mathbb{P}\left[B(\tau,j,w)=1\right]=1/n$. Recall that can be expressed as . It then follows that
[TABLE]
We define the deviation from the average for as
[TABLE]
It immediately follows that We call
[TABLE]
the contribution of the -th load item (of step ) to . For a fixed and , from the linearity of expectation, it follows that
[TABLE]
where the last inequality follows since is a doubly stochastic matrix.
Here for such that and we define and it follows . Note that ’s are independent. We want to apply the Berry-Esseen Theorem [13, 20] (see Theorem A.13 in Section A.2). To do so, we need to compute and . Then we get
[TABLE]
where in the second last equality we used the fact that for each and each exactly one of the is one and all others are zero, and that each of the possible cases has uniform probability. Similarly we have
[TABLE]
where follows from the law of total expectation, from the fact that for any , .
Recall that . By defining as the distribution of , from Theorem A.13 it follows that,
[TABLE]
in which the last inequality follows from the assumption, , and is some constant. Note that is the standard normal distribution. Therefore it holds that,
[TABLE]
where the last inequality follows from [1, Formula 7.1.13], which states
[TABLE]
Hence with we have
[TABLE]
Therefore by replacing the definition of we get that
[TABLE]
Recall that $\widetilde{D}_{k}(t)=D_{k}(t)-\mathbb{E}\left[D_{k}(t)\right]$; it then follows that
[TABLE]
Moreover, when node receives more than its expectation from the allocated load items, there is (at least) one node receiving less than its expectation. Hence,
[TABLE]
Since , then . From LABEL:lem:rounding:errors:are:small it follows that with probability . Since and , then it follows
[TABLE]
∎
Theorem 4.3 states that, for a sequence of matchings and as long as , the load deviation of node from its expectation at round , normalized by its standard deviation, follows a standard normal distribution.
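The proof above invokes Formula 7.1.13 of [1]. As a hedged illustration, the sketch below checks numerically the form in which that formula is commonly stated, namely $\frac{1}{x+\sqrt{x^{2}+2}} < e^{x^{2}}\int_{x}^{\infty}e^{-s^{2}}\,ds \le \frac{1}{x+\sqrt{x^{2}+4/\pi}}$ for $x \ge 0$, and shows how its left-hand side yields a lower bound on the upper tail of a standard normal variable. The function names are hypothetical and the substitution $x = z/\sqrt{2}$ is our own; the exact way the proof applies the formula is not reproduced here.

```python
import math


def scaled_tail_bounds(x):
    """Return (lower, value, upper) for e^{x^2} * integral_x^inf e^{-s^2} ds, x >= 0."""
    # The integral equals (sqrt(pi) / 2) * erfc(x).
    value = math.exp(x * x) * (math.sqrt(math.pi) / 2.0) * math.erfc(x)
    lower = 1.0 / (x + math.sqrt(x * x + 2.0))
    upper = 1.0 / (x + math.sqrt(x * x + 4.0 / math.pi))
    return lower, value, upper


def normal_upper_tail(z):
    """P[Z >= z] for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def normal_tail_lower_bound(z):
    """Lower bound on P[Z >= z], z >= 0, from the left inequality with x = z / sqrt(2)."""
    x = z / math.sqrt(2.0)
    return math.exp(-x * x) / (math.sqrt(math.pi) * (x + math.sqrt(x * x + 2.0)))


for z in (0.0, 1.0, 2.0, 3.0):
    print(z, normal_tail_lower_bound(z), normal_upper_tail(z))
```

For each printed row the second column should not exceed the third, which is the direction needed when the normal tail has to be bounded from below.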
C.1 Bounds for Specific Graph Classes
In the following we derive some bounds on the discrepancy for specific graph classes. Note that we assume that initially the system is empty. The first corollary gives upper bounds and the second one lower bounds. Corollary C.1 and Corollary C.2 are summarized in Table 1 (in Section 7) and Table 2 (below), respectively.
Corollary C.1.
Let be the state of process at time with and assume has nodes. For an arbitrary it holds w.h.p. and in expectation
- for arbitrary graphs with round matrix .
- for cycle and regular graphs with constant .
- for the two-dimensional torus or hypercube graphs.
- for constant three or more-dimensional torus.
Proof.
The bounds follow from a straightforward combination of the upper bounds on the global divergence from Lemma C.3 with Theorem 4.1. ∎
Corollary C.2.
Let be the state of process at time with . It holds with constant probability that
- $\mathrm{disc}(\vec{X}(t))=\Omega\left(\sqrt{m}\right)$ for cycle, constant -regular graphs, and .
- $\mathrm{disc}(\vec{X}(t))=\Omega\left(\sqrt{\frac{m}{n}\cdot\log(n)}\right)$ for two-dimensional torus, , and .
- $\mathrm{disc}(\vec{X}(t))=\Omega\left(\sqrt{\frac{m}{n}}\right)$ for constant -dimensional torus, hypercube graphs, , and .
Proof.
The bounds follow from a straightforward combination of the bounds on the global divergence from Lemma C.3 with Theorem 4.3. ∎
The two corollaries above show that our bounds are almost tight for cycle graphs, constant -regular graphs, -dimensional torus graphs with constant and hypercube graphs. For instance, consider a cycle with the Odd-Even scheme and assume . Corollary C.1 states that the discrepancy is, w.h.p., while Corollary C.2 implies that, with constant probability, the discrepancy is .
We now compute the global divergence for the following concrete graphs and circuits. For cycles of even length, we consider the “Odd-Even” scheme in which the first matching consists of all edges for any odd , and the second matching consists of all edges for any even . More generally, for the -dimensional torus with node set , the balancing circuit consists of matchings in total, two matchings for each dimension , analogously to the cycle. For the hypercube, the canonical choice is the dimension exchange circuit consisting of matchings, where nodes and are matched in if and only if their binary representations differ in bit only (see, e.g., [15]).
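The circuits described above can be sketched in code. The following is a minimal illustration, not the authors' implementation: it builds the Odd-Even matchings on an even cycle and the dimension exchange matchings on a hypercube, and runs a continuous version of the dynamic process in which items are allocated to uniformly random nodes before each matching is applied. Nodes are indexed from 0 (so the parity convention may differ from the paper's), and the names `odd_even_cycle`, `dimension_exchange`, `run` and the parameter `items_per_step` are assumptions made for this sketch. The torus circuit is omitted.

```python
import random


def odd_even_cycle(n):
    """Two alternating matchings on the cycle 0..n-1 (n even, n >= 4)."""
    if n % 2 != 0 or n < 4:
        raise ValueError("the Odd-Even scheme needs an even cycle length >= 4")
    first = [(i, (i + 1) % n) for i in range(0, n, 2)]
    second = [(i, (i + 1) % n) for i in range(1, n, 2)]
    return [first, second]


def dimension_exchange(d):
    """d matchings on the hypercube with 2**d nodes; matching i pairs nodes differing in bit i only."""
    n = 1 << d
    return [[(u, u ^ (1 << i)) for u in range(n) if not u & (1 << i)]
            for i in range(d)]


def balance(load, matching):
    """Continuous averaging: both endpoints of each matching edge get the mean of their loads."""
    for u, v in matching:
        load[u] = load[v] = (load[u] + load[v]) / 2.0


def run(circuit, n, steps, items_per_step, rng):
    """Allocate items to uniformly random nodes, then balance over the next matching of the circuit."""
    load = [0.0] * n
    for t in range(steps):
        for _ in range(items_per_step):
            load[rng.randrange(n)] += 1.0
        balance(load, circuit[t % len(circuit)])
    return load


def discrepancy(load):
    """Difference between the maximum and the minimum load."""
    return max(load) - min(load)


rng = random.Random(0)
print(discrepancy(run(odd_even_cycle(16), 16, 500, 16, rng)))
print(discrepancy(run(dimension_exchange(4), 16, 500, 16, rng)))
```

The sketch uses divisible load for simplicity; the discrete case would additionally need a rounding rule, which is not modeled here.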
Recall that and . The next lemma is about the global divergence of some specific graphs for the distribution .
Lemma C.3 (Global Divergence).
Let be a graph and consider constructed by the Odd-Even scheme such that it produces the round matrix .
1. For each it holds $(\Upsilon(\mathbf{M}^{[t]}))^{2}=\mathrm{O}\left(\zeta/\lambda(\mathbf{R})\right)$.
2. For a constant and each it holds $(\Upsilon(\mathbf{M}^{[t]}))^{2}=\mathrm{O}\left(n\right)$. It also holds for any , .
3. For two-dimensional torus and for each it holds $(\Upsilon(\mathbf{M}^{[t]}))^{2}=\mathrm{O}\left(\log(n)\right)$. It also holds for any , .
4. For constant -dimensional torus and each it holds $(\Upsilon(\mathbf{M}^{[t]}))^{2}=\mathrm{O}\left(r\right)$. It also holds for any , .
5. For hypercube graphs and each it holds $(\Upsilon(\mathbf{M}^{[t]}))^{2}=\mathrm{O}\left(\log(n)\right)$. It also holds for any , .
Proof.
Recall that the sequence of matching matrices has global divergence , if
[TABLE]
Since the matchings are fixed we have $\left(\Upsilon(\mathbf{m}^{[t]})\right)^{2}=\max_{w\in[n]}\sum_{\tau=1}^{t}\left\|\mathbf{m}^{[\tau,t]}_{w,\cdot}-\frac{\vec{1}}{n}\right\|_{2}^{2}$. Consider a node such that . We have seen that
[TABLE]
Since is non-increasing in and , then
[TABLE]
Hence, to bound $\left(\Upsilon(\mathbf{m}^{[t]})\right)^{2}$, it is enough to bound .
General case:
Here we get,
[TABLE]
where follows from [26, Lemma 2]. Note that .
Cycles:
Recall that in cycle . It holds that
[TABLE]
where and follows [15], from [26, Lemma 2]. To see , consider that the spectral gap of the round matrix corresponding to a cycle is [43]. Moreover, for with some constant , it follows from [15] that
[TABLE]
for .
Two-dimensional torus:
Note that in -dimensional torus graphs , and the spectral gap of the round matrix corresponding to a -dimensional torus is [43]. Hence,
[TABLE]
where and follow from [15], from [26, Lemma 2]. Moreover, for with some constant , it follows from [15] that
[TABLE]
for .
Constant three or more-dimensional torus:
Let us assume for some then
[TABLE]
where follows from [15].
Hypercubes:
Similarly, it holds that
[TABLE]
where follows from [15]. Recall that in hypercube .
The lower bound of is trivial. ∎
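For small instances the quantity bounded in Lemma C.3 can be evaluated directly from the formula $\left(\Upsilon(\mathbf{m}^{[t]})\right)^{2}=\max_{w\in[n]}\sum_{\tau=1}^{t}\|\mathbf{m}^{[\tau,t]}_{w,\cdot}-\frac{\vec{1}}{n}\|_{2}^{2}$. The sketch below does this for a periodic circuit. It assumes that $\mathbf{m}^{[\tau,t]}$ denotes the product $\mathbf{m}^{(t)}\cdots\mathbf{m}^{(\tau)}$ of the averaging matrices of rounds $\tau$ to $t$ (loads as column vectors); this ordering, the 0-based node indexing, and all function names are assumptions of the sketch.

```python
def matching_matrix(n, matching):
    """Doubly stochastic averaging matrix of a matching on n nodes."""
    m = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for u, v in matching:
        m[u][u] = m[v][v] = 0.5
        m[u][v] = m[v][u] = 0.5
    return m


def matmul(a, b):
    """Plain product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def global_divergence_sq(circuit, n, t):
    """max_w sum_{tau=1..t} ||(m^(t) ... m^(tau))_{w,.} - 1/n||_2^2 for a periodic circuit."""
    mats = [matching_matrix(n, m) for m in circuit]
    prod = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    totals = [0.0] * n
    for tau in range(t, 0, -1):
        # Round tau uses matching (tau - 1) mod period; extend the product on the right.
        prod = matmul(prod, mats[(tau - 1) % len(mats)])
        for w in range(n):
            totals[w] += sum((prod[w][j] - 1.0 / n) ** 2 for j in range(n))
    return max(totals)


def odd_even_cycle(n):
    """Two alternating matchings on an even cycle."""
    return [[(i, (i + 1) % n) for i in range(0, n, 2)],
            [(i, (i + 1) % n) for i in range(1, n, 2)]]


def dimension_exchange(d):
    """d matchings on the hypercube with 2**d nodes."""
    n = 1 << d
    return [[(u, u ^ (1 << i)) for u in range(n) if not u & (1 << i)]
            for i in range(d)]


print(global_divergence_sq(odd_even_cycle(8), 8, 40))
print(global_divergence_sq(dimension_exchange(3), 8, 6))
```

On the hypercube one full pass of dimension exchange averages perfectly, so only the last incomplete pass contributes to the sum; on the cycle the sum keeps collecting contributions from many earlier rounds, which is in line with the larger bound for cycles.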
Appendix D Asynchronous Model
The following is the equivalent of Lemma 3.6 for the process ABal:
Lemma D.1.
Let be a regular graph, and let . Then in , for all , , and for such that , we have
[TABLE]
Proof.
Let be the vector of allocated loads in round and recall that we have
[TABLE]
Using , we can express the th coordinate of as
[TABLE]
is the contribution of the load item allocated in round to . Note that in the second factorization of the , the two factors are independent as they concern disjoint rounds.
Now consider the sequence of partial sums with respect to the natural filtration on the sequence of edges . In particular, we have
[TABLE]
and determines all edges used in rounds up to round . To apply the martingale tail inequality Corollary A.12 to , we need to check that and that .
For the first condition, note that both and are stochastic vectors (for the latter, this is because exactly one load item is allocated in each round in the asynchronous model). Thus, their inner product has a value in the interval so that , as required.
For the second condition, note that
[TABLE]
so that it is enough to show that the expected value of the is when conditioned on the matching choices in rounds to . The bound given by Corollary A.12 also involves the quantity
[TABLE]
so we will investigate more thoroughly than would be required to compute only its conditional expectation.
To this end, let us first make the dependence between and more explicit. Let be the random orientation of the random edge selected in round , so that the load item in round is allocated to , and then the load is balanced across the edge . Then
[TABLE]
Using this, we may see that
[TABLE]
Now is the uniform distribution over the edges of , and the node to which load is allocated is a uniformly random endpoint of the chosen edge. Thus, is distributed uniformly over the oriented edges . Since is -regular, there are such oriented edges. Hence, for all ,
[TABLE]
By an entirely analogous calculation, holds as well. So and are identically distributed (but not necessarily independent). Because of this, the two sums over on the right-hand side of Eq. 17 are also identically distributed.
We can now compute the conditional expectation of . Using Eq. 17 and linearity of expectation we see that
[TABLE]
So as required for applying Corollary A.12.
So all preconditions of Corollary A.12 hold. Applying it with and yields
[TABLE]
We will now show that , which finishes the proof after noting that then,
[TABLE]
with the last inequality using the condition on in the statement.
So to bound , recall that
[TABLE]
with the latter equality using the fact that the expected value of conditioned on is [math]. And since and is a constant,
[TABLE]
By Eq. 17, and as for two identically distributed random variables and , and , we have :
[TABLE]
And hence we may bound from above using the global divergence:
[TABLE]
which is all that remained to be shown. ∎
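The asynchronous process analyzed in this proof can be simulated in a few lines. The sketch below is a minimal, continuous-load illustration under stated assumptions: in each round one edge is drawn uniformly at random, a random orientation decides which endpoint receives the new load item, and the load is then averaged across that edge. The cycle is used as an example of a regular graph; the function names and the use of divisible load are assumptions of the sketch.

```python
import random


def cycle_edges(n):
    """Edges of the cycle on nodes 0..n-1 (a 2-regular graph)."""
    return [(i, (i + 1) % n) for i in range(n)]


def abal_step(load, edges, rng):
    """One asynchronous round: pick an edge, allocate one item to a random endpoint, average."""
    u, v = rng.choice(edges)
    if rng.random() < 0.5:
        u, v = v, u  # random orientation: the item goes to u
    load[u] += 1.0
    load[u] = load[v] = (load[u] + load[v]) / 2.0


def simulate_abal(edges, n, rounds, seed=0):
    """Run the asynchronous process from the empty system for the given number of rounds."""
    rng = random.Random(seed)
    load = [0.0] * n
    for _ in range(rounds):
        abal_step(load, edges, rng)
    return load


final = simulate_abal(cycle_edges(16), 16, 2000)
print(max(final) - min(final))
```

With divisible load the orientation does not change the outcome of a round, since the item is averaged across the chosen edge either way; it is kept here only to mirror the description in the proof.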
The next result is the analogue of Lemma 3.10:
Lemma D.2.
Assume is an arbitrary -regular graph. Then is -good, where
[TABLE]
The proof of Lemma D.2 is analogous to that of Lemma 3.10, except that we use Lemma D.3 stated below instead of LABEL:prop:node_potential_change_statistics.
Lemma D.3.
Let be a -regular graph, let , and let . Then
1. $\Phi(\vec{x})-\mathbb{E}\left[\Phi(\mathbf{M}^{1}\cdot\vec{x})\right]=\frac{1}{dn}\cdot\Psi_{G}(\vec{x})$.
2. $\mathrm{Var}\left[\Phi(\mathbf{M}^{1}\cdot\vec{x})\right]\leq\left(2\cdot t^{*}_{\mathrm{hit}}(G)-1\right)\cdot\left(\Phi(\vec{x})-\mathbb{E}\left[\Phi(\mathbf{M}^{1}\cdot\vec{x})\right]\right)^{2}$.
Proof.
For the first statement, we use LABEL:obs:node_potential_change_exact as well as the fact that is the uniform distribution over the edges of to see that, as claimed.
[TABLE]
For the second statement we first observe that is constant and by LABEL:obs:node_potential_change_exact we have
[TABLE]
We bound this variance using the Bhatia-Davis inequality (see Theorem A.8 in Section A.2). It states that, for a random variable taking values in , and with , it is the case that Now from the definition of , it is immediate that . For the upper bound on , recall that the matchings consist of just one edge, and so . The latter is bounded from above by the third statement of LABEL:lem:edge_potential_bounds, yielding
[TABLE]
And so, by the Bhatia-Davis inequality (Theorem A.8),
[TABLE]
where the last inequality used the fact that by LABEL:claim:hitting_time_resistance_relation. ∎
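The first statement of Lemma D.3 can be checked exactly on a small example. The sketch below assumes that $\Phi(\vec{x})=\sum_{i}(x_{i}-\bar{x})^{2}$, that $\Psi_{G}(\vec{x})=\sum_{\{u,v\}\in E}(x_{u}-x_{v})^{2}$, and that balancing is continuous averaging across one uniformly random edge; these definitions are not restated in this appendix and are assumptions here, as are the function names. It also checks the Bhatia-Davis inequality in its standard form $\mathrm{Var}[X]\leq(b-\mu)(\mu-a)$ for a random variable with values in $[a,b]$ and mean $\mu$.

```python
from fractions import Fraction


def potential(x):
    """Assumed node potential: sum of squared deviations from the average load."""
    avg = Fraction(sum(x), len(x))
    return sum((Fraction(v) - avg) ** 2 for v in x)


def edge_potential(x, edges):
    """Assumed edge potential: sum of squared load differences over all edges."""
    return sum(Fraction(x[u] - x[v]) ** 2 for u, v in edges)


def after_balancing(x, edge):
    """Load vector after continuous averaging across a single edge."""
    u, v = edge
    y = [Fraction(val) for val in x]
    y[u] = y[v] = (y[u] + y[v]) / 2
    return y


def potential_statistics(x, edges):
    """Exact mean and variance of the potential after balancing a uniformly random edge."""
    values = [potential(after_balancing(x, e)) for e in edges]
    mean = sum(values) / len(values)
    var = sum((val - mean) ** 2 for val in values) / len(values)
    return mean, var, values


n, d = 6, 2  # cycle on 6 nodes, which is 2-regular
edges = [(i, (i + 1) % n) for i in range(n)]
x = [5, 0, 3, 1, 4, 2]
mean, var, values = potential_statistics(x, edges)
print(potential(x) - mean == Fraction(1, d * n) * edge_potential(x, edges))
print(var <= (max(values) - mean) * (mean - min(values)))
```

Exact rational arithmetic is used so that the identity can be compared with `==` rather than with a floating-point tolerance.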
D.1 Bounds for Specific Graph Classes
Again as in Section B.5 we consider specific graph classes and use the bounds on and on the hitting time from Section B.5. When applied to Theorem 5.1 we get the following results w.h.p. and in expectation.
Corollary D.4.
Let be the state of process where . For an arbitrary it holds w.h.p. and in expectation
- for any regular graph.
- for cycle and constant-degree regular graphs.
- for the two-dimensional torus graph.
- for -dimensional torus graphs with dimensions, for the hypercube, and for all -regular graphs with .
Appendix E Proof of the Drift Result
In this appendix we give the full proof of our drift result from Section 6. We restate it for convenience.
\restateLemDrift
Proof.
Throughout this proof we write
[TABLE]
We start by proving the first statement. Let with be two arbitrary numbers. Since is increasing we have and . Hence,
[TABLE]
From condition 1 of the theorem it follows that and consequently giving us with
[TABLE]
We introduce a new sequence of random variables for which we will derive a lower tail bound, defined as given by and
[TABLE]
Comparing this with Eq. 18 we see that regardless of the value of it holds that
[TABLE]
By induction over , and since and , we have for all
[TABLE]
From the definition of it follows assuming that
[TABLE]
Then, from the law of total expectation we get that
[TABLE]
Since it immediately follows that . Furthermore, we may bound the variance of the change of given by
[TABLE]
where follows from Condition 2 of the theorem. The sequence is a martingale and hence fulfills the preconditions of Theorem A.10 (Theorem 6.6 from [17]) with and . Note that . Hence, we obtain
[TABLE]
Recalling that and and setting for some we arrive at the first statement of the theorem;
[TABLE]
Next we prove the second statement and bound . Let be a hitting time for the event that . Using as the indicator variable (which is one if and zero otherwise) we can write because is fixed and is non-increasing in resulting in . As a consequence it holds that
[TABLE]
We now proceed to bound the . Using the first statement with a union bound over all $t>t_{0}\coloneqq\frac{2(\sigma+1)}{\delta^{2}}\cdot\left(-\log(p)+\log\left(\frac{2(\sigma+1)}{\delta^{2}}\right)\right)$ gives us
[TABLE]
As a consequence,
[TABLE]
and
[TABLE]
Recalling that Eq. 19 implies that
[TABLE]
since by the definition of and is non-increasing it holds that . It follows that
[TABLE]
As a consequence we get that with probability at least
[TABLE]
Finally, we find that
[TABLE]
Putting everything together we see with probability at least that
[TABLE]
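To get a feeling for the threshold $t_{0}=\frac{2(\sigma+1)}{\delta^{2}}\cdot\left(-\log(p)+\log\left(\frac{2(\sigma+1)}{\delta^{2}}\right)\right)$ used in the union bound of the proof above, the following small helper evaluates it numerically. It assumes that $\log$ denotes the natural logarithm and that $\delta>0$ and $0<p\leq 1$; the function name `t0_threshold` is hypothetical.

```python
import math


def t0_threshold(sigma, delta, p):
    """Evaluate t_0 = c * (-log(p) + log(c)) with c = 2 * (sigma + 1) / delta**2."""
    if delta <= 0 or not 0 < p <= 1 or sigma < 0:
        raise ValueError("expected sigma >= 0, delta > 0 and 0 < p <= 1")
    c = 2.0 * (sigma + 1.0) / delta ** 2
    return c * (-math.log(p) + math.log(c))


for p in (0.1, 0.01, 0.001):
    print(p, t0_threshold(sigma=1.0, delta=0.5, p=p))
```

The threshold grows only logarithmically as $p$ decreases, but quadratically in $1/\delta$ up to the logarithmic factor.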
References
- [1] Milton Abramowitz and Irene A. Stegun. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Dover, New York, ninth Dover printing, tenth GPO printing edition, 1964.
- [2] Heiner Ackermann, Simon Fischer, Martin Hoefer, and Marcel Schöngens. Distributed algorithms for QoS load balancing. Distributed Comput., 23(5-6):321–330, 2011. doi:10.1007/s00446-010-0125-1.
- [3] Sinan G. Aksoy, Fan Chung, Michael Tait, and Josh Tobin. The maximum relaxation time of a random walk. Adv. Appl. Math., 101:1–14, 2018. doi:10.1016/j.aam.2018.07.002.
- [4] Dan Alistarh, Giorgi Nadiradze, and Amirmojtaba Sabour. Dynamic averaging load balancing on cycles. In 47th International Colloquium on Automata, Languages, and Programming, ICALP 2020, volume 168 of LIPIcs, pages 7:1–7:16. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2020. doi:10.4230/LIPIcs.ICALP.2020.7.
- [5] Aris Anagnostopoulos, Adam Kirsch, and Eli Upfal. Load balancing in arbitrary network topologies with stochastic adversarial input. SIAM Journal on Computing, 34(3):616–639, 2005. doi:10.1137/S0097539703437831.
- [6] Elliot Anshelevich, David Kempe, and Jon M. Kleinberg. Stability of load balancing algorithms in dynamic adversarial systems. SIAM J. Comput., 37(5):1656–1673, 2008. doi:10.1137/050639272.
- [7] Friedhelm Meyer auf der Heide, Brigitte Oesterdiekhoff, and Rolf Wanka. Strongly adaptive token distribution. Algorithmica, 15(5):413–427, 1996. doi:10.1007/BF01955042.
- [8] Petra Berenbrink, Colin Cooper, Tom Friedetzky, Tobias Friedrich, and Thomas Sauerwald. Randomized diffusion for indivisible loads. J. Comput. Syst. Sci., 81(1):159–185, 2015. doi:10.1016/j.jcss.2014.04.027.
