$O(\mbox{depth})$-Competitive Algorithm for Online Multi-level Aggregation

Niv Buchbinder; Moran Feldman; Joseph Naor; Ohad Talmon

arXiv:1701.01936·cs.DS·January 10, 2017


TL;DR

This paper introduces an online algorithm for multi-level aggregation in weighted trees that achieves an O(D)-competitive ratio, significantly improving previous algorithms with much higher competitive bounds.

Contribution

The paper presents the first O(D)-competitive online algorithm for multi-level aggregation, improving upon prior algorithms with exponential competitive ratios.

Findings

1. Achieves an O(D)-competitive ratio for the problem.

2. Improves upon the previous D^2 2^D-competitive algorithm.

3. Applicable to scenarios like multicasting and sensor networks.




Taxonomy

Topics: Optimization and Search Problems · Facility Location and Emergency Management · Vehicle Routing Optimization Methods


Niv Buchbinder Department of Statistics and Operations Research, School of Mathematical Sciences, Tel Aviv University, Israel. Email: [email protected]. Research is supported by ISF grant 1585/15 and US-Israel BSF grant 2014414.

Moran Feldman Department of Mathematics and Computer Science, The Open University of Israel. Email: [email protected]. Research supported in part by ISF grant 1357/16.

Joseph (Seffi) Naor Computer Science Department, Technion, Israel. Email: [email protected]. Research is supported by ISF grant 1585/15 and US-Israel BSF grant 2014414.

Ohad Talmon Computer Science Department, Technion, Israel. Email: [email protected]. Research is supported by ISF grant 1585/15 and US-Israel BSF grant 2014414.

Abstract

We consider a multi-level aggregation problem in a weighted rooted tree, studied recently by Bienkowski et al. [7]. In this problem requests arrive over time at the nodes of the tree, and each request specifies a deadline. A request is served by sending it to the root before its deadline at a cost equal to the weight of the path from the node in which it resides to the root. However, requests from different nodes can be aggregated, and served together, so as to save on cost. The cost of serving an aggregated set of requests is equal to the weight of the subtree spanning the nodes in which the requests reside. Thus, the problem is to find a competitive online aggregation algorithm that minimizes the total cost of the aggregated requests. This problem arises naturally in many scenarios, including multicasting, supply-chain management and sensor networks. It is also related to the well studied TCP-acknowledgement problem and the online joint replenishment problem.

We present an online $O(D)$ -competitive algorithm for the problem, where $D$ is the depth, or number of levels, of the aggregation tree. This result improves upon the $D^{2}2^{D}$ -competitive algorithm obtained recently by Bienkowski et al. [7].

Keywords: online algorithms, competitive analysis, aggregation of requests

1 Introduction

Aggregation of tasks is a fundamental tool in optimization, utilized in various areas. Suppose that there is a set of requests that need to be served, where serving several requests together costs less than serving each request individually. There are aggregation constraints on the request system that specify which sets of requests can be jointly served, together with a function that determines the cost of serving any aggregated set of requests. For example, serving all requests together might be impossible or very costly. In general, various cost functions and aggregation constraints give rise to a large family of interesting problems.

Scenarios where aggregation is beneficial are common in supply-chain management, in which producing (or delivering) several demands together can be cheaper than producing them separately [15, 20, 1, 27]. Another example is aggregating control packets in communication networks (e.g., sensor networks) [16, 2, 11, 18]. In this setting, transmitting together several packets going to a joint destination is, again, cheaper than transmitting each packet separately. Aggregation problems were studied extensively for many settings, both in the offline case, in which all requests are known a-priori, and in the online case in which requests arrive over time.

We study a very general online aggregation setting, considered recently by [7], and known as the Online Multi-level Aggregation Problem with Deadlines. In this problem requests arrive over time, and each request specifies a deadline for serving it. Thus, each request is associated with a time interval in which it needs to be served. At any point of time $t$ , the online algorithm is only aware of requests whose arrival time is no later than $t$ .

There is a tree with non-negative edge costs rooted at a node $r$ , and each request is assumed to reside at some tree node. The cost of serving a single request is equal to the cost of the tree path from the node where it resides to the root $r$ . Requests whose intervals intersect in time can be aggregated and served together. The aggregate cost of a set of requests is equal to the cost of the subtree rooted at $r$ and spanning the nodes where these requests reside. The goal is to minimize the total service cost, i.e., the sum of the costs of the subtrees used for serving the requests.

The online multi-level aggregation problem with deadlines was studied recently by Bienkowski et al. [7] who designed a $D^{2}2^{D}$ -competitive algorithm for it, where $D$ denotes the depth of the tree. We note that this bound is quite far from the best known lower bound on the competitive factor for the problem which is only $2$ [9] (see Section 1.2 for more details). In the offline case, the problem is known to be NP-hard (and APX-hard) even when the aggregation tree is of depth $2$ [1, 25, 8]. Currently, the best (offline) approximation factor known for the problem is 2 [6, 7].

1.1 Our Results.

Our main result is a competitive algorithm for the Online Multi-level Aggregation Problem with Deadlines. We have already mentioned that there is a large gap between the lower bound of $2$ on the competitive ratio known and the current best upper bound of $D^{2}2^{D}$ [7], which is exponential in the depth $D$ of the aggregation tree. We obtain a substantial improvement over the upper bound of [7].

Theorem 1.1

There exists an $O(D)$ -competitive algorithm for the Online Multi-level Aggregation Problem with Deadlines.

Our Techniques.

Standard reductions allow us to assume that the costs are associated with the tree nodes rather than with the edges. To simplify the presentation, we first present our result for a special kind of aggregation tree known as 3-decreasing trees. In a $3$ -decreasing tree, as we traverse any path from the root $r$ to a leaf, the edge costs go down by a factor of at least $3$ at each step of the path. The idea of our algorithm for $3$ -decreasing trees is simple and intuitive. Suppose there is a request that has reached its deadline (and, thus, must be served). The algorithm recursively aggregates requests into a subtree (which eventually will be served), starting from the root of the tree. The aggregation process has to balance between two conflicting objectives. On the one hand, it is beneficial to aggregate as many requests as possible, especially those approaching their deadline. On the other hand, it is necessary to bound the cost of the aggregate subtree, so that it can be related to the optimal solution. We balance between the two by recursively providing budgets to tree nodes that are added to the aggregate subtree. A budget provided to node $u$ (that has joined the subtree) can be used to aggregate an additional (yet restricted) set of nodes into the subtree rooted at $u$ . These nodes are then also given budgets, which can be used to aggregate additional nodes (recursively).

The analysis of the algorithm is performed by first showing that the total cost of each served subtree is at most $O(D\cdot c(r))$ , where $D$ is the depth of the tree and $c(r)$ is the cost of the root $r$ . Then, we show that the optimal cost of serving the remaining requests (including requests that at this point of time have not arrived yet) decreases by at least $c(r)$ . The two claims together imply the desired bound.

Extending our result to general trees is done via a reduction that transforms a tree into a forest of $3$-decreasing trees. Then, we apply the algorithm for $3$-decreasing trees to each tree of the resulting forest. A very simple argument shows that the algorithm obtained this way has a competitive factor of at most $O(D^{2})$. However, using a more involved analysis we are able to show that the two above claims hold also for general trees, which gives us the promised $O(D)$-competitiveness guarantee.

1.2 Related Work.

Several interesting problems are related to the Online Multi-level Aggregation Problem with Deadlines. Suppose that each request accrues a waiting cost over time, rather than having a strict deadline. The cost of satisfying a set of requests is then defined to be the sum of both the service cost and the sum of the waiting costs of the requests. For this problem Bienkowski et al. [7] designed an $O(D^{4}2^{D})$ -competitive algorithm. The best offline approximation for this variant is $2+\epsilon$ [26] adapting ideas from [22]. We note that a common choice for the waiting cost is a linear function over time. This model (with slight variations) was considered by Khanna et al. [19] who achieved a competitive factor which is logarithmic in the weight of the aggregation tree.

One of the earliest problems considered in the general setting of aggregation is the TCP-Acknowledgment Problem. In this problem there is a single link over which packets need to be acknowledged by control messages. Any number of control messages can be aggregated into a single packet sent over the single link. The aggregation tree, thus, consists of a single edge. The objective function is composed of two terms: the number of acknowledgments sent over the link and the sum of the waiting times of the control messages. The optimal competitive factors for the TCP-acknowledgment problem are $2$ for deterministic algorithms and $e/(e-1)$ if randomization is allowed [16, 18, 13]. Interestingly, the TCP-acknowledgment problem is equivalent to the classical Lot Sizing Problem [16, 27], which has been studied extensively by the operations research community.

The aggregation problem for trees of depth $2$ is known as the Joint Replenishment Problem in supply-chain management. The best competitive factor known for this special case is $3$ with general waiting times, and $2$ if requests have deadlines [12, 14, 9]. The best lower bound for general waiting times is 2.754 [9] improving upon an earlier bound of 2.64 by [14]. For the deadline case the best lower bound is $2$ [9]. The (offline) Joint Replenishment problem is known to be NP-hard (and APX-hard) [1, 25, 8]. The best approximation ratio currently known for it is $1.791$ [9], which improves upon previous results of [21, 24, 22, 23].

Finally, when the aggregation tree is an infinite half line, Bienkowski et al. [7] gave a $4$ -competitive algorithm for the deadline version of the problem, and showed that this is the best possible. For the more general problem with waiting costs, Bienkowski et al. [10] showed that the competitive ratio is between $2+\phi\approx 3.618$ and $5$ , improving on an earlier upper bound of $8$ for the problem by Brito et al. [12]. It is also known that the optimal offline solution for this case can be computed efficiently [10].

2 Problem Definition and Preliminaries

An instance of the Online Multi-level Aggregation Problem with Deadlines (OMAPD) is defined as a tuple $(\cT,\cI)$. The first component of the tuple is a tree $\cT$ rooted at some node $r(\cT)$ with non-negative edge costs. We denote by $c(e)$ the cost of an edge $e$ of $\cT$, and by $D(\cT)$ the depth of $\cT$, i.e., the maximum number of edges along any path from the root of $\cT$ to a leaf. The second component of the tuple $(\cT,\cI)$ is a set of time intervals $\cI$, where each interval $I=[a_{I},d_{I}]\in\cI$ is associated with a node $w_{I}\in\cT$, an arrival time $a_{I}$ and a deadline $d_{I}$. A solution for the problem is a sequence of subtrees $T_{1},T_{2},\ldots,T_{\ell}\subseteq\cT$ rooted at $r(\cT)$ that are transmitted at times $t_{1},t_{2},\ldots,t_{\ell}$. (Notice that we abuse notation here and treat trees as sets of nodes. This is done occasionally throughout the paper.) The solution is feasible if, for each interval $I\in\cI$, there exists a tree $T_{i}$ with transmission time $t_{i}\in[a_{I},d_{I}]$ containing the node $w_{I}$. Moreover, we say that a tree of the solution having these properties services $I$. The cost of transmitting a tree $T_{i}$ is denoted by $c(T_{i})\triangleq\sum_{e\in E[T_{i}]}c(e)$, where $E[T_{i}]$ is the set of edges of $T_{i}$. The cost of the solution is $\sum_{i=1}^{\ell}c(T_{i})$.
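To make the definitions concrete, the following Python sketch evaluates the cost and feasibility of a candidate solution on a toy instance. The data layout is an assumption made for illustration only: `edge_cost[v]` holds the cost of the edge between `v` and its parent, a solution is a list of `(time, node_set)` pairs, an interval is an `(arrival, deadline, node)` triple, and every node set is assumed to already be a connected subtree containing the root.

```python
def subtree_cost(nodes, edge_cost, root):
    # c(T): sum of the costs of the edges of T. Every non-root node of T
    # contributes the edge connecting it to its parent.
    return sum(edge_cost[v] for v in nodes if v != root)


def solution_cost(solution, edge_cost, root):
    # Total cost: sum of c(T_i) over all transmitted subtrees.
    return sum(subtree_cost(nodes, edge_cost, root) for _, nodes in solution)


def is_feasible(solution, intervals):
    # Every interval [a, d] at node w needs a transmission at some time
    # t in [a, d] whose subtree contains w.
    return all(
        any(a <= t <= d and w in nodes for t, nodes in solution)
        for a, d, w in intervals
    )


# Toy instance (hypothetical): root r, child a (edge cost 4),
# and two children b, c of a (edge costs 1 and 2).
edge_cost = {"a": 4, "b": 1, "c": 2}
intervals = [(0, 5, "b"), (3, 8, "c")]

aggregated = [(4, {"r", "a", "b", "c"})]               # one joint transmission
separate = [(5, {"r", "a", "b"}), (8, {"r", "a", "c"})]  # one per request
```

On this instance both solutions are feasible, but aggregating the two requests pays for the shared edge of cost 4 only once.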

In the online setting the tree $\cT$ is known to the algorithm in advance, but the intervals of $\cI$ arrive in an online fashion, i.e., each interval $I\in\cI$ is revealed to the online algorithm only upon its arrival time at $a_{I}$ . The online algorithm can decide at every given time to transmit a subtree of $\cT$ at this time, however, both the decision to transmit the subtree and the choice of the subtree to transmit must be done without any knowledge about future intervals. The algorithm is $\alpha$ -competitive if it produces a feasible solution whose total cost is always at most $\alpha$ times the cost of the optimal solution. In the online setting it is often useful to call intervals that have already been revealed to the algorithm, but were not serviced yet, by the term active.

To simplify the presentation of our algorithm, we first modify the problem as described by the following two modifications.

1. We assume the tree $\cT$ has node costs $\{c(u)\mid u\in\cT\}$ rather than edge costs. Consequently, the transmission cost of a subtree $T\subseteq\cT$ now becomes $c(T)=\sum_{u\in T}c(u)$.

2. For each node $u$, we assume $c(u)$ is strictly positive.

Proving that these modifications are without loss of generality can be done in a rather standard way. First, we may assume that $r(\cT)$ has only a single child. One may observe that the transmission of a subtree $T$ containing multiple children of $r(\cT)$ can always be replaced with the transmission of multiple subtrees, each containing $r(\cT)$ and the part of $T$ descending from a single child of $r(\cT)$. Moreover, this replacement does not change the cost of the transmission. Hence, there is no loss in applying any algorithm independently to every subtree of $\cT$ (more accurately, every instance of the algorithm faces an instance of OMAPD with a tree containing $r(\cT)$ and all the nodes descending in $\cT$ from a single child of $r(\cT)$). Next, we observe that in OMAPD every interval associated with the root $r(\cT)$ can always be serviced without any cost. Thus, we may assume that our instance does not contain any such intervals. We now transform $\cT$ into a node-cost tree by moving the cost of every edge to its end point which is further away from the root, and then removing the root $r(\cT)$ itself (which is the only node left with no cost). Finally, if some node $u$ has cost zero, we may merge it with the next node on the path to the root. Thus, each node has strictly positive cost. It is easy to see that any solution for the original edge weighted instance can be transformed to a solution for the resulting node weighted instance with the same cost and vice-versa. Moreover, the depth of the node weighted instance is always smaller than the depth of the original edge weighted instance. From now on we abuse notation and refer to the variant of OMAPD obtained by our modifications simply as OMAPD.
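The core of this transformation can be sketched in a few lines of Python. The sketch covers only the step that moves each edge cost to the endpoint farther from the root and then drops the original root; it assumes the root already has a single child and that all edge costs are strictly positive, so the merging of zero-cost nodes is omitted. The dictionaries `parent` and `edge_cost` and the function name are hypothetical.

```python
def to_node_costs(parent, edge_cost, root):
    """Turn an edge-cost tree into a node-cost tree.

    parent:    dict node -> parent node (the root maps to None)
    edge_cost: dict non-root node v -> cost of the edge (v, parent[v])
    Assumes the root has exactly one child and all edge costs are positive.
    """
    root_children = [v for v, p in parent.items() if p == root]
    if len(root_children) != 1:
        raise ValueError("the root is assumed to have a single child")
    new_root = root_children[0]

    # Each edge cost moves to the endpoint farther from the root.
    node_cost = {v: edge_cost[v] for v in parent if v != root}

    # The old root is left with no cost and is removed.
    new_parent = {
        v: (None if v == new_root else p)
        for v, p in parent.items()
        if v != root
    }
    return new_parent, node_cost, new_root


parent = {"r0": None, "a": "r0", "b": "a", "c": "a"}
edge_cost = {"a": 4, "b": 1, "c": 2}
new_parent, node_cost, new_root = to_node_costs(parent, edge_cost, "r0")
```

In the example the original tree has depth 2, while the resulting node-cost tree rooted at `a` has depth 1, in line with the remark that the depth only decreases.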

Finally, a tree $\cT$ is called 3-decreasing if, as we go along any path from the root $r(\cT)$ to a leaf, the cost of each node is smaller than the cost of the previous node by a factor of at least $3$. This definition is used in [7], and it is similar to the weighted version of HSTs defined in [3]. Note, however, that a $3$-decreasing tree is not exactly an (un-weighted) $3$-HST (as defined by [4, 5, 17]) as the costs of the edges along any root to leaf path of a $3$-decreasing tree decrease by a factor of at least $3$ rather than by exactly $3$. Moreover, different root to leaf paths of a $3$-decreasing tree might differ in their lengths.
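As a quick illustration of the definition, the following Python sketch checks whether a node-cost tree is $3$-decreasing. The representation (a `parent` dictionary in which the root maps to `None`, and a `cost` dictionary) is an assumption made for this example.

```python
def is_3_decreasing(parent, cost):
    # Every non-root node must cost at most one third of its parent.
    return all(
        p is None or 3 * cost[v] <= cost[p]
        for v, p in parent.items()
    )


parent = {"r": None, "a": "r", "a1": "a"}
good = {"r": 9, "a": 3, "a1": 1}   # costs drop by a factor of exactly 3
bad = {"r": 9, "a": 3, "a1": 2}    # the last step drops by only 1.5
```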

3 Algorithm for $3$ -Decreasing Trees

In this section we present and analyze an algorithm for OMAPD on $3$ -decreasing trees. A generalization of this algorithm for general trees is given in Section 4. Throughout the section we consider a fixed instance $(\cT,\cI)$ of OMAPD in which the tree $\cT$ is a $3$ -decreasing tree. Since we have fixed the instance of OMAPD, we can use in this section a somewhat simplified notation. More specifically, we drop the parameter from the notations $r(\cT)$ and $D(\cT)$ and use $r$ and $D$ to denote the root of $\cT$ and its depth, respectively. We also need to define some new notation. Given a set $U$ of nodes we use $c(U)$ to denote the total cost of the nodes within it, i.e., $c(U)=\sum_{u\in U}c(u)$ . Additionally, given a node $u$ we denote by $\cT_{u}$ the subtree of $\cT$ rooted at $u$ .

We remind the reader that an interval is said to be active if and only if it has already appeared, but has not been serviced yet. Our algorithm transmits a tree whenever an active interval matures, i.e., reaches its deadline. More specifically, our algorithm invokes Algorithm 1 whenever an active interval matures, and Algorithm 1 then selects a subtree $T$ of $\cT$ and transmits it. Following the transmission of $T$ , our algorithm returns to its idle state until another active interval matures. We note that there might be multiple active intervals that reach maturity at the same time. When this happens, the transmission of $T$ might not service all of these intervals, which might result in immediate additional invocation of Algorithm 1. In other words, there might be a zero time gap between consecutive executions of Algorithm 1.

Informally, Algorithm 1 starts the construction of the tree to be transmitted by assigning a budget of $\hat{c}(r)=2c(r)$ to the root of the tree. This budget is then used to recursively add new nodes to the transmission tree (and thus, serve the intervals residing in these nodes). More specifically, each node $u$ that is assigned a budget uses it to add to the transmission tree new nodes belonging to its subtree $\cT_{u}$ . The total transmission cost of these newly added nodes is roughly equal to the budget of $u$ ; and following the addition of these new nodes $u$ splits its budget proportionally among them so that this process can be repeated recursively. It is important to note that, when a node $u$ chooses nodes from its subtree $\cT_{u}$ to add to the transmission tree, it gives priority to satisfying intervals whose deadline is sooner (and thus are, intuitively, more “urgent”).
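The pseudocode of Algorithm 1 is not reproduced in this text, so the following Python sketch reconstructs one invocation from the informal description above and from the properties used later in the analysis: the root receives the budget $2c(r)$, a node $u$ stops growing $\cA_{u}$ once $c(\cA_{u})$ exceeds $\hat{c}(u)/2$, intervals are considered in non-decreasing deadline order, and each $v\in\cA_{u}$ receives $\hat{c}(v)=c(v)\cdot\hat{c}(u)/c(\cA_{u})$. The breadth-first processing order, the data layout and all names are assumptions of this sketch and may differ from the paper's pseudocode.

```python
from collections import deque


def build_transmission_tree(parent, cost, root, active):
    """Sketch of one invocation of Algorithm 1 (a reconstruction).

    parent: dict node -> parent node (the root maps to None)
    cost:   dict node -> strictly positive node cost c(u)
    active: list of (deadline, node) pairs, one per active interval
    Returns (T, budget): the transmitted node set and the budgets c-hat.
    """

    def path_below(u, w):
        # Nodes strictly below u on the path from u to w (top-down),
        # or None if w does not lie in the subtree rooted at u.
        path = []
        while w is not None and w != u:
            path.append(w)
            w = parent[w]
        return path[::-1] if w == u else None

    T = {root}
    budget = {root: 2 * cost[root]}          # c-hat(r) = 2 c(r)
    queue = deque([root])                    # assumed order: breadth first
    by_deadline = sorted(active, key=lambda iv: iv[0])

    while queue:
        u = queue.popleft()
        added, added_cost = [], 0            # A_u and c(A_u)
        for _, w in by_deadline:             # most urgent interval first
            if added_cost > budget[u] / 2:   # c(A_u) exceeded c-hat(u)/2
                break
            if w in T:
                continue                     # node is already transmitted
            path = path_below(u, w)
            if path is None:
                continue                     # interval is outside T_u
            new_nodes = [x for x in path if x not in T]
            T.update(new_nodes)
            added.extend(new_nodes)
            added_cost += sum(cost[x] for x in new_nodes)
        for v in added:                      # split c-hat(u) proportionally
            budget[v] = cost[v] * budget[u] / added_cost
            queue.append(v)
    return T, budget


# Example: a root of cost 27 with five children of cost 9 each.
parent = {"r": None, **{f"c{i}": "r" for i in range(1, 6)}}
cost = {"r": 27, **{f"c{i}": 9 for i in range(1, 6)}}
active = [(i, f"c{i}") for i in range(1, 6)]   # deadlines 1..5
T, budget = build_transmission_tree(parent, cost, "r", active)
```

In this example the root's budget is 54, so the loop at the root stops after four children have been added (their total cost 36 exceeds 27), and the request at `c5`, which has the latest deadline, is left for a later transmission.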

The following observation shows that each transmission of Algorithm 1 makes progress, and thus, our algorithm, as a whole, terminates.

Observation 3.1

Algorithm 1 halts and the transmission tree $T$ contains an active interval.

Proof.

Observe that Line 1 of Algorithm 1 is guaranteed to be reached at least once in every given execution of Algorithm 1. This line selects an active interval $I$ , and the nodes of the path from $r$ to $w_{I}$ are later added to $T$ .

We are now ready to present the two main lemmata necessary for the analysis of the algorithm. The first of these lemmata bounds the cost paid by the algorithm for every single transmission. The proof of this lemma is deferred to Section 3.1.

Lemma 3.1

For every subtree $T$ of $\cT$ transmitted by the algorithm, $c(T)\leq 2(D+1)\cdot c(r)$ .

The presentation of the other main lemma requires some additional notation. Assume the online algorithm transmits overall $\ell$ trees $T_{1},\ldots,T_{\ell}$ . For every $0\leq i\leq\ell$ , let $\cI_{i}$ be the set of intervals that were not yet serviced by the algorithm after it has made $i$ transmissions. Note that $\cI_{i}$ includes also intervals that arrive after the transmission time of tree $T_{i}$ , i.e., they were not active yet when $T_{i}$ was transmitted. By definition, $T_{i}$ services all the intervals $\cI_{i-1}\setminus\cI_{i}$ . Additionally, we use $OPT(\cI^{\prime})$ to denote an optimal solution for a set of intervals $\cI^{\prime}\subseteq\cI$ (more formally, $OPT(\cI^{\prime})$ is an optimal solution for the instance $(\cT,\cI^{\prime})$ of OMAPD).

Our second main lemma can now be stated as follows. We defer its proof to Section 3.2.

Lemma 3.2

$OPT(\cI_{i})\leq OPT(\cI_{i-1})-c(r)$ for every $1\leq i\leq\ell$.

Analyzing our algorithm is now straightforward.

Theorem 3.1

There exists an $O(D)$ -competitive algorithm for Online Multi-level Aggregation Problem with Deadlines on $3$ -decreasing trees.

Proof.

By Lemma 3.1 the total cost suffered by our algorithm is at most $\ell\cdot[2(D+1)\cdot c(r)]$ . On the other hand, Lemma 3.2 shows that the cost of the optimal solution $OPT(\cI_{0})=OPT(\cI)$ is at least $\ell\cdot c(r)+OPT(\cI_{\ell})=\ell\cdot c(r)$ .
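Spelled out, the two bounds combine as follows. Here $ALG$ is shorthand (not used elsewhere in the paper) for the total cost of the online algorithm, and $OPT(\cdot)$ is read as the cost of the corresponding optimal solution. The final equality in the first line uses the fact that $\cI_{\ell}=\varnothing$, since every interval has been serviced after the last transmission.

```latex
\begin{align*}
OPT(\cI) = OPT(\cI_{0})
  &\geq OPT(\cI_{1}) + c(r)
   \geq \cdots
   \geq OPT(\cI_{\ell}) + \ell\cdot c(r)
   = \ell\cdot c(r), \\
\frac{ALG}{OPT(\cI)}
  &\leq \frac{\ell\cdot 2(D+1)\cdot c(r)}{\ell\cdot c(r)}
   = 2(D+1) = O(D).
\end{align*}
```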

3.1 Proof of Lemma 3.1.

In this section we prove Lemma 3.1. Let us first recall the lemma.

Lemma 3.1

For every subtree $T$ of $\cT$ transmitted by the algorithm, $c(T)\leq 2(D+1)\cdot c(r)$ .

We begin the proof of the above lemma with the following claim.

Lemma 3.3

For every node $u$ added to $T$ , $c(\cA_{u})\leq\frac{1}{2}\left(\hat{c}(u)+c(u)\right)$ .

Proof.

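Each time that Algorithm 1 adds nodes to $\cA_{u}$, it adds only nodes belonging to the path from $u$ to $w_{I}$ of a single interval $I$. Since $\cT$ is $3$-decreasing, the total cost of the nodes on such a path is upper bounded by

$$\sum_{i=1}^{D}\left(\frac{1}{3}\right)^{i}\cdot c(u)\leq\sum_{i=1}^{\infty}\left(\frac{1}{3}\right)^{i}\cdot c(u)=\frac{c(u)/3}{1-1/3}=\frac{c(u)}{2}.$$

On the other hand, Algorithm 1 stops adding nodes to $\cA_{u}$ once $c(\cA_{u})$ exceeds $\frac{\hat{c}(u)}{2}$. Hence, before the last addition we have $c(\cA_{u})\leq\frac{\hat{c}(u)}{2}$, and the last addition increases $c(\cA_{u})$ by at most $\frac{c(u)}{2}$, which completes the proof.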

Next, define inductively the level of nodes in $T$ as follows. The level of the root $r$ is $0$. For node $v$, suppose that node $u$ added it to $\cA_{u}$; then the level of $v$ is defined to be the level of $u$ plus $1$. Since the nodes of $\cA_{u}$ are all descendants of $u$, the level of each node $u\in T$ is guaranteed to be at most $D$.

Lemma 3.4

For every node $u\in T$ , $c(u)\leq\hat{c}(u)$ .

Proof.

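We prove the lemma by induction on the level of $u$. For the root $r$ the lemma is immediate since $\hat{c}(r)=2c(r)$. Next, let $v$ be a node in $\cA_{u}$ (added by $u$). Then,

$$\hat{c}(v)=c(v)\cdot\frac{\hat{c}(u)}{c(\cA_{u})}\geq c(v)\cdot\frac{\hat{c}(u)}{\frac{1}{2}\left(\hat{c}(u)+c(u)\right)}\geq c(v)\cdot\frac{\hat{c}(u)}{\hat{c}(u)}=c(v),$$

where the first inequality follows from Lemma 3.3, and the second inequality follows from the induction hypothesis applied to $u$.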

Note that for every node $u$ such that $\cA_{u}\neq\varnothing$ , by construction,

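$$\sum_{v\in\cA_{u}}\hat{c}(v)=\sum_{v\in\cA_{u}}\left[c(v)\cdot\frac{\hat{c}(u)}{c(\cA_{u})}\right]=\hat{c}(u).$$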

Hence, the following observation can be proved by induction.

Observation 3.2

In each level of $T$ , the sum of $\hat{c}(u)$ , taken over all nodes $u$ , is at most $\hat{c}(r)=2c(r).$

We now observe the following.

$$c(T)=\sum_{u\in T}c(u)\leq\sum_{u\in T}\hat{c}(u)\leq 2(D+1)\cdot c(r).$$

The first inequality follows since Lemma 3.4 guarantees that $c(u)\leq\hat{c}(u)$ for every node $u\in T$ . The second inequality follows since there are at most $D+1$ possible levels, and Observation 3.2 shows that the sum of $\hat{c}(u)$ in each level is at most $2c(r)$ .

3.2 Proof of Lemma 3.2.

In this section we prove Lemma 3.2. We begin by recalling the lemma.

Lemma 3.2

$OPT(\cI_{i})\leq OPT(\cI_{i-1})-c(r)$ for every $1\leq i\leq\ell$.

To prove the lemma it suffices to construct a solution $S$ servicing all the intervals of $\cI_{i}$ whose cost is at most $OPT(\cI_{i-1})-c(r)$. The lemma will then follow since, being an optimal solution, $OPT(\cI_{i})$ is not more expensive than any other feasible solution for $(\cT,\cI_{i})$.

Let us now construct the above mentioned solution $S$ . Recall that $T_{i}$ is the subtree of $\cT$ transmitted by the algorithm at its $i$ -th transmission, and let us denote by $t_{i}$ the time of this transmission. There are a few assumptions that we can make about $OPT(\cI_{i-1})$ . First, we may assume that $OPT(\cI_{i-1})$ transmits at most one subtree at every given time (otherwise, we may merge subtrees transmitted at the same time without increasing the total cost). Second, we may assume that $OPT(\cI_{i-1})$ makes its first transmission at time $t_{i}$ because that is the time of the earliest deadline among the deadlines of the intervals of $\cI_{i-1}$ (specifically, the interval whose deadline triggered the $i$ -th transmission of the algorithm is in $\cI_{i-1}$ and its deadline is $t_{i}$ ). Given these assumptions, there must be exactly one transmission of $OPT(\cI_{i-1})$ at time $t_{i}$ . Let us denote the subtree transmitted at this time by $T^{*}_{i}$ .

We obtain the solution $S$ from the optimal solution $OPT(\cI_{i-1})$ by applying the following two steps to the last solution.

1. Remove $T^{*}_{i}$ from the solution.

2. Reconstruction step: Scan the intervals of $\cI_{i}$ in a non-decreasing order of their deadline. For every such interval $I$ we do the following to guarantee that it is serviced by $S$.

• Find the subtree of the current solution transmitted within the range $[t_{i},d_{I}]$ which contains the largest fraction of the path from $r$ to $w_{I}$ (breaking ties arbitrarily), and add to this subtree the remaining part of this path. Note that the above tree might contain all the path, in which case there is no need to add anything.

• If the current solution makes no transmissions within the range $[t_{i},d_{I}]$, then introduce into it a new transmission transmitting the subtree consisting solely of the path from $r$ to $w_{I}$. The location of this new transmission within the range $[t_{i},d_{I}]$ can be chosen arbitrarily.
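The two steps above can be sketched in Python as follows. This is an illustration of the construction used in the analysis, not part of the online algorithm. Several choices are assumptions of the sketch: a solution is a list of `(time, node_set)` pairs with exactly one transmission at time `t_i`, the "largest fraction of the path" is measured by the number of path nodes already present, ties are broken by list order, and a newly introduced transmission is placed at the deadline $d_{I}$ (any point of the range is allowed by the text).

```python
def reconstruct(opt_solution, t_i, remaining, parent):
    """Build S from OPT(I_{i-1}) as in Steps 1 and 2.

    opt_solution: list of (time, node_set), one transmission at time t_i
    remaining:    list of (arrival, deadline, node) triples, the set I_i
    parent:       dict node -> parent node (the root maps to None)
    """

    def root_path(w):
        # All nodes on the path from w up to the root.
        path = []
        while w is not None:
            path.append(w)
            w = parent[w]
        return path

    # Step 1: remove the subtree T*_i transmitted at time t_i.
    S = [(t, set(nodes)) for t, nodes in opt_solution if t != t_i]

    # Step 2: scan the intervals of I_i by non-decreasing deadline.
    for _, d, w in sorted(remaining, key=lambda iv: iv[1]):
        path = root_path(w)
        candidates = [nodes for t, nodes in S if t_i <= t <= d]
        if candidates:
            # Extend the subtree that already holds most of the path.
            best = max(
                candidates,
                key=lambda nodes: sum(x in nodes for x in path),
            )
            best.update(path)
        else:
            # No transmission in [t_i, d]: add one carrying only the path.
            S.append((d, set(path)))
    return S


# Toy example (hypothetical): root r with two children a and b.
parent = {"r": None, "a": "r", "b": "r"}
opt_solution = [(1, {"r", "a", "b"}), (5, {"r", "a"})]
remaining = [(0, 6, "b")]   # an interval at b that T_i did not service
S = reconstruct(opt_solution, 1, remaining, parent)
```

In the toy example the transmission at time 1 is removed, and the node `b` is reconstructed inside the remaining transmission at time 5, so `S` consists of the single subtree `{r, a, b}`.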

Observation 3.3

The solution $S$ is a feasible solution for $(\cT,\cI_{i})$ .

Proof.

The construction of $S$ guarantees that every interval of $\cI_{i}$ is serviced by $S$ before its deadline.

We are left to prove that the cost of $S$ is at most $OPT(\cI_{i-1})-c(r)$. It is clear that Step 1 in the procedure for constructing $S$ decreases the cost by $c(T^{*}_{i})$, thus, it suffices to show that the second step increases the cost of $S$ by at most $c(T^{*}_{i})-c(r)$. The rest of this section is devoted to proving this claim.

We say that a node $u$ is reconstructed whenever it is added to some transmission of the solution during Step 2 of the above procedure.

Lemma 3.5

The reconstruction step obeys the following claims.

1. Only nodes of $T^{*}_{i}$ are reconstructed.

2. Each node of $T^{*}_{i}$ is reconstructed at most once.

Proof.

Consider a node $u$ which is reconstructed, and let $I$ be the interval whose processing by the reconstruction step caused the first reconstruction of $u$ . Thus, $u$ is on the path from $r$ to $w_{I}$ . If $u$ does not belong to $T^{*}_{i}$ , then $OPT(\cI_{i-1})$ must service $I$ by some transmission other than $T^{*}_{i}$ within the range $(t_{i},d_{I}]$ . Since this transmission is not removed, the processing of $I$ by the reconstruction step could not cause the reconstruction of any node, which contradicts the definition of $u$ . Hence, any node $u$ which is reconstructed must belong to $T^{*}_{i}$ .

It remains to prove the second part of the lemma. The fact that $u$ was reconstructed when $I$ was processed means that prior to $I$ ’s processing the interval $[t_{i},d_{I}]$ did not contain any transmission involving $u$ . Since the reconstruction step scans the intervals in a non-decreasing order of their deadlines, this means that any interval $I^{\prime}$ for which $u$ is on the path from $r$ to $w_{I^{\prime}}$ must have a deadline $d_{I^{\prime}}\geq d_{I}$ . Hence, when $I^{\prime}$ is processed by the reconstruction step, the range $[t_{i},d_{I^{\prime}}]\supseteq[t_{i},d_{I}]$ already includes a subtree containing $u$ , and thus, $I^{\prime}$ does not cause another reconstruction of the node $u$ .

The last lemma shows that only a limited set of nodes (namely, the nodes of $T^{*}_{i}$ ) might be reconstructed. Lemma 3.7 shows that even within this limited set there is a significant subset of nodes that are not reconstructed. The following lemma proves a few technical claims used later in the proof of Lemma 3.7.

Lemma 3.6

Let $u\in T_{i}$ be a node that is reconstructed. Then, $c(\cA_{u})>\frac{\hat{c}(u)}{2}$ and $\cA_{u}\subseteq T^{*}_{i}$ .

Proof.

Let $I$ be the interval that caused the reconstruction of $u$ . Since $I$ belongs to $\cI_{i}$ (intervals outside of $\cI_{i}$ are not processed by the reconstruction step), it must be that $I$ is not serviced by $T_{i}$ . On the other hand, the fact that $I$ caused the reconstruction of nodes means that it is not serviced by any subtree of $OPT(\cI_{i-1})$ other than $T^{*}_{i}$ , thus, it must be active at time $t_{i}$ .

The above observations imply that when Algorithm 1 constructed $T_{i}$ it left the inner loop growing $\cA_{u}$ while there were still active intervals associated with nodes of $\cT_{u}\setminus T$; which can only happen when $c(\cA_{u})>\frac{\hat{c}(u)}{2}$. Next, consider any node $v\in\cA_{u}$. Since Algorithm 1 scans the intervals in a non-decreasing deadline order while growing $\cA_{u}$, $v$ must have been added to $\cA_{u}$ due to being on the path from $r$ to $w_{I^{\prime}}$ of some interval $I^{\prime}$ having a deadline $d_{I^{\prime}}\leq d_{I}$. Assume towards a contradiction that $T^{*}_{i}$ does not contain this path. Clearly, this assumption implies that $OPT(\cI_{i-1})$ services $I^{\prime}$ by some subtree transmitted during the range $(t_{i},d_{I^{\prime}}]$. Since any subtree servicing $I^{\prime}$ must include $u$, $u$ is already present within the range $(t_{i},d_{I^{\prime}}]\subseteq[t_{i},d_{I}]$ when $I$ is processed by the reconstruction step; which contradicts the definition of $I$ as the interval whose processing caused the reconstruction of $u$.

Lemma 3.7

There exists a set $U$ of nodes obeying the following properties.

1. $U\subseteq T^{*}_{i}\cap T_{i}$.

2. $\sum_{u\in U}\hat{c}(u)=\hat{c}(r)=2c(r)$.

3. $\hat{c}(u)\leq 2c(u)$ for all $u\in U$.

4. The nodes of $U$ are not reconstructed.

Proof.

We start with a set $U$ obeying all the properties other than Property 4, and let it evolve while maintaining these properties. The evolution of $U$ ends as soon as it obeys also Property 4. More specifically, we initially set $U=\{r\}$ . Observe that this set indeed satisfies all the properties other than Property 4. If $U$ also satisfies Property 4 then we are done. Otherwise, there must be a node $u$ in the current set $U$ which is reconstructed. Since we maintain $U$ as a set obeying Property 1, $u$ must belong to $T_{i}$ . Hence, by Lemma 3.6, it must hold that $c(\cA_{u})\geq\frac{1}{2}\hat{c}(u)$ and $\cA_{u}\subseteq T^{*}_{i}$ .

At this point we evolve $U$ by removing $u$ from it and adding the nodes of $\cA_{u}$ instead. Since $\cA_{u}\subseteq T_{i}\cap T^{*}_{i}$ , Property 1 is preserved. Additionally, $\sum_{v\in\cA_{u}}\hat{c}(v)=\sum_{v\in\cA_{u}}\left[c(v)\cdot\frac{\hat{c}(u)}{c(\cA_{u})}\right]=\hat{c}(u)$ , and thus, Property 2 is preserved as well. Finally, Property 3 also remains valid since Algorithm 1 sets $\hat{c}(v)=c(v)\cdot\frac{\hat{c}(u)}{c(\cA_{u})}\leq 2c(v)$ for each node $v\in\cA_{u}$ (where the inequality holds since Lemma 3.6 implies $c(\cA_{u})\geq\frac{1}{2}\hat{c}(u)$ ).

We can now repeat the above evolution step as long as Property 4 is violated. However, this evolution cannot continue forever since each step of it replaces a single node with nodes appearing lower than it in $\cT$ . Hence, the evolution presented is guaranteed to end up eventually with a set $U$ obeying Property 4 (as well as the three other properties).
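The evolution of $U$ described above can be sketched in code. This is only an illustration under stated assumptions: the sets $\cA_{u}$ and the set of reconstructed nodes are taken as given inputs (in the paper they come from Algorithm 1 and the reconstruction step), and the function name `evolve_U` and the example costs are hypothetical.

```python
from fractions import Fraction

def evolve_U(root, c, A, reconstructed):
    """Replace reconstructed nodes of U by their sets A_u until none is left.

    c[u] is the cost c(u); A[u] lists the nodes of A_u (assumed to be given);
    reconstructed is the set of reconstructed nodes (also assumed to be given).
    Returns the final set U together with the scaled costs c_hat.
    """
    c_hat = {root: Fraction(2 * c[root])}       # c_hat(r) = 2c(r)
    U = {root}
    while True:
        u = next((x for x in U if x in reconstructed), None)
        if u is None:                            # Property 4 holds
            return U, c_hat
        total = sum(c[v] for v in A[u])          # c(A_u)
        # Lemma 3.6 guarantees c(A_u) > c_hat(u)/2 for a reconstructed node.
        assert 2 * total > c_hat[u]
        for v in A[u]:
            c_hat[v] = Fraction(c[v]) * c_hat[u] / total
        U.remove(u)
        U.update(A[u])

# Hypothetical example: the root has cost 9 and A_r consists of four nodes.
c = {"r": 9, "x1": 3, "x2": 3, "x3": 3, "x4": 1}
A = {"r": ["x1", "x2", "x3", "x4"]}
U, c_hat = evolve_U("r", c, A, reconstructed={"r"})
```

In this example the final set $U$ consists of the four nodes of $\cA_{r}$; their scaled costs still sum to $\hat{c}(r)=18$, and each scaled cost is at most twice the original cost, in line with Properties 2 and 3.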

To conclude the proof of Lemma 3.2 we note that, by combining all the properties of $U$ in Lemma 3.7, we get that there must exist a set $U$ of nodes in $T^{*}_{i}$ which are not reconstructed and have a total cost of at least $\sum_{u\in U}c(u)\geq\frac{1}{2}\sum_{u\in U}\hat{c}(u)=c(r)$. Together with the claim of Lemma 3.5 that only nodes of $T^{*}_{i}$ are reconstructed, and even they can only be reconstructed once, we get that the increase in the cost of $S$ during the reconstruction step is at most $c(T^{*}_{i})-\sum_{u\in U}c(u)\leq c(T^{*}_{i})-c(r)$.

4 Algorithm for General Trees

In this section we show how the algorithm from Section 3 can be modified to be $O(D(\cT))$-competitive also for OMAPD on general trees. We begin by describing a reduction, originally shown by [7], transforming a general tree into a forest of $3$-decreasing trees.

Given a tree $\cT$ , we construct a forest $\cF$ of $3$ -decreasing trees as follows. The nodes of $\cF$ are the same as the nodes of $\cT$ . We connect each node $u$ with an edge to the first node $v$ on the path from $u$ to $r(\cT)$ whose cost obeys $c(v)\geq 3c(u)$ . If there is no such $v$ , then $u$ becomes the root of a new tree. Each node $u$ in the forest is now associated with a set $B_{u}$ of all nodes on the path from $u$ to $v$ in the original tree $\cT$ (without $v$ , but including $u$ itself). If $v$ does not exist (i.e., $u$ is a root in the new forest), then the associated set $B_{u}$ is defined as the set of nodes on the path from $u$ to the root $r(\cT)$ of the original tree (this time, including $r(\cT)$ ).
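The construction above can be sketched as follows. This is a minimal illustration, assuming the tree is given as a parent map and a cost map; the function name `build_forest` and the example tree are hypothetical and not part of the paper.

```python
def build_forest(parent, cost):
    """Build the forest of 3-decreasing trees and the sets B_u.

    parent[u] is the parent of u in the original tree (None for the root),
    cost[u] is c(u). Returns (forest_parent, B), where forest_parent[u] is
    the parent of u in the forest (None if u is a root of the forest) and
    B[u] lists the nodes of B_u, starting from u and going upwards.
    """
    forest_parent = {}
    B = {}
    for u in parent:
        path = [u]                      # B_u always contains u itself
        v = parent[u]
        # Walk towards the root until a node v with c(v) >= 3c(u) is found.
        while v is not None and cost[v] < 3 * cost[u]:
            path.append(v)
            v = parent[v]
        forest_parent[u] = v            # v is excluded from B_u
        B[u] = path                     # includes the original root if v is None
    return forest_parent, B

# Hypothetical example: a path r - a - b - d with node costs 10, 4, 3, 1.
parent = {"r": None, "a": "r", "b": "a", "d": "b"}
cost = {"r": 10, "a": 4, "b": 3, "d": 1}
forest_parent, B = build_forest(parent, cost)
```

In this example the node `a` has no ancestor of cost at least $12$, so it becomes the root of its own tree with $B_{a}=\{a,r\}$, while `b` skips over `a` and is attached directly to `r`, with $B_{b}=\{b,a\}$.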

Observation 4.1

The forest $\cF$ consists solely of $3$ -decreasing trees.

Proof.

By definition, if $u$ is a node of $\cF$ which is not a root of its tree, then the father $v$ of $u$ in $\cF$ obeys $c(v)\geq 3c(u)$ .

Assume that $\cF$ consists of $m$ trees $\cT^{1},\cT^{2},\dotsc,\cT^{m}$ . Notice that $\cT$ and $\cF$ have the same set of nodes, and thus, each interval is naturally associated with a node in one of the trees in the forest. Let $\cI^{i}$ be the set of intervals associated with the nodes of $\cT^{i}$ .

Our algorithm for general trees runs an independent instance of the algorithm for $3$ -decreasing trees from Section 3 on each tree $\cT^{i}$ with its corresponding set of intervals $\cI^{i}$ . For convenience, we denote the algorithm from Section 3 by $ALG$ from this point on. Whenever an instance of $ALG$ chooses to transmit a subtree $T\subseteq\cT^{i}$ , we transmit instead the tree $\bigcup_{u\in T}B_{u}$ . It is useful to call $T$ the virtual tree transmitted by $ALG$ , and $\bigcup_{u\in T}B_{u}$ the concrete tree transmitted by $ALG$ . Observe that the concrete tree is always a subtree of the original tree $\cT$ , and thus, this is a description of a valid algorithm for the original problem (OMAPD on general trees).
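The translation from a virtual tree to a concrete tree can be sketched as below. The code assumes the sets $B_{u}$ are already available as lists; the helper names `concrete_tree` and `is_rooted_subtree` and the example data are hypothetical.

```python
def concrete_tree(virtual_tree, B):
    """Return the union of B[u] over all nodes u of the virtual tree."""
    nodes = set()
    for u in virtual_tree:
        nodes.update(B[u])
    return nodes

def is_rooted_subtree(nodes, parent):
    """True if the set is closed under taking parents in the original tree."""
    return all(parent[u] is None or parent[u] in nodes for u in nodes)

# Hypothetical example on the path r - a - b - d, where b hangs directly
# below r in the forest and B_b = {b, a}.
parent = {"r": None, "a": "r", "b": "a", "d": "b"}
B = {"r": ["r"], "a": ["a", "r"], "b": ["b", "a"], "d": ["d"]}
virtual = {"r", "b", "d"}
concrete = concrete_tree(virtual, B)
```

Here the virtual tree $\{r,b,d\}$ is not closed under parents in the original tree, since the parent `a` of `b` is missing, whereas the concrete tree $\{r,a,b,d\}$ is.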

We next show that the total cost of the optimal solutions for all the subtrees of the forest $\cF$ is bounded by the optimal cost of the original instance. Formally, let $OPT(\cT^{\prime},\cI^{\prime})$ denote the cost of the optimal solution for an instance $(\cT^{\prime},\cI^{\prime})$ of OMAPD. Then,

Observation 4.2

$OPT(\cT,\cI)\geq\sum_{i=1}^{m}OPT(\cT^{i},\cI^{i})$ .

Proof.

Consider the optimal solution for the instance $(\cT,\cI)$ of OMAPD. Based on this optimal solution we construct a solution for every one of the instances $\{(\cT^{i},\cI^{i})\}_{i=1}^{m}$ as follows. Whenever the optimal solution for $(\cT,\cI)$ transmits a subtree $T^{*}$, the solution for the instance $(\cT^{i},\cI^{i})$ transmits the subtree $T^{*}\cap\cT^{i}$. One can verify that $T^{*}\cap\cT^{i}$ is indeed a tree since the set of nodes on the path connecting every node $u\in\cT^{i}$ to $r(\cT^{i})$ in $\cT^{i}$ is a subset of the set of nodes on the path connecting $u$ to $r(\cT)$ in $\cT$.

We complete the proof by observing that the costs of the above solutions add up to exactly $OPT(\cT,\cI)$ , and thus, the total cost of the optimal solutions for the instances $\{(\cT^{i},\cI^{i})\}_{i=1}^{m}$ cannot exceed this value.
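The claim that the costs add up exactly can be checked for each transmitted subtree $T^{*}$ separately, assuming the cost of a subtree is the sum of the costs of its nodes. Since $\cT^{1},\dotsc,\cT^{m}$ partition the nodes of $\cT$,

$$\sum_{i=1}^{m}c(T^{*}\cap\cT^{i})=\sum_{i=1}^{m}\sum_{u\in T^{*}\cap\cT^{i}}c(u)=\sum_{u\in T^{*}}c(u)=c(T^{*}).$$

Summing this equality over all the subtrees transmitted by the optimal solution for $(\cT,\cI)$ gives the stated total.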

The above observation implies that, in order to analyze our algorithm, it is enough to relate the cost of the concrete trees transmitted by each instance of $ALG$ to the cost of the optimal solution for the instance of OMAPD faced by this instance of $ALG$. Notice that this is slightly different from what we do in Section 3, where we relate the cost of the virtual trees transmitted by an instance of $ALG$ to the cost of the optimal solution for the instance it faces. Nevertheless, we show in the rest of this section that the arguments from Section 3 can be used, almost without change, to establish this stronger guarantee as well. More specifically, we prove the following proposition.

Proposition 4.1

For every $1\leq i\leq m$ , the total cost of the concrete trees transmitted by the instance of $ALG$ assigned to $\cT^{i}$ is at most $O(D(\cT))\cdot OPT(\cT^{i},\cI^{i})$ .

Clearly, Theorem 1.1 follows from Observation 4.2 and Proposition 4.1. To prove Proposition 4.1 we need to define some additional notation. Assume $ALG$ transmits $\ell^{i}$ trees when given $(\cT^{i},\cI^{i})$. For every $0\leq j\leq\ell^{i}$, let $\cI^{i}_{j}$ be the set of intervals from $\cI^{i}$ that were not yet serviced by the algorithm after it has made $j$ transmissions. Note that $\cI^{i}_{j}$ includes also intervals that arrive after the $j$-th transmission, i.e., they were not active yet when this transmission was made. Using this notation we can state the following lemma, which is a counterpart of Lemma 3.2, and also follows from it.

Lemma 4.1

$OPT(\cI^{i}_{j})\leq OPT(\cI^{i}_{j-1})-c(r(\cT^{i}))$ for every $1\leq i\leq m$ and $1\leq j\leq\ell^{i}$.

To prove Proposition 4.1 we also need the following counterpart of Lemma 3.1.

Lemma 4.2

If the instance of $ALG$ corresponding to $(\cT^{i},\cI^{i})$ transmits a virtual tree $T$ , then $c\left(\bigcup_{u\in T}B_{u}\right)\leq\sum_{u\in T}c(B_{u})\leq 6(D(\cT)+1)\cdot c(r(\cT^{i}))$ .

One can verify that Lemmata 4.1 and 4.2 imply Proposition 4.1 in the same way that Lemmata 3.1 and 3.2 imply Theorem 3.1. Thus, it remains to prove Lemma 4.2. Recall that $ALG$ generates each one of the virtual trees it transmits by executing Algorithm 1, and consider the execution of Algorithm 1 which generated the virtual tree $T$ .
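The implication from Lemmata 4.1 and 4.2 to Proposition 4.1 can be spelled out as follows. This only expands the argument referred to above, using $OPT(\cI^{i}_{0})=OPT(\cT^{i},\cI^{i})$ (no interval is serviced before the first transmission) and $OPT(\cI^{i}_{\ell^{i}})\geq 0$. Applying Lemma 4.1 for $j=1,\dotsc,\ell^{i}$ and telescoping gives

$$0\leq OPT(\cI^{i}_{\ell^{i}})\leq OPT(\cI^{i}_{0})-\ell^{i}\cdot c(r(\cT^{i}))\quad\Longrightarrow\quad\ell^{i}\cdot c(r(\cT^{i}))\leq OPT(\cT^{i},\cI^{i}).$$

By Lemma 4.2, each of the $\ell^{i}$ concrete trees costs at most $6(D(\cT)+1)\cdot c(r(\cT^{i}))$, and thus, their total cost is at most

$$\ell^{i}\cdot 6(D(\cT)+1)\cdot c(r(\cT^{i}))\leq 6(D(\cT)+1)\cdot OPT(\cT^{i},\cI^{i})=O(D(\cT))\cdot OPT(\cT^{i},\cI^{i}).$$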

The sets $\{B_{u}\}_{u\in T}$ might not be disjoint. However, for the purposes of the proof it is useful to assume they are disjoint. In other words, if a node $v$ appears in several sets out of $\{B_{u}\}_{u\in T}$ , we treat each one of its appearances as unique. Additionally, let us use the shorthand $T^{\prime}=\bigcup_{u\in T}B_{u}$ .

By Lemma 3.4, $c(u)\leq\hat{c}(u)$ for every node $u\in T$ . Recall that $\hat{c}(u)$ is defined by Algorithm 1 only for nodes $u\in T$ . We now extend the definition of $\hat{c}$ to all the nodes of $T^{\prime}$ by setting $\hat{c}(v)=\hat{c}(u)$ for every node $v\in B_{u}$ . Since, by definition, the nodes of $B_{u}$ have costs of at most $3c(u)$ , we get the following corollary.

Corollary 4.1

For every node $u\in T^{\prime}$ , $c(u)\leq 3\hat{c}(u)$ .

Next, define levels for the nodes of $T^{\prime}$ using the following recursive procedure. The level of the root $r(\cT^{i})$ is $0$. Consider now a node $u\in T$ which already has a level $\ell_{u}$. Then, for every node $v\in\cA_{u}$, the nodes of $B_{v}$ are assigned the levels $\ell_{u}+1,\ell_{u}+2,\dotsc,\ell_{u}+|B_{v}|$, where the last level is assigned to $v$ itself. The definition of $\cA_{u}$ guarantees that all the nodes of $B_{v}$ appear on the path from $u$ to $v$ in $\cT$, and thus, the difference between the level of $u$ and $v$ is at most the difference between their heights in $\cT$. Thus, no node of $T^{\prime}$ is given a level larger than $D(\cT)$ by the above procedure (see Footnote 2).

Footnote 2. There is an alternative way to define this level assignment which might help to understand the intuition behind it. Consider a tree $T_{\cA}$ defined as follows. The root of $T_{\cA}$ is $r(\cT^{i})$, and the children of every node $u\in T_{\cA}$ are the nodes of $\cA_{u}$. One can verify that $T_{\cA}$ has exactly the same set of nodes as $T$. Moreover, the height of every node in $T_{\cA}$ corresponds to its level according to the level assignment used in Section 3. If we now replace every node $u\in T_{\cA}$ with a path consisting of the nodes of $B_{u}$ in which $u$ is the lowest node, then we get a new tree $T_{\cB}$. This time, this tree has the same set of nodes as $T^{\prime}$, and the height of every node in $T_{\cB}$ corresponds to its level according to the level assignment used in this section.
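The recursive level assignment can be sketched as follows. The sketch assumes that the root receives level $0$ (consistent with the range $0\leq\ell\leq D(\cT)$ used in Lemma 4.3), that the sets $\cA_{u}$ and $B_{v}$ are given, and that each $B_{v}$ is listed from $v$ upwards. Since a node may appear in several sets $B_{v}$, each appearance is keyed by the pair (owner $v$, node $w$). The function name `assign_levels` and the example are hypothetical.

```python
def assign_levels(root, A, B):
    """Assign levels to the appearances of nodes in the sets B_v.

    A[u] lists the nodes of A_u (missing keys mean an empty set);
    B[v] lists the nodes of B_v starting from v and going upwards.
    Returns a dict mapping (v, w), with w in B_v, to the level of w.
    Only the root itself is given level 0 in this sketch.
    """
    levels = {(root, root): 0}
    stack = [root]
    while stack:
        u = stack.pop()
        base = levels[(u, u)]
        for v in A.get(u, []):
            # The topmost node of B_v gets level base + 1, and v itself
            # gets the last level base + |B_v|.
            for offset, w in enumerate(reversed(B[v]), start=1):
                levels[(v, w)] = base + offset
            stack.append(v)
    return levels

# Hypothetical example: A_r = {b}, A_b = {d}, B_b = {b, a}, B_d = {d}.
A = {"r": ["b"], "b": ["d"]}
B = {"r": ["r"], "b": ["b", "a"], "d": ["d"]}
levels = assign_levels("r", A, B)
```

In this example `a` receives level $1$, `b` level $2$ and `d` level $3$, which matches their depths on the path $r, a, b, d$ of the original tree.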

Recall that for every node $u\in T$ such that $\cA_{u}\neq\varnothing$ , by construction,

$$\sum_{v\in\cA_{u}}\hat{c}(v)=\hat{c}(u).$$

Let us now define for a node $u\in T$ an additional set $\cB_{u}=\{w\in B_{v}\mid v\in\cA_{u},\ell_{w}=\ell_{u}+1\}$ . In other words, $\cB_{u}$ is obtained from $\cA_{u}$ by replacing every node $v\in\cA_{u}$ with the single node $w\in B_{v}$ whose level is larger by $1$ than the level of $u$ . Since $\hat{c}(w)$ is identical for every node of $B_{v}$ we immediately get also $\sum_{v\in\cB_{u}}\hat{c}(v)=\hat{c}(u)$ whenever $\cB_{u}\neq\varnothing$ . For a node $u\in T^{\prime}\setminus T$ we define $\cB_{u}=\{w\in B_{v}\mid\ell_{w}=\ell_{u}+1\wedge\exists_{v\in T}u,w\in B_{v}\}$ . Informally, $\cB_{u}$ contains the single node that belongs to the same set $B_{v}$ as $u$ and has a level larger by $1$ than $u$ . Again, the fact that $\hat{c}(w)$ is identical for every node of a single set $B_{v}$ guarantees $\sum_{v\in\cB_{u}}\hat{c}(v)=\hat{c}(u)$ .

Lemma 4.3

In each level $0\leq\ell\leq D(\cT)$ , the sum of $\hat{c}(u)$ , taken over all nodes of level $\ell$ , is at most $\hat{c}(r(\cT^{i}))=2c(r(\cT^{i})).$

Proof.

We have seen that $\sum_{v\in\cB_{u}}\hat{c}(v)=\hat{c}(u)$ for every node $u\in T^{\prime}$ unless $\cB_{u}=\varnothing$. Additionally, the definition of the sets $\cB_{u}$ guarantees that each node $v\in T^{\prime}$ belongs to a single set $\cB_{u}$, and this set is associated with a node $u$ of a lower level than $v$. All these observations, taken together, imply the lemma by a standard induction argument.

We now observe the following.

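$$\sum_{u\in T}c(B_{u})=\sum_{u\in T^{\prime}}c(u)\leq 3\cdot\sum_{u\in T^{\prime}}\hat{c}(u)\leq 6(D(\cT)+1)\cdot c(r(\cT^{i})).$$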

The first inequality follows since Corollary 4.1 guarantees that $c(u)\leq 3\hat{c}(u)$ for every node $u\in T^{\prime}$ . The second inequality follows since there are at most $D(\cT)+1$ possible levels, and Lemma 4.3 shows that the sum of $\hat{c}(u)$ in each level is at most $2c(r(\cT^{i}))$ .

5 Conclusions

In this paper we have presented an $O(D)$-competitive algorithm for the Online Multi-level Aggregation Problem with Deadlines. This result represents an exponential improvement over the previously best competitive ratio of $D^{2}2^{D}$ given by [7]. Nevertheless, the competitive ratio of our algorithm is still quite far from the constant lower bounds proved by [9] and [7]. Narrowing this gap, either by providing an improved algorithm or by proving stronger lower bounds, is an intriguing open problem.

Bibliography


[1] Esther Arkin, Dev Joneja, and Robin Roundy. Computational complexity of uncapacitated multi-echelon production planning problems. Operations research letters, 8:63–73, 1989.
[2] B. R. Badrinath and Pradeep Sudame. Gathercast: the design and implementation of a programmable aggregation mechanism for the internet. In Proceedings Ninth International Conference on Computer Communications and Networks, ICCCN 2000, 16-18 October 2000, Las Vegas, Nevada, USA, pages 206–213, 2000.
[3] Nikhil Bansal, Niv Buchbinder, Aleksander Madry, and Joseph Naor. A polylogarithmic-competitive algorithm for the k-server problem. J. ACM, 62(5):40, 2015.
[4] Yair Bartal. Probabilistic approximations of metric spaces and its algorithmic applications. In FOCS’96: Proceedings of the 37th Annual IEEE Symposium on Foundations of Computer Science, pages 184–193, 1996.
[5] Yair Bartal. On approximating arbitrary metrices by tree metrics. In STOC’98: Proceedings of the 30th Annual ACM Symposium on Theory of Computing, pages 161–168, 1998.
[6] Luca Becchetti, Alberto Marchetti-Spaccamela, Andrea Vitaletti, Peter Korteweg, Martin Skutella, and Leen Stougie. Latency-constrained aggregation in sensor networks. ACM Trans. Algorithms, 6(1), 2009.
[7] Marcin Bienkowski, Martin Böhm, Jaroslaw Byrka, Marek Chrobak, Christoph Dürr, Lukáš Folwarczný, Lukasz Jez, Jirí Sgall, Nguyen Kim Thang, and Pavel Veselý. Online algorithms for multi-level aggregation. CoRR, abs/1507.02378, 2015. To appear in ESA 2016.
[8] Marcin Bienkowski, Jaroslaw Byrka, Marek Chrobak, Neil B. Dobbs, Tomasz Nowicki, Maxim Sviridenko, Grzegorz Swirszcz, and Neal E. Young. Approximation algorithms for the joint replenishment problem with deadlines. J. Scheduling, 18(6):545–560, 2015.
