MON: Mission-optimized Overlay Networks
Bruce Spang, Anirudh Sabnis, Ramesh Sitaraman, Don Towsley, Brian DeCleene

TL;DR
This paper introduces MON, a hybrid overlay network architecture that combines offline planning and online adaptation to maximize organizational utility across distributed sites.
Contribution
The paper presents the first overlay network designed specifically for mission utility optimization, integrating predictive and reactive components.
Findings
MON achieves near-optimal utility in diverse network conditions.
Combining offline and online systems improves robustness and responsiveness.
First overlay network optimized for mission utility.
B. Spang, A. Sabnis, R. Sitaraman, D. Towsley
College of Information & Computer Sciences
U. Massachusetts - Amherst
Amherst, MA 01003
{bspang,asabnis,ramesh,towsley}@cs.umass.edu
B. DeCleene
BAE Systems & Technology Solutions
Burlington, MA 01803
Abstract
Large organizations often have users in multiple sites which are connected over the Internet. Since resources are limited, communication between these sites needs to be carefully orchestrated for the most benefit to the organization. We present a Mission-optimized Overlay Network (MON), a hybrid overlay network architecture for maximizing utility to the organization. We combine an offline and an online system to solve non-concave utility maximization problems. The offline tier, the Predictive Flow Optimizer (PFO), creates plans for routing traffic using a model of network conditions. The online tier, MONtra, is aware of the precise local network conditions and is able to react quickly to problems within the network. Either tier alone is insufficient. The PFO may take too long to react to network changes. MONtra only has local information and cannot optimize non-concave mission utilities. However, by combining the two systems, MON is robust and achieves near-optimal utility under a wide range of network conditions. While best-effort overlay networks are well studied, our work is the first to design overlays that are optimized for mission utility.
I Introduction
Large organizations have users in multiple sites that are connected over the Internet. A business may have multiple offices around the world which need to communicate with each other. A defense organization may have personnel deployed at multiple sites, who need to communicate and fulfill specific mission goals. A retailer may have multiple shops, warehouse locations, and offices. One traditional approach to facilitating communication between distributed sites of an organization is to deploy a private enterprise network with dedicated infrastructure to fulfill the organization’s communication requirements. However, an alternate approach is to build an overlay on top of the public Internet, avoiding the need for dedicated infrastructure.
Overlays have been studied and built for the past 25 years [9, 1, 26, 24, 29]. Large CDNs such as Akamai [21] have built overlays for delivering web and video content since the late 1990s [25]. These overlays are “best-effort”, in that they aim to provide higher reliability and performance than the native Internet can offer for all traffic using the overlay. Such best-effort overlays include caching overlays for Web content [7], routing overlays for reliably transporting live video streams [3, 17], P2P overlays for downloads [26, 24, 29], and security overlays for preventing DDoS attacks [25]. However, best-effort overlays do not explicitly optimize the “mission goals” of the organization that operates the overlay.
In this paper, in contrast to best-effort overlays studied in prior work, we propose and study overlays that are driven by explicitly stated mission goals. For instance, consider a multi-site defense organization. The goals for the overlay are set by an operator who dictates the relative utility of various types of communication that occur between the different sites. Note that the mission goals may vary with time, e.g., an urgent all-hands video conference watched by users in all the sites may take higher precedence than downloads, VOIP and other traffic classes. A Mission-optimized Overlay Network (MON) dynamically allocates available overlay resources to the traffic between sites to maximize overall mission utility, enabling the goals of the organization to be met.
I-A MON Functionality
MON takes as input the (time-varying) mission goals set by the operator and routes traffic on the overlay network to meet these goals (see Figure 1). Each site is connected to the public Internet through a transport controller which performs overlay routing.
We group the traffic between sites into a set of classes, where each traffic class represents a set of end-user sessions of a specific type (such as video, downloads, or VOIP) between a specific source and destination site. Each traffic class may use a set of overlay routes. The mission goals are captured by mission utility functions specified by the overlay operator. MON determines a set of sessions from each class and a rate for each session so as to maximize the cumulative mission utility.
MON is designed to continually adapt to change. The mission utility functions can change as mission goals change. The number of sessions in each class that need to be routed can change with user demand. The underlying Internet could suffer from failures that require traffic to be rerouted. To deal with these changes, MON continually adapts the number of sessions and rates for each class to optimize the mission utility.
I-B Our Contributions
We propose a novel two-tiered overlay architecture for MON that combines offline and online tiers. The offline tier is called the Predictive Flow Optimizer (PFO), and it periodically performs a global optimization of the cumulative mission utility. The output of PFO is “mapped” to a lower-level online network transport mechanism called MONtra which performs the actual routing of traffic in the network using proportionally-fair utility functions (see Figure 1). We prove that MON’s two-tiered architecture converges to an optimal cumulative mission utility. An interesting aspect of our work is a mapping process that allows us to implement arbitrary non-decreasing mission utility functions using logarithmic transport utilities which are well-studied and have desirable properties such as proportional fairness.
To establish the real-world feasibility of MON, we implement a prototype within the Deterlab [19] testbed. We show that the PFO implementation using a bilinear global optimizer, in combination with MONtra implemented on the Deterlab nodes, is able to send traffic at rates that converge to a solution that achieves optimal mission utility. We also show that the system is robust to changes in the network, such as those caused by network partitions or congestion. Further, we show that the system is robust to changes in the number of sessions, such as those caused by flash crowd events. We also empirically evaluate MON when the network and traffic demands are not precisely known. In this case, we show that MON degrades gracefully and still provides a near-optimal mission utility.
I-C Roadmap
We give an overview of the MON architecture and a detailed description of each component in Section II. We describe our prototype and experimental setup in detail in Section III. We implement MON and present our empirical results in Section IV. We compare MON to prior work in Section V and then conclude in Section VI.
II The MON Architecture
MON dynamically allocates available overlay resources to traffic between sites to maximize overall mission utility. In order to do this, we need a precise definition of the mission utility maximization problem. We group the traffic between sites into a set of classes, where each traffic class represents a set of sessions of a specific type (e.g., video or VOIP) between a specific source and destination site. Each class uses a set of overlay routes. Mission goals are captured by mission utility functions specified by the overlay operator. Specifically, associated with each class is a function that gives the value of one session of that class as a function of the rate it receives. MON chooses a number of sessions for each class and routes each chosen session at a per-class rate so as to maximize the cumulative mission utility, i.e., the sum over classes of the number of chosen sessions times the utility of one session at its rate.
MON has a two-tiered architecture which combines a non-real-time global optimizer (PFO) with a distributed real-time transport protocol (MONtra) to optimize mission utility. It is a novel application of the divide-and-conquer principle in network design. We use predicted global knowledge to periodically “push” the overlay network into an optimized state. We maintain the network in a near-optimal state, even in the presence of sudden network changes (such as partitions or congestion), using a mission-aware distributed transport protocol. In this section, we describe the following three major aspects of the architecture (see Figure 1):
- The Predictive Flow Optimizer (Section II-A) solves an optimization problem to come up with a plan for routing traffic in the MON. It solves a non-concave bilinear optimization problem periodically using the predicted network state and projected future traffic conditions.
- MONtra (Section II-B) solves an online optimization problem to react to changes in the network. Using ideas from network utility maximization [12], it adjusts the sending rates of each site to solve a convex optimization problem.
- A mapping between PFO and MONtra (Section II-C) ensures that when PFO has full knowledge of the network, MONtra will converge to PFO’s target rates. Our main result is that this convergence happens if MONtra has the same gradient as PFO at the target rates.
II-A Predictive Flow Optimizer
The Predictive Flow Optimizer (PFO) outputs a routing plan for the network that maximizes mission utility. It runs periodically using a prediction of future network conditions and traffic demands.
PFO performs “call admission” by choosing a number of sessions to admit in each class. It can decide to admit no sessions at all for a given class by setting that number to zero. In addition, PFO chooses a per-flow rate to provide to each admitted session of a class along a route in the network. The output of PFO is then used to set the parameters of the MONtra controllers, a process we call “mapping”. Thus, PFO solves a hard global optimization problem, albeit in a non-realtime fashion using predicted traffic and network states.
The optimization problem: PFO runs periodically and solves the following optimization problem to route a predicted set of sessions on the overlay. PFO takes as input a set of traffic classes. Each traffic class has a set of sessions that need to be routed from a specific source site to a specific destination site. For instance, a traffic class could be all the VOIP phone calls made from a given site to another given site.
PFO has the option to send traffic along different paths in the network. For example, it might send traffic directly from one site to another, or send it indirectly via a number of enclaves. We say that a flow corresponds to the unique pair of a traffic class and a route through the network. For each class, PFO has a set of possible flows. Each flow starts at the source and ends at the destination associated with the class, using zero or more sites as intermediate nodes. The set of flows for all classes is the union of these per-class sets.
PFO uses a model of the underlying network to pick a feasible set of rates for the flows in each class. The model consists of the set of underlay network links and the capacity of each link. For convenience, we refer both to the links used by a given flow and to the flows that use a given link. PFO works from an estimate of the set of links and an estimate of the link capacities.
PFO is said to have full knowledge of the network if its estimated link set and estimated capacities equal the true ones. At a minimum, PFO knows the uplinks for each site and their capacities, i.e., the estimated link set is the set of uplinks from the MONtra nodes to the public Internet (see Figure 1). Between these two extremes, PFO may incorporate partial knowledge of the links and capacities using tools from network tomography (e.g. [5]).
In addition to the above, the overlay operator provides PFO with a mission utility function for each traffic class, representing the value of giving one session of that class a given rate. We assume that the mission utility functions are real-valued functions of the rate, are subdifferentiable everywhere, and are non-decreasing. We do not assume that they are concave, in order to incorporate mission utilities for inelastic traffic [10].
For each traffic class, PFO outputs a number of allowed sessions. For each class and each of its possible flows, PFO outputs a target rate, which corresponds to the amount of traffic MON sends for a single session along that flow. To do so, PFO solves the following non-concave optimization problem:
[TABLE]
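As a hedged sketch, the problem described in the preceding paragraphs can be written as follows. All symbols here are assumed for illustration and may differ from the paper's own notation: $\mathcal{C}$ is the set of classes, $N_c$ the predicted number of sessions of class $c$, $n_c$ the number admitted, $F_c$ the flows of class $c$, $F$ the set of all flows, $c(f)$ the class of flow $f$, $x_f$ the per-session rate on flow $f$, $U_c$ the mission utility of class $c$, $\hat{L}$ the estimated link set, $\hat{c}_l$ the estimated capacity of link $l$, and $F(l)$ the flows that use link $l$.

```latex
\begin{align*}
\max_{n,\,x}\quad & \sum_{c \in \mathcal{C}} n_c \, U_c\Big(\sum_{f \in F_c} x_f\Big) \\
\text{s.t.}\quad  & \sum_{f \in F(l)} n_{c(f)} \, x_f \le \hat{c}_l, && \forall\, l \in \hat{L}, \\
                  & n_c \in \{0, 1, \dots, N_c\},                   && \forall\, c \in \mathcal{C}, \\
                  & x_f \ge 0,                                      && \forall\, f \in F.
\end{align*}
```

Under this reading, the products $n_{c(f)}\,x_f$ in the capacity constraints are what make the program bilinear, and the integrality of $n_c$ together with non-concave $U_c$ is what makes it hard.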
Solving the optimization problem: The above optimization problem is NP-hard, since the number of sessions must be an integer and the mission utility functions are not concave. However, for certain mission utility functions there are optimization techniques which make solving this problem more feasible. For instance, if the mission utility functions are piecewise linear, the problem becomes a bilinear program that can be solved efficiently using the ANTIGONE solver [20]. Other approaches for solving non-concave network utility maximization problems offline are described in the literature (e.g. [10]).
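To make the call-admission effect concrete, here is a minimal brute-force sketch on a toy instance: two classes sharing one 10 Mbps link. It is not the bilinear formulation or the ANTIGONE-based solver; the utility shapes, session counts, rate grid, and function names are all hypothetical and chosen only so the answer is easy to check by hand.

```python
from itertools import product


def inelastic_utility(rate_mbps):
    """Hypothetical non-concave, piecewise-linear utility: worthless below
    1 Mbps, rising linearly to 10 at 2 Mbps, flat afterwards."""
    if rate_mbps < 1.0:
        return 0.0
    return min(10.0, 10.0 * (rate_mbps - 1.0))


def elastic_utility(rate_mbps):
    """Hypothetical linear utility: one unit of value per Mbps."""
    return rate_mbps


def plan_single_link(classes, capacity, step=0.5):
    """Exhaustively search session counts and per-session rates on one link.

    classes: list of (max_sessions, utility_fn) pairs.
    Returns (best cumulative utility, [(sessions, rate) per class]).
    """
    grid = [i * step for i in range(int(capacity / step) + 1)]
    session_ranges = [range(max_n + 1) for max_n, _ in classes]
    best_value, best_plan = float("-inf"), None
    for sessions in product(*session_ranges):
        for rates in product(grid, repeat=len(classes)):
            # Capacity constraint: admitted sessions times per-session rate.
            load = sum(n * x for n, x in zip(sessions, rates))
            if load > capacity:
                continue
            value = sum(n * u(x) for n, x, (_, u) in zip(sessions, rates, classes))
            if value > best_value:
                best_value, best_plan = value, list(zip(sessions, rates))
    return best_value, best_plan


best_value, best_plan = plan_single_link(
    [(8, inelastic_utility), (3, elastic_utility)], capacity=10.0
)
print(best_value, best_plan)
```

In this toy instance, serving all eight inelastic sessions would leave each below its useful rate, so the search admits only five of them at 2 Mbps each. Exhaustive search does not scale beyond tiny instances, which is why a dedicated global solver is needed in practice.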
II-B MON Transport Control (MONtra)
MONtra works at the transport layer of MON, and is responsible for reacting rapidly to changes in the underlying network. MONtra consists of weighted proportionally-fair congestion controllers, which route session traffic to match the rate chosen by PFO.
For each overlay route, MONtra’s controllers optimize a transport-layer utility function (which should not be confused with the mission utility function). In Section II-C, we will describe how to choose weights for these controllers so that they provably converge to PFO’s target rates. To make the mapping easier to understand, we will model MONtra as using one controller per session on a flow. It is possible to extend the model to combine all the sessions on a flow into one controller. MONtra solves the following optimization problem over the set of links in the network and their capacities:
[TABLE]
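As a hedged sketch, a weighted proportionally-fair transport problem with one controller per session can be written as below. The notation is assumed for illustration and may differ from the paper's: $F$ is the set of flows, $c(f)$ the class of flow $f$, $n_{c(f)}$ the number of sessions fixed for that class, $w_f > 0$ the controller weight, $x_f$ the per-session rate, $L$ the set of links, $c_l$ the capacity of link $l$, and $F(l)$ the flows that use link $l$.

```latex
\begin{align*}
\max_{x \ge 0}\quad & \sum_{f \in F} n_{c(f)} \, w_f \log x_f \\
\text{s.t.}\quad    & \sum_{f \in F(l)} n_{c(f)} \, x_f \le c_l, && \forall\, l \in L.
\end{align*}
```

Because each flow has its own logarithmic term, the objective is strictly concave in every $x_f$, so a problem of this form has a single optimal rate vector.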
We solve this optimization problem using techniques from Network Utility Maximization [13]. We initialize the controllers to the rate selected by PFO, then adapt the rate based on network feedback. After each success/loss signal, we adjust rates according to the following update rules, which include a constant chosen for stability:
[TABLE]
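The following sketch illustrates how distributed price feedback can drive uncoupled, weighted log-utility controllers to a proportionally-fair allocation. It uses a textbook dual-gradient iteration on link prices, not MONtra's per-signal update rules, and the function name, step size, and example values are assumptions made for illustration.

```python
def solve_weighted_prop_fair(weights, routes, capacities, step=0.01, iters=5000):
    """Dual gradient iteration for: max sum_f w_f * log(x_f)
    subject to the load on every link staying within its capacity.

    weights:    {flow: w_f > 0}
    routes:     {flow: [links used by the flow]}
    capacities: {link: capacity}
    Assumes one session per flow and that every flow crosses a link.
    """
    prices = {link: 1.0 for link in capacities}
    rates = {}
    for _ in range(iters):
        # Each flow reacts only to the total price along its own route.
        for flow, links in routes.items():
            path_price = sum(prices[link] for link in links)
            rates[flow] = weights[flow] / path_price
        # Each link raises its price when overloaded and lowers it otherwise,
        # never dropping below a small positive floor.
        for link, cap in capacities.items():
            load = sum(rates[f] for f, links in routes.items() if link in links)
            prices[link] = max(1e-6, prices[link] + step * (load - cap))
    return rates


# Two flows share one 15 Mbps uplink with weights 2 and 1.
rates = solve_weighted_prop_fair(
    weights={"a": 2.0, "b": 1.0},
    routes={"a": ["uplink"], "b": ["uplink"]},
    capacities={"uplink": 15.0},
)
print(rates)
```

On a single shared link the rates settle in proportion to the weights, which is the property the mapping in Section II-C relies on. The fixed step size here is tuned for this small example; larger networks would need a more careful choice.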
Note that unlike existing multipath TCP research (e.g. [13, 28, 15, 11, 22]), MONtra uses uncoupled controllers. We would like our controllers to exactly match the target rates chosen by PFO, which implies the transport optimization problem should have a unique optimum. Unfortunately, the coupled controllers in the multipath literature allow multiple optima.
Instead of using a window-based controller, we use a rate-based controller which sends packets at the controller's current rate. To do this, we generate delays between packets so that the packet sending process is Poisson with that rate.
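A minimal sketch of this pacing idea: a Poisson process with a given rate has independent, exponentially distributed gaps between events, so the sender can draw each inter-packet delay from an exponential distribution. The function name and the example rate are assumptions, not taken from the MONtra implementation.

```python
import random


def poisson_gaps(rate_pps, count, rng=None):
    """Return `count` inter-packet gaps in seconds for a Poisson sending
    process with an average of `rate_pps` packets per second."""
    rng = rng or random.Random()
    # Exponential inter-arrival times with mean 1 / rate_pps.
    return [rng.expovariate(rate_pps) for _ in range(count)]


gaps = poisson_gaps(rate_pps=500.0, count=20000, rng=random.Random(7))
mean_gap = sum(gaps) / len(gaps)
print(f"mean gap: {mean_gap * 1000:.3f} ms (target {1000 / 500.0:.3f} ms)")
```

In a real sender, each gap would be the time to wait before transmitting the next packet, and a rate update from the controller would simply change the parameter used for the next draw.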
II-C Mapping
The mapping layer is responsible for ensuring that MONtra converges to the set of rates chosen by PFO. Intuitively, we would like the transport utility function to act like the mission utility functions in the vicinity of the target rates selected by PFO. If we could ensure that MONtra sends at the same rate as PFO and has the same derivative at PFO’s target rates, MONtra might behave in the same way as PFO even if there were slight changes to the network. The following theorem uses similar intuition and allows us to prove that MONtra converges to PFO’s target rates.
Theorem 1.
Suppose PFO has full knowledge of the network and selects a set of rates and a number of sessions for each class that maximize the cumulative mission utility. For each link, consider the dual variable associated with the capacity constraint for that link. Fix the number of sessions for each class in the transport layer to the number chosen by PFO. Using the following transport utility functions for each flow of each class, MONtra’s rates will converge to PFO’s target rates:
[TABLE]
Proof:
See Appendix A ∎
Note that if the only active constraint in the PFO solution is constraint (2), this mapping confirms the earlier intuition that we should match the PFO gradient at the target operating point. Since only the rate-related constraints are active for PFO, the PFO KKT conditions equate the marginal mission utility of a flow's class with the sum of the dual variables on the links the flow uses. Therefore, our mapping simplifies to a weight equal to the target rate times this marginal mission utility. At the target operating point, the partial derivative of the MONtra utility function with respect to a flow's rate is then the marginal mission utility itself. So in addition to matching the rate, in this case we would also expect the MONtra utility functions to approximate the PFO utility functions close to the operating point.
This mapping theorem also works for any implementation of MONtra and for other formulations of the PFO optimization problem. For instance, MONtra could use other classes of concave transport utility functions, or another method of distributed optimization such as backpressure routing. The PFO optimization problem could use another way of combining per-flow rates, e.g. by summing the weighted rates across all paths instead of summing the rates across all paths.
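The idea behind the mapping can be sketched numerically. For a log utility with weight $w$, the derivative at rate $x$ is $w/x$, so choosing the weight as the target rate times the sum of the link dual variables along the flow makes the target rate satisfy the transport problem's optimality conditions. The code below assumes that form of the weights (an assumption consistent with the theorem's description, not a quotation of it), and all names and numbers are hypothetical.

```python
def transport_weights(target_rates, routes, link_duals):
    """Assumed mapping: w_f = (target rate of f) * (sum of the planner's
    dual variables on the links that f uses)."""
    return {
        flow: rate * sum(link_duals[link] for link in routes[flow])
        for flow, rate in target_rates.items()
    }


def single_link_prop_fair(weights, capacity):
    """Closed-form weighted proportionally-fair split of one link:
    x_f = w_f * capacity / sum of all weights."""
    total = sum(weights.values())
    return {flow: w * capacity / total for flow, w in weights.items()}


# Hypothetical plan: two flows fill a 15 Mbps uplink whose dual variable is 0.2.
target_rates = {"video": 10.0, "voip": 5.0}
routes = {"video": ["uplink"], "voip": ["uplink"]}
link_duals = {"uplink": 0.2}

weights = transport_weights(target_rates, routes, link_duals)
recovered = single_link_prop_fair(weights, capacity=15.0)
print(weights, recovered)
```

With these weights the proportionally-fair allocation reproduces the planned rates, and the derivative of each log utility at its target rate equals the link's dual variable.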
III Evaluation Methodology
To show how MON performs in a realistic setting, we implemented MON and ran it on a set of network topologies, traffic scenarios, and mission utility functions as outlined below.
Network Testbed. We ran the experiments on Deterlab [19], which allowed us to allocate physical Linux machines for each site and router in the network. We used Linux’s traffic control system to set the network bandwidth. We used token bucket filters, with a burst size and maximum queue latency chosen to provide stable throughput when we transfer files between the hosts.
Network Topologies. We emulated the small triangle topology shown in the triangle panel of the topology figure to illustrate MON behavior in an easier to understand context. We also emulated several large topologies from [16] (AT&T USA, Bell Canada, BTN, and Abilene). We present the results for the AT&T topology, shown in the AT&T panel of the topology figure, in most of our experiments.
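For readers unfamiliar with Linux traffic control, the sketch below builds a `tc` command that attaches a token bucket filter to an interface. The device name and the burst and latency values are placeholders chosen for illustration; they are not the values used in these experiments.

```python
def tbf_command(device, rate_mbit, burst_kb, latency_ms):
    """Build (but do not run) a `tc` command that shapes `device` with a
    token bucket filter. All argument values are caller-supplied."""
    return [
        "tc", "qdisc", "replace", "dev", device, "root", "tbf",
        "rate", f"{rate_mbit}mbit",      # sustained bandwidth
        "burst", f"{burst_kb}kb",        # bucket size
        "latency", f"{latency_ms}ms",    # maximum time a packet may queue
    ]


# Placeholder values: shape a hypothetical interface eth1 to 10 Mbps.
command = tbf_command("eth1", rate_mbit=10, burst_kb=32, latency_ms=50)
print(" ".join(command))
```

Running the resulting command requires root privileges on the emulated router or host, for example through `subprocess.run(command, check=True)`.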
Mission Utility Functions. We use the following two mission utility functions to illustrate the behavior of MON, though our system works for arbitrary mission utility functions:
[TABLE]
The first type of mission utility function is non-concave: its utility increases monotonically with rate but with diminishing returns. The second type increases linearly with rate without a point of diminishing returns. These functions are shown in Figure 4.
PFO Implementation. Since our approach reduces the PFO optimization to a bilinear program, we used the ANTIGONE solver [20] to efficiently solve the optimization problem. The solver uses branch-and-bound techniques and convex relaxations such as McCormick’s envelopes which allow bilinear optimization problems to be solved efficiently.
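For reference, the standard McCormick envelope for a single bilinear term is sketched below. The symbols are assumed for illustration: $z$ stands in for the product $n\,x$, with bounds $n^{L} \le n \le n^{U}$ and $x^{L} \le x \le x^{U}$. How the solver applies these relaxations internally is not described here.

```latex
\begin{align*}
z &\ge n^{L} x + n\, x^{L} - n^{L} x^{L}, &
z &\ge n^{U} x + n\, x^{U} - n^{U} x^{U}, \\
z &\le n^{U} x + n\, x^{L} - n^{U} x^{L}, &
z &\le n^{L} x + n\, x^{U} - n^{L} x^{U}.
\end{align*}
```

Each inequality follows from expanding a product of two non-negative bound gaps, for example $(n - n^{L})(x - x^{L}) \ge 0$. Replacing $n\,x$ by $z$ subject to these four linear constraints gives a convex relaxation, which branch-and-bound tightens by shrinking the bounds.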
MONtra Implementation. The MONtra implementation is based on multipath network utility maximization theory from [13]. We used the controllers described in Section II-B with per-packet acks to detect congestion. We gave each host an infinite backlog of data to send. We always used the number of sessions chosen by PFO. We set the stability constant to a fixed value and further improved stability by dividing by the largest weight.
IV Evaluation Results
IV-A Does the overlay optimize mission utility?
Our first experiment shows that MON optimizes mission utility for simple scenarios. We give PFO full knowledge of the network topology and capacities, and show that MONtra converges to PFO’s target rates. We show this for both a simple topology and a more realistic one.
For the simple topology, we set up three nodes on Deterlab using the triangle topology. We set the capacity between Node B and Node C to 5 Mbps and the capacity of all other links to 10 Mbps. We have two traffic classes, one between Node A and Node C and one between Node B and Node C. PFO assigns a rate of 10 Mbps between Node A and Node C and a rate of 5 Mbps from Node A to Node C via Node B. The triangle rates table shows the rates MONtra converged to for the two flows. In this simple example, there are no shared links between the flows. Note that the actual rates achieved by MONtra are close to the target rates set by PFO.
We also ran this experiment on the more realistic AT&T network topology. We set the capacity of all links to 10 Mbps, and ran PFO using the actual topology and capacities. Unlike in the triangle experiment, PFO's plan shares links between flows. The corresponding table shows the rates MONtra converged to, which are close to the rates computed by PFO.
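The triangle plan can be checked with a few lines of bookkeeping. The sketch below encodes the link capacities and the two routed flows described above, computes the load on each link, and confirms that the plan is feasible. The data layout and helper names are assumptions, and links are treated as undirected with one session per flow.

```python
# Link capacities in Mbps for the triangle topology.
capacities = {("A", "B"): 10.0, ("B", "C"): 5.0, ("A", "C"): 10.0}

# The two flows in the plan: route (as a list of links) and target rate in Mbps.
flows = {
    "A->C direct": {"links": [("A", "C")], "rate": 10.0},
    "A->B->C": {"links": [("A", "B"), ("B", "C")], "rate": 5.0},
}


def link_loads(flows, capacities):
    """Sum the target rates of all flows crossing each link."""
    loads = {link: 0.0 for link in capacities}
    for flow in flows.values():
        for link in flow["links"]:
            loads[link] += flow["rate"]
    return loads


def is_feasible(flows, capacities):
    """A plan is feasible when no link carries more than its capacity."""
    loads = link_loads(flows, capacities)
    return all(loads[link] <= capacities[link] for link in capacities)


loads = link_loads(flows, capacities)
feasible = is_feasible(flows, capacities)
print(loads, feasible)
```

Each link carries exactly one flow, and the indirect route is limited to 5 Mbps by the B to C link, its bottleneck.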
- [1] D. Andersen, H. Balakrishnan, F. Kaashoek, and R. Morris. Resilient overlay networks. SIGOPS Oper. Syst. Rev., 35(5):131–145, Oct. 2001.
- [2] D. G. Andersen, A. C. Snoeren, and H. Balakrishnan. Best-path vs. multi-path overlay routing. In Proceedings of the 3rd ACM SIGCOMM Conference on Internet Measurement, IMC ’03, pages 91–100, New York, NY, USA, 2003. ACM.
- [3] K. Andreev, B. M. Maggs, A. Meyerson, and R. K. Sitaraman. Designing overlay multicast networks for streaming. In Proceedings of the Fifteenth Annual ACM Symposium on Parallel Algorithms and Architectures, pages 149–158. ACM, 2003.
- [4] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, New York, NY, USA, 2004.
- [5] R. Castro, M. Coates, G. Liang, R. Nowak, and B. Yu. Network tomography: recent developments. Statistical Science, pages 499–517, 2004.
- [6] M. Chiang, S. H. Low, A. R. Calderbank, and J. C. Doyle. Layering as optimization decomposition: A mathematical theory of network architectures. Proceedings of the IEEE, 95(1):255–312, Jan. 2007.
- [7] J. Dilley, B. Maggs, J. Parikh, H. Prokop, R. Sitaraman, and B. Weihl. Globally distributed content delivery. IEEE Internet Computing, 6(5):50–58, 2002.
- [8] Z. Duan, Z.-L. Zhang, and Y. T. Hou. Service overlay networks: SLAs, QoS, and bandwidth provisioning. IEEE/ACM Trans. Netw., 11(6):870–883, Dec. 2003.
