Portfolio-driven Resource Management for Transient Cloud Servers
Prateek Sharma, David Irwin, Prashant Shenoy

TL;DR
This paper introduces a portfolio-based approach for managing transient cloud servers, enabling applications to reduce costs and revocation risks by optimally combining different server types.
Contribution
The paper proposes server portfolios modeled on financial principles, and implements ExoSphere to make applications more resilient and cost-effective on transient cloud servers.
Findings
Achieves 80% cost savings over on-demand servers
Reduces revocation risk significantly
Enables popular parallel applications to be transiency-aware
Abstract
Cloud providers have begun to offer their surplus capacity in the form of low-cost transient servers, which can be revoked unilaterally at any time. While the low cost of transient servers makes them attractive for a wide range of applications, such as data processing and scientific computing, failures due to server revocation can severely degrade application performance. Since different transient server types offer different cost and availability tradeoffs, we present the notion of server portfolios, which is based on financial portfolio modeling. Server portfolios enable construction of an "optimal" mix of servers to meet an application's sensitivity to cost and revocation risk. We implement model-driven portfolios in a system called ExoSphere, and show how diverse applications can use portfolios and application-specific policies to gracefully handle transient servers. We show that…
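To make the portfolio idea concrete, here is a minimal sketch of a mean-variance style allocation across transient server types. It is not ExoSphere's actual formulation: the objective (expected cost plus a risk-aversion weight times revocation variance), the function names, the solver (projected gradient descent), and all numbers are illustrative assumptions.

```python
# Hypothetical sketch of a mean-variance server portfolio.
# Objective (an assumption, not taken from the paper):
#   minimize  cost . x  +  risk_aversion * x^T cov x
#   subject to  sum(x) == 1,  x >= 0
# where x[i] is the fraction of capacity placed on server type i.

def project_to_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) == 1}."""
    u = sorted(v, reverse=True)
    running_sum = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        running_sum += u_k
        t = (running_sum - 1.0) / k
        if u_k - t > 0:
            theta = t  # keep the threshold from the last index that qualifies
    return [max(v_i - theta, 0.0) for v_i in v]


def build_portfolio(cost, cov, risk_aversion, steps=5000, lr=0.01):
    """Projected gradient descent on the objective above."""
    n = len(cost)
    x = [1.0 / n] * n  # start from an equal split
    for _ in range(steps):
        grad = [
            cost[i] + 2.0 * risk_aversion * sum(cov[i][j] * x[j] for j in range(n))
            for i in range(n)
        ]
        x = project_to_simplex([x[i] - lr * grad[i] for i in range(n)])
    return x


# Illustrative inputs only: per-hour cost and revocation (co)variance
# for three made-up transient server types.
cost = [0.20, 0.30, 0.25]
cov = [
    [0.09, 0.00, 0.00],
    [0.00, 0.01, 0.00],
    [0.00, 0.00, 0.04],
]

cost_only = build_portfolio(cost, cov, risk_aversion=0.0)
risk_averse = build_portfolio(cost, cov, risk_aversion=50.0)
print("cost-only weights:  ", [round(w, 3) for w in cost_only])
print("risk-averse weights:", [round(w, 3) for w in risk_averse])
```

With the risk-aversion weight at zero, the sketch puts all capacity on the cheapest type. Raising it spreads capacity toward types with lower revocation variance, which is the cost-versus-risk tradeoff the abstract describes.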
