An Elastic Ephemeral Datastore using Cheap, Transient Cloud Resources
Malte Brodmann, Nikolas Ioannou, Bernard Metzler, Jonas Pfefferle, Ana Klimovic

TL;DR
This paper introduces an elastic ephemeral datastore that leverages cheap, transient cloud spot instances to reduce costs in distributed data analytics, handling preemptions transparently and maintaining performance.
Contribution
It presents a novel elastic distributed ephemeral datastore that manages node preemptions and reduces costs for ephemeral data in cloud analytics workloads.
Findings
Achieves 60% cost reduction using spot instances
End-to-end execution time increases by only 2.1%
Handles node preemptions transparently to applications
Abstract
Spot instances are virtual machines offered at 60-90% lower cost that can be reclaimed at any time, with only a short warning period. Spot instances have already been used to significantly reduce the cost of processing workloads in the cloud. However, leveraging spot instances to reduce the cost of stateful cloud applications is much more challenging, as the sudden preemptions lead to data loss. In this work, we propose leveraging spot instances to decrease the cost of ephemeral data management in distributed data analytics applications. We specifically target ephemeral data as this large class of data in modern analytics workloads has low durability requirements; if lost, the data can be regenerated by re-executing compute tasks. We design an elastic, distributed ephemeral datastore that handles node preemptions transparently to user applications and minimizes data loss by…
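The pricing claim in the abstract can be made concrete with simple arithmetic. The following is a minimal sketch, not anything taken from the paper: the function name `spot_price_range` and the on-demand price of 1.00 per hour are hypothetical, chosen only to illustrate what a 60-90% discount means for the hourly price of a single instance.

```python
def spot_price_range(on_demand_price, min_discount=0.60, max_discount=0.90):
    """Return the (lowest, highest) hourly spot price for a discount range.

    The 60-90% discount range comes from the abstract; the on-demand price
    passed in by the caller is an illustrative assumption.
    """
    lowest = on_demand_price * (1.0 - max_discount)
    highest = on_demand_price * (1.0 - min_discount)
    return lowest, highest


# Hypothetical on-demand price of 1.00 per instance-hour (assumption).
low, high = spot_price_range(1.00)
print(f"spot price per hour: {low:.2f} to {high:.2f}")
```

This covers only the per-instance price. It does not model preemptions, regenerated data, or the end-to-end cost and runtime figures reported in the findings.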
Taxonomy
Topics: Cloud Computing and Resource Management · Data Stream Mining Techniques · Advanced Data Storage Technologies
