Blink: Lightweight Sample Runs for Cost Optimization of Big Data Applications

Hani Al-Sayeh; Muhammad Attahir Jibril; Bunjamin Memishi; Kai-Uwe Sattler

arXiv:2207.02290·cs.DC·July 7, 2022

TL;DR

Blink is an autonomous, sampling-based framework that predicts the sizes of cached datasets and selects the cluster size for in-memory big data applications, reducing execution cost by up to 47.4% without relying on historical runs.

Contribution

It introduces an autonomous, sampling-based method that predicts the sizes of cached datasets and selects the optimal cluster size without relying on historical runs.

Findings

01

Selects the optimal cluster size in 15 out of 16 cases

02

Reduces execution costs by up to 47.4%

03

Sample runs cost, on average, 4.6% of the cost of optimal runs

Abstract

Distributed in-memory data processing engines accelerate iterative applications by caching substantial datasets in memory rather than recomputing them in each iteration. Selecting a suitable cluster size for caching these datasets plays an essential role in achieving optimal performance. In practice, this is a tedious and hard task for end users, who are typically not aware of cluster specifications, workload semantics, and the sizes of intermediate data. We present Blink, an autonomous sampling-based framework, which predicts the sizes of cached datasets and selects the optimal cluster size without relying on historical runs. We evaluate Blink on a variety of iterative, real-world machine learning applications. With sample runs costing on average 4.6% of the cost of optimal runs, Blink selects the optimal cluster size in 15 out of 16 cases, saving up to 47.4% of execution cost compared…
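
The general idea of sizing a cluster from sample runs can be illustrated with a minimal sketch. It is not Blink's actual algorithm, which the abstract does not spell out. It assumes that cached dataset size grows linearly with input size and that the cheapest suitable cluster is the smallest one whose aggregate cache memory holds the predicted cached data. All function names, parameters, and numbers below are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def predict_cached_size(sample_inputs_gb, sample_cached_gb, full_input_gb):
    """Extrapolate the cached dataset size from a few small sample runs.

    Assumption: cached size is a linear function of input size.
    """
    a, b = fit_line(sample_inputs_gb, sample_cached_gb)
    return a * full_input_gb + b


def pick_cluster_size(cached_gb, cache_mem_per_machine_gb, max_machines):
    """Return the smallest machine count whose cache memory fits the data.

    Assumption: a cluster is suitable once all cached data fits in memory.
    Returns None when even the largest allowed cluster is too small.
    """
    needed = max(1, math.ceil(cached_gb / cache_mem_per_machine_gb))
    return needed if needed <= max_machines else None


# Hypothetical measurements from three sample runs on tiny input fractions.
samples_in = [0.1, 0.2, 0.3]        # input size of each sample run, in GB
samples_cached = [0.25, 0.45, 0.65]  # observed cached dataset size, in GB

predicted = predict_cached_size(samples_in, samples_cached, full_input_gb=100.0)
machines = pick_cluster_size(predicted, cache_mem_per_machine_gb=20.0, max_machines=16)
print(f"predicted cached size: {predicted:.2f} GB, machines: {machines}")
```

The sample runs here touch well under 1% of the full input, which is why such runs can be cheap relative to a full run; how well a simple extrapolation holds for a given application is an open assumption of this sketch.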

Taxonomy

Topics: Cloud Computing and Resource Management · Caching and Content Delivery · IoT and Edge/Fog Computing