Plug and Play Bench: Simplifying Big Data Benchmarking Using Containers
Sheriffo Ceesay, Adam Barker, Blesson Varghese

TL;DR
Plug And Play Bench simplifies big data benchmarking by automating deployment and configuration of tools using containers, integrating cost metrics, and supporting various cluster frameworks and cloud platforms.
Contribution
The paper introduces a containerized, infrastructure-aware benchmarking framework that streamlines deployment and adds cost transparency for big data applications.
Findings
Automates installation, configuration, and execution of big data benchmarks.
Supports integration with cloud platforms like Azure.
Provides cost metrics alongside performance benchmarking.
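As a rough illustration of reporting cost next to performance, the sketch below attaches a monetary estimate to a measured runtime. It is not the paper's cost model, which is not described in this summary. It assumes simple per-node hourly billing, and the names `BenchmarkRun`, `run_cost`, and `report`, as well as the price used in the example, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkRun:
    """One benchmark execution (hypothetical record, not the tool's schema)."""
    name: str
    runtime_seconds: float
    node_count: int
    price_per_node_hour: float  # assumed flat hourly price per node


def run_cost(run: BenchmarkRun) -> float:
    """Estimate cost as runtime in hours x number of nodes x hourly price."""
    hours = run.runtime_seconds / 3600.0
    return hours * run.node_count * run.price_per_node_hour


def report(run: BenchmarkRun) -> dict:
    """Combine the performance metric and the cost estimate in one record."""
    return {
        "benchmark": run.name,
        "runtime_seconds": run.runtime_seconds,
        "nodes": run.node_count,
        "estimated_cost": round(run_cost(run), 4),
    }


if __name__ == "__main__":
    # Illustrative numbers only: a 30-minute run on 4 nodes at 0.20 per node-hour.
    example = BenchmarkRun("wordcount", 1800, 4, 0.20)
    print(report(example))
```

Real cloud billing is usually more involved (per-second or per-minute granularity, storage and network charges), so a flat hourly rate like this should be read as a lower-fidelity approximation.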
Abstract
The recent boom of big data, coupled with the challenges of its processing and storage, gave rise to the development of distributed data processing and storage paradigms like MapReduce, Spark, and NoSQL databases. With the advent of cloud computing, processing and storing such massive datasets on clusters of machines is now feasible with ease. However, there are few tools and approaches that users can rely on to gauge and comprehend the performance of their big data applications deployed locally on clusters or in the cloud. Researchers have started exploring this area by providing benchmarking suites suitable for big data applications. However, many of these tools are fragmented, complex to deploy and manage, and do not provide transparency with respect to the monetary cost of benchmarking an application. In this paper, we present Plug And Play Bench, an infrastructure aware…
