Sampling in Cloud Benchmarking: A Critical Review and Methodological Guidelines

Saman Akbari; Manfred Hauswirth

arXiv:2502.15399 · cs.DC · February 24, 2025

TL;DR

This paper critically reviews sampling strategies in cloud benchmarking, highlighting prevalent issues and proposing guidelines to improve the transparency, comparability, and reliability of benchmark results.

Contribution

The paper identifies systematic problems in current sampling practices and proposes methodological guidelines to improve research quality in cloud benchmarking.

Findings

01 High prevalence of non-probability sampling

02 Over-reliance on a single benchmark

03 Restricted access to samples

Abstract

Cloud benchmarks suffer from performance fluctuations caused by resource contention, network latency, hardware heterogeneity, and other factors along with decisions taken in the benchmark design. In particular, the sampling strategy of benchmark designers can significantly influence benchmark results. Despite this well-known fact, no systematic approach has been devised so far to make sampling results comparable and guide benchmark designers in choosing their sampling strategy for use within benchmarks. To identify systematic problems, we critically review sampling in recent cloud computing research. Our analysis identifies concerning trends: (i) a high prevalence of non-probability sampling, (ii) over-reliance on a single benchmark, and (iii) restricted access to samples. To address these issues and increase transparency in sampling, we propose methodological guidelines for researchers…
