Sampling in Cloud Benchmarking: A Critical Review and Methodological Guidelines
Saman Akbari, Manfred Hauswirth

TL;DR
This paper critically reviews sampling strategies in cloud benchmarking, highlighting prevalent issues and proposing guidelines to improve the transparency, comparability, and reliability of benchmark results.
Contribution
It identifies systematic problems in current sampling practices and offers methodological guidelines to enhance research quality in cloud benchmarking.
Findings
High prevalence of non-probability sampling
Over-reliance on a single benchmark
Restricted access to samples
Abstract
Cloud benchmarks suffer from performance fluctuations caused by resource contention, network latency, hardware heterogeneity, and other factors, as well as by decisions taken in the benchmark design. In particular, the sampling strategy chosen by benchmark designers can significantly influence benchmark results. Although this is well known, no systematic approach has yet been devised to make sampling results comparable and to guide benchmark designers in choosing a sampling strategy for their benchmarks. To identify systematic problems, we critically review sampling in recent cloud computing research. Our analysis identifies concerning trends: (i) a high prevalence of non-probability sampling, (ii) over-reliance on a single benchmark, and (iii) restricted access to samples. To address these issues and increase transparency in sampling, we propose methodological guidelines for researchers…
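To make the distinction behind trend (i) concrete, the following is a minimal sketch contrasting a probability sample (simple random sampling with a recorded seed) with a non-probability convenience sample. It is an illustration only and not taken from the paper: the region names, instance types, and function names are all hypothetical.

```python
import random

def simple_random_sample(population, k, seed):
    """Probability sampling: each unit has the same known chance, k / len(population)."""
    rng = random.Random(seed)  # recorded seed, so the sample can be reproduced and shared
    return rng.sample(population, k)

def convenience_sample(population, k):
    """Non-probability sampling: take whichever k units happen to be listed first."""
    return population[:k]

# Hypothetical sampling frame of (region, instance_type) pairs, made up for illustration.
regions = ["region-a", "region-b", "region-c"]
instance_types = ["small", "medium", "large", "xlarge"]
frame = [(r, t) for r in regions for t in instance_types]

probability_sample = simple_random_sample(frame, k=4, seed=42)
biased_sample = convenience_sample(frame, k=4)

print("inclusion probability per unit:", 4 / len(frame))
print("random sample:", probability_sample)
print("convenience sample:", biased_sample)
```

In this sketch the convenience sample contains only configurations from the first region listed, so any conclusion drawn from it says nothing about the other regions. The random sample gives every configuration the same known inclusion probability, and publishing the frame together with the seed lets others reconstruct the exact sample.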
