Benchmarking as Empirical Standard in Software Engineering Research
Wilhelm Hasselbring

TL;DR
This paper discusses the role and requirements of benchmarks in empirical software engineering, aiming to establish standards for benchmarking practices to improve research comparability and validity.
Contribution
It discusses requirements on benchmarks that may form the basis of a benchmarking checklist for empirical software engineering research, using software performance and scalability evaluation as example areas.
Findings
Benchmarks are essential for comparing methods in software engineering.
The recent ACM SIGSOFT Empirical Standards for Software Engineering Research do not include an explicit checklist for benchmarking.
Requirements on benchmarks are proposed as a possible basis for such a checklist.
Abstract
In empirical software engineering, benchmarks can be used for comparing different methods, techniques and tools. However, the recent ACM SIGSOFT Empirical Standards for Software Engineering Research do not include an explicit checklist for benchmarking. In this paper, we discuss benchmarks for software performance and scalability evaluation as example research areas in software engineering, relate benchmarks to some other empirical research methods, and discuss the requirements on benchmarks that may constitute the basis for a checklist of a benchmarking standard for empirical software engineering research.
