Analyzing the Impact of Undersampling on the Benchmarking and Configuration of Evolutionary Algorithms
Diederick Vermetten, Hao Wang, Manuel López-Ibáñez, Carola Doerr, and Thomas Bäck

TL;DR
This paper investigates how undersampling affects the reliability of benchmarking and configuration of evolutionary algorithms, highlighting the need for sufficient sample sizes and proper statistical methods to ensure accurate comparisons.
Contribution
It demonstrates that common benchmarking practices and automated configuration methods are often underpowered due to insufficient sample sizes, and emphasizes the importance of statistical considerations.
Findings
15 runs (the default suggested by the COCO environment) may be insufficient for reliable algorithm ranking
Performance losses of over 20% can occur with inadequate sampling
Statistical tests improve comparison reliability but are not foolproof
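The first finding can be illustrated with a small Monte Carlo sketch. It is not taken from the paper: the function name `misranking_rate`, the Gaussian noise model, and the means and standard deviation are all hypothetical values chosen for illustration. The sketch estimates how often the truly better of two algorithms looks worse when both are compared on the mean of a limited number of runs.

```python
import random
import statistics


def misranking_rate(n_runs, mean_a=1.0, mean_b=1.1, sd=0.5, trials=2000, seed=0):
    """Estimate how often algorithm A (truly better, lower is better)
    appears no better than B when comparing sample means over n_runs runs.

    The Gaussian performance model and all default parameters are
    illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    wrong = 0
    for _ in range(trials):
        # Simulate n_runs independent noisy measurements per algorithm.
        a = statistics.fmean(rng.gauss(mean_a, sd) for _ in range(n_runs))
        b = statistics.fmean(rng.gauss(mean_b, sd) for _ in range(n_runs))
        if a >= b:  # the truly better algorithm is ranked second
            wrong += 1
    return wrong / trials


if __name__ == "__main__":
    for n in (5, 15, 50, 200):
        print(f"runs={n:4d}  estimated misranking rate={misranking_rate(n):.3f}")
```

Under these assumed parameters the estimated misranking rate shrinks as the number of runs grows, which is the qualitative effect the findings describe. Real benchmark data is rarely Gaussian, so the absolute numbers should not be read as predictions.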
Abstract
The stochastic nature of iterative optimization heuristics leads to inherently noisy performance measurements. Since these measurements are often gathered once and then used repeatedly, the number of collected samples will have a significant impact on the reliability of algorithm comparisons. We show that care should be taken when making decisions based on limited data. Particularly, we show that the number of runs used in many benchmarking studies, e.g., the default value of 15 suggested by the COCO environment, can be insufficient to reliably rank algorithms on well-known numerical optimization benchmarks. Additionally, methods for automated algorithm configuration are sensitive to insufficient sample sizes. This may result in the configurator choosing a 'lucky' but poor-performing configuration despite exploring better ones. We show that relying on mean performance values, as many…
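The 'lucky' configuration effect mentioned in the abstract can be sketched with a toy selection procedure. This is not the paper's experiment and not any real configurator: `selection_loss`, the list of true means, and the Gaussian noise are hypothetical. Each candidate configuration is evaluated with a fixed number of runs, the one with the best sample mean is selected, and the relative loss in true performance against the best candidate is averaged over many repetitions.

```python
import random
import statistics


def selection_loss(n_runs, true_means, sd=0.5, trials=1000, seed=1):
    """Average relative loss in true mean performance (lower is better)
    when selecting the configuration with the best sample mean.

    true_means and sd are illustrative assumptions, not data from the paper.
    """
    rng = random.Random(seed)
    best = min(true_means)
    total = 0.0
    for _ in range(trials):
        # Noisy estimate of each configuration from n_runs simulated runs.
        estimates = [
            statistics.fmean(rng.gauss(m, sd) for _ in range(n_runs))
            for m in true_means
        ]
        # Pick the configuration that merely *looks* best on this sample.
        chosen = min(range(len(true_means)), key=estimates.__getitem__)
        total += (true_means[chosen] - best) / best
    return total / trials


if __name__ == "__main__":
    # Ten hypothetical configurations with true means 1.00, 1.05, ..., 1.45.
    configs = [1.0 + 0.05 * i for i in range(10)]
    for n in (3, 15, 100):
        print(f"runs={n:4d}  mean relative loss={selection_loss(n, configs):.3f}")
```

With few runs per configuration, the selected configuration is often one whose sample happened to be favourable, so the average loss is larger than with many runs. Racing-based configurators adapt the number of runs dynamically, which this fixed-budget sketch does not model.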
Taxonomy
Topics: Advanced Multi-Objective Optimization Algorithms · Metaheuristic Optimization Algorithms Research · Evolutionary Algorithms and Applications
