The Pitfalls of Benchmarking in Algorithm Selection: What We Are Getting Wrong

Gašper Petelin; Gjorgjina Cenikj

arXiv:2505.07750 · cs.LG · May 13, 2025

TL;DR

This paper critically examines common evaluation practices in algorithm selection for black-box optimization, highlighting methodological flaws that can lead to misleading performance assessments of meta-models.

Contribution

The paper identifies specific flaws in two common evaluation practices, the leave-instance-out scheme and the use of metrics that are sensitive to the scale of the objective function, and argues that more rigorous assessment frameworks are needed.

Findings

01 Leave-instance-out evaluation can be misleading

02 Scale-sensitive metrics can overestimate performance

03 Non-informative features can inflate accuracy
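
As a rough illustration of the second finding, the sketch below shows one way a scale-sensitive metric can mislead. It is not the paper's experiment: the problem names (`f1`, `f2`, `f3`), the algorithm names (`A`, `B`), and all objective values are invented for the example. The point is only that averaging raw objective values lets the problem with the largest scale decide the comparison.

```python
# Hypothetical best-found objective values (lower is better) for two
# algorithms on three problems; the third problem lives on a far larger scale.
results = {
    "f1": {"A": 0.10, "B": 0.90},
    "f2": {"A": 0.20, "B": 0.80},
    "f3": {"A": 9.0e5, "B": 1.0e5},
}

def mean_raw(choice):
    """Scale-sensitive score: mean raw objective value of the chosen algorithms."""
    return sum(results[p][a] for p, a in choice.items()) / len(choice)

def hit_rate(choice):
    """Scale-free score: fraction of problems where the best algorithm was chosen."""
    hits = sum(a == min(results[p], key=results[p].get) for p, a in choice.items())
    return hits / len(choice)

always_a = {p: "A" for p in results}  # picks the best algorithm on f1 and f2
always_b = {p: "B" for p in results}  # picks the best algorithm only on f3

for name, choice in [("always A", always_a), ("always B", always_b)]:
    print(f"{name}: mean raw = {mean_raw(choice):.2f}, hit rate = {hit_rate(choice):.2f}")
```

Under these assumed numbers, always choosing `B` looks far better on the mean raw value even though it picks the best algorithm on only one of the three problems, because `f3` dominates the average.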

Abstract

Algorithm selection, aiming to identify the best algorithm for a given problem, plays a pivotal role in continuous black-box optimization. A common approach involves representing optimization functions using a set of features, which are then used to train a machine learning meta-model for selecting suitable algorithms. Various approaches have demonstrated the effectiveness of these algorithm selection meta-models. However, not all evaluation approaches are equally valid for assessing the performance of meta-models. We highlight methodological issues that frequently occur in the community and should be addressed when evaluating algorithm selection approaches. First, we identify flaws with the "leave-instance-out" evaluation technique. We show that non-informative features and meta-models can achieve high accuracy, which should not be the case with a well-designed evaluation framework.…
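
The leave-instance-out concern can be sketched with a toy example. This is an assumption-laden illustration, not the authors' setup: the data are synthetic, the single feature is a problem "fingerprint" that carries no information about which algorithm is best, and the meta-model is a 1-nearest-neighbor rule. The helper names (`make_data`, `predict_1nn`, `accuracy`) are hypothetical.

```python
import random

def make_data(n_problems=24, n_instances=5, n_algorithms=3, seed=0):
    """Build toy rows of (problem_id, instance_id, feature, best_algorithm)."""
    rng = random.Random(seed)
    rows = []
    for p in range(n_problems):
        fingerprint = p / n_problems        # identifies the problem, nothing more
        best = rng.randrange(n_algorithms)  # drawn independently of the fingerprint
        for i in range(n_instances):
            # Instances of one problem are near-duplicates in feature space.
            feature = fingerprint + rng.gauss(0.0, 1e-4)
            rows.append((p, i, feature, best))
    return rows

def predict_1nn(train, feature):
    """Return the label of the training row whose feature is closest."""
    return min(train, key=lambda r: abs(r[2] - feature))[3]

def accuracy(rows, group_index):
    """Hold out one group at a time (grouped by rows[group_index]) and score 1-NN."""
    correct = 0
    for g in sorted({r[group_index] for r in rows}):
        train = [r for r in rows if r[group_index] != g]
        test = [r for r in rows if r[group_index] == g]
        correct += sum(predict_1nn(train, r[2]) == r[3] for r in test)
    return correct / len(rows)

rows = make_data()
leave_instance_out = accuracy(rows, group_index=1)  # test problems also appear in training
leave_problem_out = accuracy(rows, group_index=0)   # test problems are unseen in training
print(f"leave-instance-out: {leave_instance_out:.2f}")
print(f"leave-problem-out:  {leave_problem_out:.2f}")
```

In this toy setup the leave-instance-out score is perfect by construction, because every held-out instance has a near-identical sibling from the same problem in the training set. When whole problems are held out, the same feature has nothing to offer and accuracy should fall toward chance level.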

Taxonomy

Topics: Machine Learning and Data Classification · Advanced Multi-Objective Optimization Algorithms · Metaheuristic Optimization Algorithms Research

Methods: Sparse Evolutionary Training