MementoML: Performance of selected machine learning algorithm configurations on OpenML100 datasets
Wojciech Kretowicz, Przemysław Biecek

TL;DR
This paper presents a systematic benchmark dataset of machine learning algorithms' performance across various hyperparameters on OpenML100 datasets, enabling analysis of hyperparameter sensitivity.
Contribution
It introduces a comprehensive, pre-defined hyperparameter grid benchmark dataset for 7 algorithms on 39 datasets, facilitating performance analysis independent of hyperparameter tuning.
Findings
Performance sensitivity to hyperparameters analyzed
Benchmark dataset publicly available for research
Systematic approach differs from typical hyperparameter tuning
Abstract
Finding optimal hyperparameters for a machine learning algorithm can often significantly improve its performance. But how can they be chosen in a time-efficient way? In this paper we present a protocol for generating benchmark data that describes the performance of different ML algorithms under different hyperparameter configurations. Data collected in this way is used to study the factors influencing an algorithm's performance. This collection was prepared for the EPP study. We tested the algorithms' performance on a dense grid of hyperparameters. The tested datasets and hyperparameters were chosen before any algorithm was run and were not changed afterwards. This differs from the approach usually used in hyperparameter tuning, where the selection of candidate hyperparameters depends on previously obtained results. However, such selection allows for…
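The protocol described in the abstract, where the grid and the datasets are fixed before any run and never revised, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the grid values, the dataset names, and the `toy_evaluate` scoring function are all hypothetical placeholders standing in for real hyperparameters, OpenML100 datasets, and model training.

```python
from itertools import product

# Hypothetical grid, fixed before any model is trained. The names and values
# are illustrative only and are not the ones used in the paper.
GRID = {
    "max_depth": [2, 4, 8],
    "learning_rate": [0.01, 0.1],
}
DATASETS = ["dataset_a", "dataset_b"]  # placeholder dataset names


def enumerate_configs(grid):
    """Return every hyperparameter combination in a deterministic order."""
    names = sorted(grid)
    return [dict(zip(names, values)) for values in product(*(grid[n] for n in names))]


def run_benchmark(datasets, configs, evaluate):
    """Evaluate every (dataset, config) pair.

    No score is consulted when deciding what to run next, which is the
    difference from adaptive hyperparameter tuning.
    """
    return [
        {"dataset": d, **cfg, "score": evaluate(d, cfg)}
        for d in datasets
        for cfg in configs
    ]


def toy_evaluate(dataset, cfg):
    """Stand-in for training and scoring a model; returns a made-up number."""
    return round(1.0 / (1.0 + abs(cfg["max_depth"] - 4) + cfg["learning_rate"]), 4)


CONFIGS = enumerate_configs(GRID)
RESULTS = run_benchmark(DATASETS, CONFIGS, toy_evaluate)
```

Because every dataset is scored on the same configurations, the resulting table can be compared across datasets and algorithms directly, at the cost of spending compute on configurations an adaptive tuner would have skipped.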
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Data Analysis with R
