CMA-ES for Post Hoc Ensembling in AutoML: A Great Success and Salvageable Failure

Lennart Purucker; Joeran Beel

arXiv:2307.00286·cs.LG·July 4, 2023

TL;DR

This paper compares CMA-ES and greedy ensemble selection in AutoML, revealing that CMA-ES's overfitting depends on the metric and proposing normalization techniques to improve its performance.

Contribution

It demonstrates the metric-dependent overfitting of CMA-ES in AutoML ensembling and introduces a normalization method to mitigate overfitting for ROC AUC.

Findings

01. CMA-ES overfits for ROC AUC and underperforms GES

02. CMA-ES outperforms GES for balanced accuracy

03. Normalization of CMA-ES weights improves ROC AUC performance
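To make the third finding concrete, here is a minimal sketch of what normalizing an ensemble weight vector can look like. The summary does not say which normalization the authors use, so the softmax below is an assumption chosen for illustration, and the names `softmax_normalize` and `weighted_ensemble` are hypothetical. The idea is only that an unconstrained weight vector, such as one proposed by a numerical optimizer, is mapped to non-negative weights that sum to one before the base models' predictions are combined.

```python
import math


def softmax_normalize(raw_weights):
    """Map unconstrained weights to non-negative weights that sum to 1.

    Softmax is an illustrative assumption, not necessarily the paper's method.
    """
    shift = max(raw_weights)  # subtract the max for numerical stability
    exps = [math.exp(w - shift) for w in raw_weights]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_ensemble(preds, weights):
    """Combine per-model prediction lists into one weighted prediction list."""
    n_samples = len(preds[0])
    return [
        sum(w * model_preds[i] for w, model_preds in zip(weights, preds))
        for i in range(n_samples)
    ]


# Hypothetical raw weights, e.g. one candidate solution from an optimizer.
raw = [2.0, -1.0, 0.5]
weights = softmax_normalize(raw)

# Hypothetical positive-class probabilities from three base models.
preds = [
    [0.9, 0.1, 0.8],
    [0.4, 0.6, 0.5],
    [0.7, 0.3, 0.6],
]
ensemble = weighted_ensemble(preds, weights)
```

Because the normalized weights form a convex combination, the ensemble's predictions stay inside the range spanned by the base models' predictions.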

Abstract

Many state-of-the-art automated machine learning (AutoML) systems use greedy ensemble selection (GES) by Caruana et al. (2004) to ensemble models found during model selection post hoc, thereby boosting predictive performance and likely following Auto-Sklearn 1's insight that alternatives, like stacking or gradient-free numerical optimization, overfit. Overfitting in Auto-Sklearn 1 is much more likely than in other AutoML systems because it uses only low-quality validation data for post hoc ensembling. Therefore, we were motivated to analyze whether Auto-Sklearn 1's insight holds true for systems with higher-quality validation data. Consequently, we compared the performance of covariance matrix adaptation evolution strategy (CMA-ES), a state-of-the-art gradient-free numerical optimization method, to GES on the 71 classification datasets from the AutoML benchmark for AutoGluon. We found that…
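For readers unfamiliar with GES, the following is a minimal sketch of greedy ensemble selection with replacement in the spirit of Caruana et al. (2004). It is not the authors' or any AutoML system's implementation: the function name, the `n_iterations` parameter, the mean squared error loss, and the toy data are all assumptions made for illustration. At each step the procedure adds whichever base model most reduces the validation loss of the averaged predictions, and the final weights are the selection frequencies.

```python
from collections import Counter


def mse(y_true, y_pred):
    """Mean squared error, used here only as an example validation loss."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def greedy_ensemble_selection(preds, y_true, loss, n_iterations=10):
    """Return one weight per model from greedy selection with replacement.

    preds: list of per-model prediction lists on the validation data.
    """
    running_sum = [0.0] * len(y_true)  # sum of predictions of selected models
    selected = []
    for step in range(1, n_iterations + 1):
        best_idx, best_loss = None, float("inf")
        for idx, model_preds in enumerate(preds):
            # Average prediction if this model were added to the ensemble.
            candidate = [(s + v) / step for s, v in zip(running_sum, model_preds)]
            candidate_loss = loss(y_true, candidate)
            if candidate_loss < best_loss:
                best_idx, best_loss = idx, candidate_loss
        selected.append(best_idx)
        running_sum = [s + v for s, v in zip(running_sum, preds[best_idx])]
    counts = Counter(selected)
    return [counts[i] / n_iterations for i in range(len(preds))]


# Toy validation data: three hypothetical base models, four samples.
y_val = [1, 0, 1, 0]
val_preds = [
    [0.9, 0.1, 0.8, 0.2],  # accurate model
    [0.5, 0.5, 0.5, 0.5],  # uninformative model
    [0.1, 0.9, 0.2, 0.8],  # inverted model
]
ges_weights = greedy_ensemble_selection(val_preds, y_val, mse, n_iterations=5)
```

The resulting weights are non-negative, sum to one, and are sparse by construction, which is one way to see why GES is often regarded as less prone to overfitting than unconstrained numerical optimization of the weights.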

Taxonomy

Topics: Machine Learning and Data Classification · Imbalanced Data Classification Techniques · Anomaly Detection Techniques and Applications

Methods: High-Order Consensuses