Q(D)O-ES: Population-based Quality (Diversity) Optimisation for Post Hoc Ensemble Selection in AutoML
Lennart Purucker, Lennart Schneider, Marie Anastacio, Joeran Beel, Bernd Bischl, Holger Hoos

TL;DR
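This paper introduces two population-based ensemble selection methods for AutoML: QO-ES, which optimises solely for predictive performance, and QDO-ES, which also maintains diversity among the ensembles in its population. Both often outrank greedy ensemble selection (GES), although the difference is statistically significant only on validation data.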
Contribution
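The paper proposes two novel population-based algorithms for post hoc ensemble selection in AutoML. QO-ES searches for ensembles with high predictive performance, while QDO-ES additionally keeps a diverse set of well-performing ensembles during optimisation, drawing on ideas from quality diversity optimisation.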
Findings
QO-ES and QDO-ES often outrank GES across 71 classification datasets from the AutoML benchmark.
Diversity in ensembles can improve performance but may increase overfitting risk.
The improvements over GES are statistically significant only on validation data.
Abstract
Automated machine learning (AutoML) systems commonly ensemble models post hoc to improve predictive performance, typically via greedy ensemble selection (GES). However, we believe that GES may not always be optimal, as it performs a simple deterministic greedy search. In this work, we introduce two novel population-based ensemble selection methods, QO-ES and QDO-ES, and compare them to GES. While QO-ES optimises solely for predictive performance, QDO-ES also considers the diversity of ensembles within the population, maintaining a diverse set of well-performing ensembles during optimisation based on ideas of quality diversity optimisation. The methods are evaluated using 71 classification datasets from the AutoML benchmark, demonstrating that QO-ES and QDO-ES often outrank GES, albeit only statistically significant on validation data. Our results further suggest that diversity can be…
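To make the baseline concrete, the following is a minimal sketch of one common formulation of greedy ensemble selection (selection with replacement, scored by validation error). It is an illustration under stated assumptions, not the paper's implementation: the function name `greedy_ensemble_selection`, the parameter `n_iterations`, the array layout of `val_probs`, and the choice of misclassification rate as the loss are all hypothetical.

```python
import numpy as np


def greedy_ensemble_selection(val_probs, y_val, n_iterations=10):
    """Sketch of greedy ensemble selection with replacement.

    val_probs: array of shape (n_models, n_samples, n_classes) holding each
               base model's predicted class probabilities on validation data.
    y_val:     integer class labels of shape (n_samples,).
    Returns a weight vector of shape (n_models,) that sums to 1.
    """
    val_probs = np.asarray(val_probs, dtype=float)
    y_val = np.asarray(y_val)
    n_models = val_probs.shape[0]
    counts = np.zeros(n_models, dtype=int)       # how often each model was picked
    running_sum = np.zeros_like(val_probs[0])    # summed probabilities of picks so far

    for step in range(1, n_iterations + 1):
        # Validation error of the ensemble if each candidate were added next.
        losses = [
            np.mean(np.argmax((running_sum + val_probs[m]) / step, axis=1) != y_val)
            for m in range(n_models)
        ]
        best = int(np.argmin(losses))  # deterministic: ties go to the lowest index
        counts[best] += 1
        running_sum += val_probs[best]

    return counts / counts.sum()
```

Each iteration commits to the single best addition and never revisits earlier choices, which is the deterministic greedy behaviour the abstract contrasts with searching over a population of candidate ensembles.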
Taxonomy
Topics: Machine Learning and Data Classification · Data Stream Mining Techniques · Imbalanced Data Classification Techniques
Methods: High-Order Consensuses
