Incremental Search Space Construction for Machine Learning Pipeline Synthesis
Marc-André Zöller, Tien-Dung Nguyen, Marco F. Huber

TL;DR
This paper introduces a data-centric, incremental search space construction method for automated machine learning pipeline synthesis, improving efficiency and adaptability in complex pipeline design.
Contribution
It presents a novel approach that incrementally expands the search space using meta-features, enabling more flexible and data-specific ML pipelines.
Findings
Effective pruning of pipeline search space.
Competitive performance on AutoML benchmarks.
Enhanced pipeline construction flexibility.
Abstract
Automated machine learning (AutoML) aims to construct machine learning (ML) pipelines automatically. Many studies have investigated efficient methods for algorithm selection and hyperparameter optimization. However, methods for ML pipeline synthesis and optimization that consider the impact of complex pipeline structures containing multiple preprocessing and classification algorithms have not been studied thoroughly. In this paper, we propose a data-centric approach based on meta-features for pipeline construction and hyperparameter optimization inspired by human behavior. By expanding the pipeline search space incrementally in combination with meta-features of intermediate data sets, we are able to prune the pipeline structure search space efficiently. Consequently, flexible and data-set-specific ML pipelines can be constructed. We prove the effectiveness and competitiveness of our…
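The idea of expanding the search space step by step and pruning it with meta-features of intermediate data sets can be illustrated with a toy sketch. This is not the authors' implementation: the step names, the two meta-features, the applicability rules, and the function `synthesize` are all assumptions made up for illustration, and the sketch omits hyperparameter optimization and performance evaluation entirely.

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = List[Optional[float]]   # None marks a missing value
Data = List[Row]


def meta_features(data: Data) -> Dict[str, float]:
    """Two hypothetical meta-features of an (intermediate) data set."""
    cells = [v for row in data for v in row]
    present = [v for v in cells if v is not None]
    return {
        "missing_fraction": 1.0 - len(present) / len(cells),
        "value_range": max(present) - min(present),
    }


def impute(data: Data) -> Data:
    """Replace missing values with the mean of their column."""
    means = []
    for j in range(len(data[0])):
        col = [row[j] for row in data if row[j] is not None]
        means.append(sum(col) / len(col))
    return [[means[j] if v is None else v for j, v in enumerate(row)]
            for row in data]


def scale(data: Data) -> Data:
    """Global min-max scaling to [0, 1]; missing values stay missing."""
    present = [v for row in data for v in row if v is not None]
    lo, hi = min(present), max(present)
    span = (hi - lo) or 1.0
    return [[None if v is None else (v - lo) / span for v in row]
            for row in data]


# Hypothetical applicability rules: a step is only considered when the
# meta-features of the current intermediate data set suggest it is useful.
Rule = Callable[[Dict[str, float]], bool]
STEPS: Dict[str, Tuple[Rule, Callable[[Data], Data]]] = {
    "impute": (lambda mf: mf["missing_fraction"] > 0, impute),
    "scale": (lambda mf: mf["value_range"] > 1.0, scale),
}


def can_classify(mf: Dict[str, float]) -> bool:
    """Assumed rule: the toy classifier cannot handle missing values."""
    return mf["missing_fraction"] == 0


def synthesize(data: Data, max_depth: int = 3) -> List[Tuple[str, ...]]:
    """Grow pipelines one step at a time, pruning inapplicable steps.

    max_depth bounds the total pipeline length, classifier included.
    """
    complete: List[Tuple[str, ...]] = []
    frontier: List[Tuple[Tuple[str, ...], Data]] = [((), data)]
    for _ in range(max_depth):
        next_frontier = []
        for steps, current in frontier:
            mf = meta_features(current)  # computed on the intermediate data
            if can_classify(mf):
                complete.append(steps + ("classify",))
            for name, (applicable, transform) in STEPS.items():
                if applicable(mf):       # otherwise this branch is pruned
                    next_frontier.append((steps + (name,), transform(current)))
        frontier = next_frontier
    return complete


if __name__ == "__main__":
    toy = [[1.0, None], [3.0, 10.0], [5.0, 20.0]]
    for pipeline in synthesize(toy):
        print(" -> ".join(pipeline))
```

On the toy data, a pipeline such as scale followed directly by classify is never generated, because the scaled intermediate data still contains a missing value. Repeating a step is pruned as well, since its rule no longer holds once it has been applied. A real system would replace these hand-written rules with learned or measured criteria and would score each candidate pipeline.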
