Designing Machine Learning Pipeline Toolkit for AutoML Surrogate Modeling Optimization
Paulito P. Palmes, Akihiro Kishimoto, Radu Marinescu, Parikshit Ram, Elizabeth Daly

TL;DR
This paper introduces AMLP, a toolkit that simplifies the creation and optimization of machine learning pipelines, enabling faster AutoML processes through surrogate modeling and data mining.
Contribution
The paper presents AMLP, a novel toolkit that streamlines pipeline expression and optimization, and demonstrates its efficiency with a two-stage surrogate modeling approach.
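To make "pipeline expression" concrete, here is a minimal Python sketch of how pipeline structures might be written as simple expressions and rendered as signatures. It is not AMLP's API: the `Node` class, the `leaf` helper, the choice of `|` for serial composition and `+` for feature concatenation, and the signature format are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Node:
    """A pipeline element or a composition of elements (hypothetical type)."""
    kind: str                      # "leaf", "serial", or "union"
    name: str = ""
    children: Tuple["Node", ...] = ()

    def __or__(self, other: "Node") -> "Node":
        # a | b : feed the output of a into b (serial composition)
        return Node("serial", children=(self, other))

    def __add__(self, other: "Node") -> "Node":
        # a + b : concatenate the features produced by a and b
        return Node("union", children=(self, other))

    def signature(self) -> str:
        """Render the structure as a compact string that can be compared or logged."""
        if self.kind == "leaf":
            return self.name
        sep = " -> " if self.kind == "serial" else " + "
        return "(" + sep.join(child.signature() for child in self.children) + ")"


def leaf(name: str) -> Node:
    """Wrap a named element (scaler, extractor, learner, ...) as a pipeline node."""
    return Node("leaf", name=name)


# Illustrative element names only; no real transformers are attached.
scaler, pca, ica, forest = leaf("scaler"), leaf("pca"), leaf("ica"), leaf("forest")

# Python's "+" binds tighter than "|", so this parses as ((scaler | pca) + ica) | forest.
pipe = (scaler | pca) + ica | forest
print(pipe.signature())
```

Because each expression reduces to a plain string signature, candidate structures can be stored next to their scores and analyzed later, which is the kind of bookkeeping the toolkit is said to simplify.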
Findings
AMLP, using less than 5 minutes of computation, outperforms other AutoML approaches that were given a 4-hour time budget.
The two-stage surrogate modeling approach significantly reduces optimization time.
AMLP accelerates pipeline evaluation and selection in AutoML tasks.
Abstract
The pipeline optimization problem in machine learning requires simultaneous optimization of pipeline structures and parameter adaptation of their elements. Having an elegant way to express these structures can help lessen the complexity in the management and analysis of their performances together with the different choices of optimization strategies. With these issues in mind, we created the AutoMLPipeline (AMLP) toolkit, which facilitates the creation and evaluation of complex machine learning pipeline structures using simple expressions. We use AMLP to find optimal pipeline signatures, datamine them, and use these datamined features to speed up learning and prediction. We formulated a two-stage pipeline optimization with surrogate modeling in AMLP which, in less than 5 minutes of AMLP computation time, outperforms other AutoML approaches given a 4-hour time budget.
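The abstract does not spell out the two-stage procedure, so the following Python sketch shows only the general idea of surrogate-assisted pipeline search, not the authors' method. Everything in it is an assumption for illustration: the search space, the synthetic `expensive_score` function standing in for real cross-validation, and the simple additive surrogate (mean score plus the average residual of each element).

```python
import itertools
import random
from collections import defaultdict

# Hypothetical search space: every element name is illustrative.
SCALERS = ["none", "standard", "robust"]
EXTRACTORS = ["none", "pca", "ica"]
LEARNERS = ["tree", "forest", "svm", "knn"]


def expensive_score(pipeline):
    """Stand-in for cross-validating a pipeline (synthetic and deterministic)."""
    gain = {"none": 0.0, "standard": 0.04, "robust": 0.03, "pca": 0.05,
            "ica": 0.02, "tree": 0.70, "forest": 0.80, "svm": 0.76, "knn": 0.72}
    scaler, extractor, learner = pipeline
    # A small interaction term so the additive surrogate is only approximate.
    bonus = 0.02 if (scaler, learner) == ("standard", "svm") else 0.0
    return gain[scaler] + gain[extractor] + gain[learner] + bonus


def fit_surrogate(observations):
    """Fit an additive surrogate: overall mean plus per-element average residual."""
    mean = sum(score for _, score in observations) / len(observations)
    residuals = defaultdict(list)
    for pipeline, score in observations:
        for slot, element in enumerate(pipeline):
            residuals[(slot, element)].append(score - mean)
    effect = {key: sum(vals) / len(vals) for key, vals in residuals.items()}

    def predict(pipeline):
        # Elements never seen during fitting contribute no adjustment.
        return mean + sum(effect.get((slot, element), 0.0)
                          for slot, element in enumerate(pipeline))

    return predict


def two_stage_search(n_sample=12, top_k=3, seed=0):
    """Return (best pipeline, its score, number of expensive evaluations)."""
    space = list(itertools.product(SCALERS, EXTRACTORS, LEARNERS))
    rng = random.Random(seed)

    # Stage 1: score a small random sample and fit the surrogate on it.
    observed = {p: expensive_score(p) for p in rng.sample(space, n_sample)}
    predict = fit_surrogate(list(observed.items()))

    # Stage 2: rank the unevaluated pipelines by predicted score and
    # spend real evaluations only on the most promising few.
    candidates = sorted((p for p in space if p not in observed),
                        key=predict, reverse=True)
    for p in candidates[:top_k]:
        observed[p] = expensive_score(p)

    best = max(observed, key=observed.get)
    return best, observed[best], len(observed)


best, score, n_evals = two_stage_search()
print(best, round(score, 3), n_evals)
```

With the default arguments the sketch pays for 15 expensive evaluations out of 36 possible pipelines. Replacing most real evaluations with cheap predictions is where the time saving of a surrogate approach comes from, although the size of the saving reported in the paper depends on its own models and benchmarks.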
Taxonomy
Topics: Machine Learning and Data Classification
