A Scalable AutoML Approach Based on Graph Neural Networks
Mossad Helali, Essam Mansour, Ibrahim Abdelaziz, Julian Dolby, Kavitha Srinivas

TL;DR
This paper introduces KGpip, a novel meta-learning system for AutoML that leverages dataset embeddings and graph generation to improve pipeline search, demonstrating significant performance gains across diverse datasets.
Contribution
The paper presents KGpip, a new meta-learning approach that models pipeline creation as graph generation and uses dataset embeddings for better AutoML guidance.
Findings
KGpip outperforms state-of-the-art AutoML systems on 126 datasets.
Dataset embeddings effectively capture dataset similarity.
Graph-based pipeline modeling enhances pipeline diversity and quality.
Abstract
AutoML systems build machine learning models automatically by performing a search over valid data transformations and learners, along with hyper-parameter optimization for each learner. Many AutoML systems use meta-learning to guide the search for optimal pipelines. In this work, we present a novel meta-learning system called KGpip which (1) builds a database of datasets and corresponding pipelines by mining thousands of scripts with program analysis, (2) uses dataset embeddings to find similar datasets in the database based on their content instead of metadata-based features, and (3) models AutoML pipeline creation as a graph generation problem, to succinctly characterize the diverse pipelines seen for a single dataset. KGpip's meta-learning is a sub-component for AutoML systems. We demonstrate this by integrating KGpip with two AutoML systems. Our comprehensive evaluation using 126 datasets,…
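As a rough illustration of step (2), the sketch below looks up the most similar dataset in a small database by cosine similarity over embedding vectors, then returns the pipelines recorded for it. This is not KGpip's implementation: the embedding values, the dataset names, the `DATABASE` layout, and the helper `most_similar_dataset` are all hypothetical, and the embeddings are assumed to be precomputed from dataset content by some separate model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_dataset(query_embedding, database):
    """Return the name of the database entry whose embedding is closest to the query."""
    return max(
        database,
        key=lambda name: cosine_similarity(query_embedding, database[name]["embedding"]),
    )


# Hypothetical database: each entry pairs a content-based embedding (made-up values)
# with pipelines that were previously seen for that dataset (illustrative step names).
DATABASE = {
    "dataset_a": {
        "embedding": [0.9, 0.1, 0.0],
        "pipelines": [["SimpleImputer", "RandomForestClassifier"]],
    },
    "dataset_b": {
        "embedding": [0.1, 0.8, 0.3],
        "pipelines": [["StandardScaler", "GradientBoostingRegressor"]],
    },
}

# Embedding of an unseen dataset (assumed to come from the same embedding model).
query = [0.8, 0.2, 0.1]
nearest = most_similar_dataset(query, DATABASE)
candidate_pipelines = DATABASE[nearest]["pipelines"]
```

In this toy setup the pipelines of the nearest dataset serve as starting candidates for the search. The abstract describes a further step, generating pipeline graphs, which this sketch does not attempt.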
Taxonomy
Topics: Machine Learning and Data Classification · Software Testing and Debugging Techniques · Software Engineering Research
