AutoML-GPT: Automatic Machine Learning with GPT
Shujian Zhang, Chengyue Gong, Lemeng Wu, Xingchao Liu, Mingyuan Zhou

TL;DR
AutoML-GPT leverages large language models to automate the entire AI training pipeline, including data processing, model selection, and hyperparameter tuning, across diverse tasks and datasets.
Contribution
This paper introduces AutoML-GPT, a novel framework that uses GPT to dynamically automate AI model training and optimization for various tasks, reducing human effort.
Findings
Effective across computer vision and NLP tasks
Automates data processing, model selection, and hyperparameter tuning
Demonstrates robustness and generality through extensive experiments
Abstract
AI tasks encompass a wide range of domains and fields. While numerous AI models have been designed for specific tasks and applications, they often require considerable human effort in finding the right model architecture, optimization algorithm, and hyperparameters. Recent advances in large language models (LLMs) like ChatGPT show remarkable capabilities in various aspects of reasoning, comprehension, and interaction. Consequently, we propose developing task-oriented prompts and automatically utilizing LLMs to automate the training pipeline. To implement this concept, we present AutoML-GPT, which employs GPT as the bridge to diverse AI models and dynamically trains models with optimized hyperparameters. AutoML-GPT dynamically takes user requests from the model and data cards and composes the corresponding prompt paragraph. Ultimately, with this prompt paragraph, AutoML-GPT will…
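The abstract says that AutoML-GPT reads a model card and a data card and composes a prompt paragraph from them. A minimal sketch of that composition step might look like the following. The card fields, class names, and the `compose_prompt` function are all hypothetical and chosen only for illustration; the paper's actual card schema and prompt wording may differ, and the sketch stops at building the prompt because the abstract above is truncated before it describes what happens next.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DataCard:
    # Hypothetical fields; not taken from the paper.
    name: str
    input_type: str
    label_space: list[str]
    metric: str


@dataclass
class ModelCard:
    # Hypothetical fields; not taken from the paper.
    name: str
    architecture: str
    hyperparameters: dict[str, object] = field(default_factory=dict)


def compose_prompt(data: DataCard, model: ModelCard, request: str) -> str:
    """Join the user request and both cards into a single prompt paragraph."""
    # Sort hyperparameters by name so the prompt is deterministic.
    hparams = ", ".join(
        f"{key}={value}" for key, value in sorted(model.hyperparameters.items())
    )
    return (
        f"Request: {request} "
        f"Dataset: {data.name} (input: {data.input_type}; "
        f"labels: {', '.join(data.label_space)}; metric: {data.metric}). "
        f"Model: {model.name} ({model.architecture}); "
        f"initial hyperparameters: {hparams or 'none given'}."
    )


if __name__ == "__main__":
    data_card = DataCard("toy-sentiment", "text", ["neg", "pos"], "accuracy")
    model_card = ModelCard(
        "toy-model", "transformer encoder", {"lr": 0.001, "epochs": 3}
    )
    print(compose_prompt(data_card, model_card, "Train a classifier."))
```

Keeping the cards as small typed records makes the prompt a pure function of its inputs, so the same cards always yield the same paragraph.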
Taxonomy
Topics: Topic Modeling · Machine Learning and Data Classification · Explainable Artificial Intelligence (XAI)
Methods: Multi-Head Attention · Attention Is All You Need · Cosine Annealing · Softmax · Adam · Layer Normalization · Linear Layer · Dropout · Discriminative Fine-Tuning · Byte Pair Encoding
