Evaluation of a Tree-based Pipeline Optimization Tool for Automating Data Science
Randal S. Olson, Nathan Bartley, Ryan J. Urbanowicz, Jason H. Moore

TL;DR
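This paper presents TPOT, an open-source Python tool that uses tree-based optimization to design machine learning pipelines automatically. On simulated and real-world benchmark data sets, its pipelines improve on a basic machine learning analysis while requiring little to no input or prior knowledge from the user, and Pareto optimization keeps them compact.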
Contribution
Introduction of TPOT, an open-source Python tool that automates pipeline design using tree-based optimization and Pareto efficiency to enhance performance and simplicity.
Findings
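TPOT designs pipelines that significantly improve on a basic machine learning analysis across simulated and real-world benchmark data sets.
Pareto optimization produces more compact pipelines without sacrificing classification accuracy.
TPOT requires little to no input or prior knowledge from the user.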
Abstract
As the field of data science continues to grow, there will be an ever-increasing demand for tools that make machine learning accessible to non-experts. In this paper, we introduce the concept of tree-based pipeline optimization for automating one of the most tedious parts of machine learning: pipeline design. We implement an open source Tree-based Pipeline Optimization Tool (TPOT) in Python and demonstrate its effectiveness on a series of simulated and real-world benchmark data sets. In particular, we show that TPOT can design machine learning pipelines that provide a significant improvement over a basic machine learning analysis while requiring little to no input or prior knowledge from the user. We also address the tendency for TPOT to design overly complex pipelines by integrating Pareto optimization, which produces compact pipelines without sacrificing classification accuracy. As…
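The Pareto optimization mentioned in the abstract trades off two objectives: classification accuracy (higher is better) and pipeline size (lower is better). The following is a minimal sketch of that idea, not TPOT's actual implementation. The `Candidate` type, the `dominates` and `pareto_front` helpers, and all accuracy and operator-count values are hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A hypothetical candidate pipeline with two objectives."""
    name: str
    accuracy: float    # classification accuracy, higher is better
    n_operators: int   # number of pipeline operators, lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if `a` is no worse than `b` on both objectives
    and strictly better on at least one."""
    no_worse = a.accuracy >= b.accuracy and a.n_operators <= b.n_operators
    strictly_better = a.accuracy > b.accuracy or a.n_operators < b.n_operators
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Made-up values for illustration only; not results from the paper.
candidates = [
    Candidate("A", 0.90, 8),  # same accuracy as B but larger: dominated
    Candidate("B", 0.90, 3),
    Candidate("C", 0.85, 1),
    Candidate("D", 0.80, 2),  # less accurate and larger than C: dominated
    Candidate("E", 0.92, 6),
]

front = pareto_front(candidates)
print([c.name for c in front])
```

In this toy population, pipeline A is discarded because B reaches the same accuracy with fewer operators, which mirrors how a Pareto criterion can favor compact pipelines without giving up accuracy.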
Taxonomy
Topics: Advanced Multi-Objective Optimization Algorithms · Machine Learning and Data Classification · Metaheuristic Optimization Algorithms Research
