TabRepo: A Large Scale Repository of Tabular Model Evaluations and its AutoML Applications

David Salinas; Nick Erickson

arXiv:2311.02971·cs.LG·August 27, 2024·2 cites

TL;DR

TabRepo is a comprehensive dataset of tabular model evaluations that enables advanced analysis, transfer-learning, and improvements over current AutoML systems in accuracy and efficiency.

Contribution

We present TabRepo, a large-scale repository of model evaluations that facilitates analysis, transfer-learning, and enhances AutoML performance.

Findings

01. Enables comparison of Hyperparameter Optimization and AutoML systems.

02. Facilitates transfer-learning to outperform state-of-the-art systems.

03. Improves accuracy, runtime, and latency through transfer-learning techniques.

Abstract

We introduce TabRepo, a new dataset of tabular model evaluations and predictions. TabRepo contains the predictions and metrics of 1310 models evaluated on 200 classification and regression datasets. We illustrate the benefit of our dataset in multiple ways. First, we show that it enables analyses such as comparing Hyperparameter Optimization against current AutoML systems, while also accounting for ensembling at marginal cost by using precomputed model predictions. Second, we show that our dataset can be readily leveraged to perform transfer-learning. In particular, we show that applying standard transfer-learning techniques outperforms current state-of-the-art tabular systems in accuracy, runtime and latency.
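To illustrate why ensembling is cheap once predictions are stored, here is a minimal sketch of greedy forward ensemble selection over cached validation predictions. The abstract does not say which ensembling method is used, so the greedy-selection procedure is an assumption, as are the model names, the synthetic data, and the helper functions (`greedy_ensemble`, `ensemble_predict`). None of this is TabRepo's API. No model is retrained or re-evaluated at any step: only stored prediction vectors are averaged and scored.

```python
import random


def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def greedy_ensemble(predictions, y_true, n_rounds=10):
    """Greedy forward selection (with replacement) over cached predictions.

    predictions: dict mapping a model name to its stored validation predictions.
    Returns a dict of model name -> ensemble weight; the weights sum to 1.
    """
    names = sorted(predictions)
    running_sum = [0.0] * len(y_true)  # sum of predictions picked so far
    counts = {name: 0 for name in names}
    for step in range(1, n_rounds + 1):
        best_name, best_err = None, float("inf")
        for name in names:
            # Score the ensemble obtained by adding this model one more time.
            candidate = [(s + p) / step
                         for s, p in zip(running_sum, predictions[name])]
            err = mean_squared_error(y_true, candidate)
            if err < best_err:
                best_name, best_err = name, err
        counts[best_name] += 1
        running_sum = [s + p
                       for s, p in zip(running_sum, predictions[best_name])]
    return {name: c / n_rounds for name, c in counts.items() if c > 0}


def ensemble_predict(predictions, weights):
    """Weighted average of the cached predictions of the selected models."""
    n = len(next(iter(predictions.values())))
    return [sum(w * predictions[m][i] for m, w in weights.items())
            for i in range(n)]


# Synthetic stand-in for stored predictions (hypothetical models and data).
random.seed(0)
y_val = [random.gauss(0, 1) for _ in range(200)]
cached = {
    "model_a": [y + random.gauss(0, 0.5) for y in y_val],
    "model_b": [y + random.gauss(0, 0.5) for y in y_val],
    "model_c": [y + random.gauss(0, 2.0) for y in y_val],
}

weights = greedy_ensemble(cached, y_val, n_rounds=10)
ens_err = mean_squared_error(y_val, ensemble_predict(cached, weights))
best_single = min(mean_squared_error(y_val, p) for p in cached.values())
print(weights, ens_err, best_single)
```

Because squared error is convex in the predictions and the first round selects the best single model, the selected ensemble in this sketch is never worse than that model on the data used for selection. Each round costs only a pass over the cached vectors.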

Taxonomy

Topics: Machine Learning and Data Classification · Model-Driven Software Engineering Techniques