A Comprehensive Benchmark of Machine and Deep Learning Across Diverse Tabular Datasets
Assaf Shmuel, Oren Glickman, Teddy Lazebnik

TL;DR
This paper benchmarks traditional machine learning and deep learning models across 111 diverse tabular datasets, identifies the conditions under which DL models outperform traditional methods, and trains a model that predicts those scenarios.
Contribution
It offers a comprehensive comparison of 20 models on diverse datasets and introduces a predictive model for DL success scenarios, filling gaps in existing benchmarks.
Findings
DL models outperform traditional methods on specific datasets
A predictive model for DL success achieves 86.1% accuracy
Insights into dataset characteristics favoring DL models
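The "predictive model for DL success" finding implies a labeling step: each dataset must first be marked as one where DL wins or loses, and a predictor is then scored against those labels. The sketch below illustrates only that bookkeeping, under stated assumptions. This summary does not say how the authors defined a DL win, which metric they used, or which dataset features fed their predictor, so the model names, the "best DL score strictly beats best non-DL score" rule, and the helper functions are all hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical placeholder names; NOT the paper's actual model list.
DL_MODELS: Set[str] = {"mlp", "resnet"}


def dl_wins(scores: Dict[str, float], dl_models: Set[str] = DL_MODELS) -> bool:
    """Assumed rule: DL 'wins' when its best score strictly beats the best
    non-DL score (higher is better)."""
    dl = [s for m, s in scores.items() if m in dl_models]
    other = [s for m, s in scores.items() if m not in dl_models]
    if not dl or not other:
        raise ValueError("need at least one DL and one non-DL score")
    return max(dl) > max(other)


def label_datasets(results: Dict[str, Dict[str, float]]) -> Dict[str, bool]:
    """Map each dataset name to a binary 'DL wins' label."""
    return {name: dl_wins(scores) for name, scores in results.items()}


def accuracy(predicted: List[bool], actual: List[bool]) -> float:
    """Fraction of datasets where the predicted label matches the actual one."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Illustrative, made-up scores for two hypothetical datasets.
example_results = {
    "dataset_a": {"mlp": 0.90, "gbm": 0.80},
    "dataset_b": {"mlp": 0.70, "gbm": 0.75},
}
example_labels = label_datasets(example_results)
```

A meta-model would then be trained to predict these labels from dataset characteristics, and a figure such as the reported 86.1% would correspond to `accuracy` computed on held-out datasets, assuming plain classification accuracy is the metric meant.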
Abstract
The analysis of tabular datasets is highly prevalent both in scientific research and real-world applications of Machine Learning (ML). Unlike many other ML tasks, Deep Learning (DL) models often do not outperform traditional methods in this area. Previous comparative benchmarks have shown that DL performance is frequently equivalent or even inferior to models such as Gradient Boosting Machines (GBMs). In this study, we introduce a comprehensive benchmark aimed at better characterizing the types of datasets where DL models excel. Although several important benchmarks for tabular datasets already exist, our contribution lies in the variety and depth of our comparison: we evaluate 111 datasets with 20 different models, including both regression and classification tasks. These datasets vary in scale and include both those with and without categorical variables. Importantly, our benchmark…
Taxonomy
Topics: Machine Learning and Data Classification · Anomaly Detection Techniques and Applications
