A Closer Look at Deep Learning Methods on Tabular Datasets
Han-Jia Ye, Si-Yang Liu, Hao-Run Cai, Qi-Le Zhou, De-Chuan Zhan

TL;DR
This study systematically evaluates deep learning methods on a large collection of tabular datasets, revealing insights into their performance, the impact of dataset heterogeneity, and practical guidance for method selection.
Contribution
The paper introduces TALENT, a benchmark of 300+ tabular datasets, and analyzes the factors that influence deep learning performance on tabular data.
Findings
Pretrained tabular models match or surpass tree-based methods on many tasks, though gradient-boosted trees remain strong baselines.
Ensembling improves both tree-based and neural approaches.
Dataset heterogeneity influences method effectiveness.
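The summary does not say how the study forms its ensembles, so the following is only a minimal sketch of one common scheme, soft voting, in which class probabilities from several models are averaged before picking a label. The function names and the example probabilities are hypothetical and are not taken from the paper or from TALENT.

```python
from typing import List, Sequence


def average_probabilities(
    per_model_probs: Sequence[Sequence[Sequence[float]]],
) -> List[List[float]]:
    """Average class probabilities across models (soft voting).

    per_model_probs[m][i][c] is model m's probability that sample i
    belongs to class c. All models must cover the same samples and classes.
    """
    if not per_model_probs:
        raise ValueError("need predictions from at least one model")
    n_models = len(per_model_probs)
    n_samples = len(per_model_probs[0])
    n_classes = len(per_model_probs[0][0])
    averaged = []
    for i in range(n_samples):
        row = [
            sum(per_model_probs[m][i][c] for m in range(n_models)) / n_models
            for c in range(n_classes)
        ]
        averaged.append(row)
    return averaged


def predict_labels(probs: Sequence[Sequence[float]]) -> List[int]:
    """Return the index of the highest-probability class for each sample."""
    return [max(range(len(row)), key=row.__getitem__) for row in probs]


# Hypothetical predictions from three models on two samples, two classes.
model_a = [[0.6, 0.4], [0.3, 0.7]]
model_b = [[0.4, 0.6], [0.2, 0.8]]
model_c = [[0.7, 0.3], [0.6, 0.4]]

ensemble_probs = average_probabilities([model_a, model_b, model_c])
ensemble_labels = predict_labels(ensemble_probs)
```

In this toy example model_b alone would label the first sample as class 1, while the averaged prediction sides with the other two models. Whether averaging helps on a real dataset depends on how diverse and how well calibrated the individual models are.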
Abstract
Tabular data is prevalent across diverse domains in machine learning. With the rapid progress of deep tabular prediction methods, especially pretrained (foundation) models, there is a growing need to evaluate these methods systematically and to understand their behavior. We present an extensive study on TALENT, a collection of 300+ datasets spanning broad ranges of size, feature composition (numerical/categorical mixes), domains, and output types (binary, multi-class, regression). Our evaluation shows that ensembling benefits both tree-based and neural approaches. Traditional gradient-boosted trees remain very strong baselines, yet recent pretrained tabular models now match or surpass them on many tasks, narrowing, but not eliminating, the historical advantage of tree ensembles. Despite architectural diversity, top performance concentrates within a small subset of models, providing…
