MultiTab: A Comprehensive Benchmark Suite for Multi-Dimensional Evaluation in Tabular Domains

Kyungeun Lee; Moonjung Eo; Hye-Seung Cho; Dongmin Kim; Ye Seul Sim; Seoyoon Kim; Min-Kook Suh; Woohyung Lim

arXiv:2505.14312 · cs.LG · May 21, 2025

TL;DR

MultiTab introduces a benchmark suite that categorizes datasets by data characteristics to evaluate tabular models across diverse regimes, revealing how model performance varies with data properties.

Contribution

It provides a comprehensive, data-aware evaluation framework for tabular learning algorithms, highlighting the importance of regime-specific analysis for model selection and design.

Findings

01. Model performance varies significantly across data regimes.

02. Sample-level similarity models excel with large samples or high feature correlation.

03. Inter-feature dependency models perform best with weakly correlated features.

Abstract

Despite the widespread use of tabular data in real-world applications, most benchmarks rely on average-case metrics, which fail to reveal how model behavior varies across diverse data regimes. To address this, we propose MultiTab, a benchmark suite and evaluation framework for multi-dimensional, data-aware analysis of tabular learning algorithms. Rather than comparing models only in aggregate, MultiTab categorizes 196 publicly available datasets along key data characteristics, including sample size, label imbalance, and feature interaction, and evaluates 13 representative models spanning a range of inductive biases. Our analysis shows that model performance is highly sensitive to such regimes: for example, models using sample-level similarity excel on datasets with large sample sizes or high inter-feature correlation, while models encoding inter-feature dependencies perform best with…
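
The regime-based categorization described in the abstract can be illustrated with a minimal sketch. It computes three simple dataset statistics (sample size, label imbalance, and mean absolute pairwise feature correlation) and bins a dataset into coarse regimes. The function names `describe` and `regimes`, the statistic definitions, and all thresholds are assumptions chosen for illustration; they are not the criteria MultiTab itself uses.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / sqrt(vx * vy)


def describe(rows, labels):
    """Summarize a numeric tabular dataset (hypothetical statistics)."""
    columns = list(zip(*rows))  # transpose rows into feature columns
    counts = Counter(labels)
    pairs = list(combinations(columns, 2))
    mean_abs_corr = (
        sum(abs(pearson(a, b)) for a, b in pairs) / len(pairs) if pairs else 0.0
    )
    return {
        "n_samples": len(rows),
        # minority / majority class count; 1.0 means perfectly balanced
        "imbalance_ratio": min(counts.values()) / max(counts.values()),
        "mean_abs_corr": mean_abs_corr,
    }


def regimes(stats, n_large=10_000, balanced_min=0.5, corr_high=0.5):
    """Bin the statistics into regimes; all thresholds are illustrative."""
    return {
        "sample_size": "large" if stats["n_samples"] >= n_large else "small",
        "label_balance": (
            "balanced" if stats["imbalance_ratio"] >= balanced_min else "imbalanced"
        ),
        "feature_correlation": (
            "high" if stats["mean_abs_corr"] >= corr_high else "low"
        ),
    }


# Toy dataset: four samples, three numeric features, binary labels.
rows = [(1.0, 2.0, 5.0), (2.0, 4.0, 1.0), (3.0, 6.0, 4.0), (4.0, 8.0, 2.0)]
labels = [0, 0, 0, 1]
print(regimes(describe(rows, labels)))
```

Grouping per-dataset model scores by these regime labels, rather than averaging over all datasets, is the kind of data-aware comparison the abstract argues for.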

Taxonomy

Topics: Machine Learning and Data Classification · Imbalanced Data Classification Techniques · Adversarial Robustness in Machine Learning