TabFlex: Scaling Tabular Learning to Millions with Linear Attention

Yuchen Zeng; Tuan Dinh; Wonjun Kang; Andreas C Mueller

arXiv:2506.05584·cs.LG·June 9, 2025

TL;DR

TabFlex is a scalable, efficient model for large-scale tabular learning that uses linear attention in place of quadratic self-attention, processing massive datasets quickly with competitive accuracy.

Contribution

This work introduces TabFlex, which scales TabPFN-style tabular classification to millions of samples by replacing quadratic self-attention with linear attention, and runs faster than existing methods.

Findings

01. TabFlex processes large datasets in seconds; for example, it handles the poker-hand dataset (over a million samples) in 5 seconds.

02. TabFlex achieves over a 2x speedup compared to TabPFN and a 1.5x speedup over XGBoost.

03. TabFlex outperforms 25 baselines in efficiency across diverse datasets.

Abstract

Leveraging the in-context learning (ICL) capability of Large Language Models (LLMs) for tabular classification has gained significant attention for its training-free adaptability across diverse datasets. Recent advancements, like TabPFN, excel in small-scale tabular datasets but struggle to scale for large and complex datasets. Our work enhances the efficiency and scalability of TabPFN for larger datasets by incorporating linear attention mechanisms as a scalable alternative to complexity-quadratic self-attention. Our model, TabFlex, efficiently handles tabular datasets with thousands of features and hundreds of classes, scaling seamlessly to millions of samples. For instance, TabFlex processes the poker-hand dataset with over a million samples in just 5 seconds. Our extensive evaluations demonstrate that TabFlex can achieve over a 2x speedup compared to TabPFN and a 1.5x speedup over…
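The abstract's contrast between quadratic self-attention and linear attention can be illustrated with a small NumPy sketch. This is not the TabFlex implementation: the feature map `elu(x) + 1`, the function names, and the single-head, non-causal setup are all assumptions chosen for illustration. The point is only that a kernelized attention can be computed without ever forming the n-by-n attention matrix, so cost grows linearly with the number of samples.

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1, a positive feature map (an assumption for this sketch,
    # not confirmed as the choice made in TabFlex).
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def linear_attention(Q, K, V):
    # Q: (n_q, d), K: (n_k, d), V: (n_k, d_v)
    Qf, Kf = feature_map(Q), feature_map(K)
    kv = Kf.T @ V              # (d, d_v) summary, cost linear in n_k
    z = Kf.sum(axis=0)         # (d,) normalizer summary
    return (Qf @ kv) / (Qf @ z)[:, None]

def quadratic_reference(Q, K, V):
    # Same kernelized attention, but forming the full (n_q, n_k) matrix.
    Qf, Kf = feature_map(Q), feature_map(K)
    A = Qf @ Kf.T
    return (A / A.sum(axis=1, keepdims=True)) @ V

rng = np.random.default_rng(0)
Q = rng.standard_normal((5, 4))
K = rng.standard_normal((7, 4))
V = rng.standard_normal((7, 3))
out = linear_attention(Q, K, V)
```

Both functions return the same values; they differ only in the order of the matrix products. The linear form keeps a `(d, d_v)` summary of the keys and values, which is why memory and time no longer scale with the square of the sample count.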

Videos

TabFlex: Scaling Tabular Learning to Millions with Linear Attention · SlidesLive

Taxonomy

Topics: Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning · Topic Modeling