Cross-Table Pretraining towards a Universal Function Space for Heterogeneous Tabular Data
Jintai Chen, Zhen Lin, Qiyuan Chen, Jimeng Sun

TL;DR
This paper introduces XTFormer, a cross-table pretrained Transformer that learns a universal function space for diverse tabular data, significantly improving performance across numerous downstream prediction tasks.
Contribution
The paper proposes XTFormer, a novel cross-table pretraining method that creates a meta-function space for versatile tabular prediction, addressing diversity and data scarcity challenges.
Findings
XTFormer outperforms XGBoost and CatBoost on 72% of tasks.
It surpasses FT-Transformer and XTab on over 75% of tasks.
It achieves state-of-the-art results across 190 downstream tasks.
Abstract
Tabular data from different tables exhibit significant diversity due to varied definitions and types of features, as well as complex inter-feature and feature-target relationships. Cross-dataset pretraining, which learns reusable patterns from upstream data to support downstream tasks, has shown notable success in various fields. Yet, when applied to tabular data prediction, this paradigm faces challenges due to the limited reusable patterns among diverse tabular datasets (tables) and the general scarcity of tabular data available for fine-tuning. In this study, we fill this gap by introducing a cross-table pretrained Transformer, XTFormer, for versatile downstream tabular prediction tasks. Our methodological insight is to pretrain XTFormer to establish a "meta-function" space that encompasses all potential feature-target mappings. In pretraining, a variety of potential mappings are…
Taxonomy
Topics: Advanced Clustering Algorithms Research · Image Processing and 3D Reconstruction · Data Mining Algorithms and Applications
Methods: Attention Is All You Need · Softmax · Layer Normalization · Linear Layer · FT-Transformer · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Label Smoothing · Adam · Residual Connection
