TabTransformer: Tabular Data Modeling Using Contextual Embeddings

Xin Huang; Ashish Khetan; Milan Cvitkovic; Zohar Karnin

arXiv:2012.06678 · cs.LG · December 15, 2020 · 180 cites

TL;DR

TabTransformer is a self-attention-based architecture for tabular data that improves prediction accuracy, robustness, and interpretability. It outperforms existing deep learning models for tabular data and matches the performance of tree-based ensemble models.

Contribution

TabTransformer is presented as the first application of Transformer-based contextual embeddings to tabular data, improving performance and robustness in both supervised and semi-supervised learning.

Findings

01. Outperforms state-of-the-art deep learning methods by at least 1.0% in mean AUC.

02. Matches the performance of tree-based ensemble models.

03. Semi-supervised pre-training yields an average 2.1% AUC improvement.
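The findings above are stated in terms of mean AUC across datasets. As a rough illustration of that metric (not the paper's evaluation code), the sketch below computes AUC as the fraction of positive/negative pairs ranked correctly, counting ties as half, and then averages over datasets. The function names `auc` and `mean_auc` and the toy labels and scores are hypothetical, made up only for this example.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    # Count how often a positive example is scored above a negative one.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def mean_auc(per_dataset):
    """Unweighted average of per-dataset AUC values."""
    return sum(auc(y, s) for y, s in per_dataset) / len(per_dataset)


# Toy data (illustrative only, not from the paper): two tiny "datasets".
toy = [
    ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),  # AUC = 0.75
    ([0, 1, 0, 1], [0.2, 0.9, 0.3, 0.6]),   # AUC = 1.0
]
print(mean_auc(toy))  # 0.875
```

The pairwise form is quadratic in the number of examples, which is fine for illustration; a sort-based computation would be the usual choice for real datasets.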

Abstract

We propose TabTransformer, a novel deep tabular data modeling architecture for supervised and semi-supervised learning. The TabTransformer is built upon self-attention based Transformers. The Transformer layers transform the embeddings of categorical features into robust contextual embeddings to achieve higher prediction accuracy. Through extensive experiments on fifteen publicly available datasets, we show that the TabTransformer outperforms the state-of-the-art deep learning methods for tabular data by at least 1.0% on mean AUC, and matches the performance of tree-based ensemble models. Furthermore, we demonstrate that the contextual embeddings learned from TabTransformer are highly robust against both missing and noisy data features, and provide better interpretability. Lastly, for the semi-supervised setting we develop an unsupervised pre-training procedure to learn data-driven…
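To make the idea of contextual embeddings for categorical features concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes one embedding table per categorical column and applies a single self-attention head with a residual connection and layer normalization; the real architecture would use multi-head attention and feed-forward sublayers stacked over several layers. All names (`contextual_embeddings`, `tables`, `w_q`, `w_k`, `w_v`), the dimensions, and the random weights are assumptions chosen for illustration.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    """Normalize the last axis to zero mean and unit variance (no learned scale/shift)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def contextual_embeddings(cat_codes, tables, w_q, w_k, w_v):
    """cat_codes: (batch, n_cols) integer codes; tables[j]: (cardinality_j, d)."""
    # Look up one embedding per categorical column: (batch, n_cols, d).
    x = np.stack([tables[j][cat_codes[:, j]] for j in range(len(tables))], axis=1)
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Each column attends to every column in the same row.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(x.shape[-1])
    attn = softmax(scores, axis=-1)           # (batch, n_cols, n_cols)
    return layer_norm(x + attn @ v), attn     # residual connection + layer norm


rng = np.random.default_rng(0)
cardinalities, d = [4, 3, 5], 8               # assumed: 3 categorical columns, width 8
tables = [rng.normal(size=(c, d)) for c in cardinalities]
w_q, w_k, w_v = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))

# Two rows that share the same category in column 0 but differ elsewhere.
cat_codes = np.array([[0, 2, 4], [0, 1, 0]])
ctx, attn = contextual_embeddings(cat_codes, tables, w_q, w_k, w_v)
print(ctx.shape)  # (2, 3, 8)
```

In this sketch both rows start from the same raw embedding for column 0, yet their outputs `ctx[0, 0]` and `ctx[1, 0]` differ because each depends on the other columns in its row. That row-dependence is what "contextual" refers to here.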

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Explainable Artificial Intelligence (XAI)

Methods: Linear Layer · TabTransformer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Softmax · Multi-Head Attention · Residual Connection · Attention Is All You Need · Byte Pair Encoding · Layer Normalization