Representation Learning on Out of Distribution in Tabular Data

Achmad Ginanjar; Xue Li; Priyanka Singh; Wen Hua

arXiv:2502.10095·cs.LG·May 21, 2025

Open Access

TL;DR

This paper introduces TCL, a lightweight contrastive learning method for tabular data that effectively handles out-of-distribution data on standard CPU hardware, outperforming existing models in classification tasks.

Contribution

The paper presents TCL, a novel contrastive learning approach tailored for tabular data that is efficient, hardware-friendly, and improves OOD detection and generalization.

Findings

01. TCL outperforms FT-Transformer and ResNet in classification accuracy.
02. TCL maintains competitive regression performance.
03. TCL requires significantly fewer computational resources, running on standard CPU hardware.

Abstract

The open-world assumption in model development suggests that a model might lack sufficient information to adequately handle data that is entirely distinct or out of distribution (OOD). While deep learning methods have shown promising results in handling OOD data through generalization techniques, they often require specialized hardware that may not be accessible to all users. We present TCL, a lightweight yet effective solution that operates efficiently on standard CPU hardware. Our approach adapts contrastive learning principles specifically for tabular data structures, incorporating full matrix augmentation and simplified loss calculation. Through comprehensive experiments across 10 diverse datasets, we demonstrate that TCL outperforms existing models, including FT-Transformer and ResNet, particularly in classification tasks, while maintaining competitive performance in regression…
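
The abstract names "full matrix augmentation" and a "simplified loss calculation" but this page does not spell out either one. As a rough illustration of the general idea only, the sketch below builds two noisy views of a tabular batch and scores them with a standard InfoNCE-style contrastive loss. Everything in it is an assumption rather than the paper's method: the Gaussian noise applied to every cell, the temperature, the absence of an encoder network, and the hypothetical helper names `augment`, `normalize`, and `info_nce`.

```python
import math
import random


def augment(matrix, noise_std, rng):
    """Add Gaussian noise to every cell of the matrix (assumed augmentation)."""
    return [[x + rng.gauss(0.0, noise_std) for x in row] for row in matrix]


def normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def info_nce(view_a, view_b, temperature=0.5):
    """InfoNCE-style loss: row i of view_a should match row i of view_b.

    This is a standard contrastive loss used as a stand-in, not TCL's
    simplified loss, which is not described on this page.
    """
    a = [normalize(row) for row in view_a]
    b = [normalize(row) for row in view_b]
    total = 0.0
    for i, anchor in enumerate(a):
        # Similarity of the anchor to every row of the other view.
        logits = [
            sum(p * q for p, q in zip(anchor, other)) / temperature
            for other in b
        ]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the matching row as the correct class.
        total += log_denominator - logits[i]
    return total / len(a)


rng = random.Random(0)
# Hypothetical batch: 16 rows, 8 numeric features each.
batch = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(16)]
view_a = augment(batch, 0.1, rng)
view_b = augment(batch, 0.1, rng)

loss_matched = info_nce(view_a, view_b)
# Misalign the second view so each anchor is paired with the wrong row.
loss_shuffled = info_nce(view_a, view_b[1:] + view_b[:1])
```

In this toy setup the loss is lower when the two views are aligned row by row than when one view is shifted, which is the signal a contrastive objective uses to pull together representations of the same record. A real pipeline would pass both views through a learned encoder before computing the loss.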

Taxonomy

Topics: Text and Document Classification Technologies · Data Mining Algorithms and Applications · Advanced Clustering Algorithms Research

Methods: Average Pooling · Convolution · Global Average Pooling · Kaiming Initialization · Contrastive Learning · Max Pooling · FT-Transformer