Representation Learning on Out of Distribution in Tabular Data
Achmad Ginanjar, Xue Li, Priyanka Singh, Wen Hua

TL;DR
This paper introduces TCL, a lightweight contrastive learning method for tabular data that effectively handles out-of-distribution data on standard CPU hardware, outperforming existing models in classification tasks.
Contribution
The paper presents TCL, a contrastive learning approach tailored for tabular data that runs efficiently on standard CPU hardware and improves OOD detection and generalization.
Findings
TCL outperforms FT-Transformer and ResNet in classification accuracy.
TCL maintains competitive regression performance.
TCL requires significantly fewer computational resources.
Abstract
The open-world assumption in model development suggests that a model might lack sufficient information to adequately handle data that is entirely distinct or out of distribution (OOD). While deep learning methods have shown promising results in handling OOD data through generalization techniques, they often require specialized hardware that may not be accessible to all users. We present TCL, a lightweight yet effective solution that operates efficiently on standard CPU hardware. Our approach adapts contrastive learning principles specifically for tabular data structures, incorporating full matrix augmentation and simplified loss calculation. Through comprehensive experiments across 10 diverse datasets, we demonstrate that TCL outperforms existing models, including FT-Transformer and ResNet, particularly in classification tasks, while maintaining competitive performance in regression…
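The abstract names two ingredients, full matrix augmentation and a simplified contrastive loss, without giving their exact form. As a rough illustration only, the sketch below perturbs every cell of a feature matrix with Gaussian noise and scores two augmented views with a standard InfoNCE loss. The noise model, the InfoNCE formulation, the linear encoder `w`, and all function names and parameter values are assumptions made for this example; they are not taken from the paper and may differ from what TCL actually does.

```python
import numpy as np


def augment_full_matrix(x, noise_std=0.1, rng=None):
    """Add Gaussian noise to every cell of the matrix (assumed augmentation)."""
    rng = np.random.default_rng() if rng is None else rng
    return x + rng.normal(0.0, noise_std, size=x.shape)


def l2_normalize(z, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return z / np.maximum(np.linalg.norm(z, axis=1, keepdims=True), eps)


def info_nce_loss(z_a, z_b, temperature=0.5):
    """Generic InfoNCE: row i of z_a is the positive for row i of z_b."""
    z_a, z_b = l2_normalize(z_a), l2_normalize(z_b)
    logits = z_a @ z_b.T / temperature
    # Subtract the row maximum for numerical stability before exponentiating.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


rng = np.random.default_rng(0)
x = rng.normal(size=(8, 5))   # toy table: 8 rows, 5 numeric features
w = rng.normal(size=(5, 4))   # hypothetical linear encoder, stands in for a trained network

view_a = augment_full_matrix(x, rng=rng)
view_b = augment_full_matrix(x, rng=rng)
loss = info_nce_loss(view_a @ w, view_b @ w)
```

Everything here runs in NumPy on a CPU, which matches the hardware setting the abstract describes. In practice the encoder would be trained by minimizing a loss of this kind over batches.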
Taxonomy
Topics: Text and Document Classification Technologies · Data Mining Algorithms and Applications · Advanced Clustering Algorithms Research
Methods: Average Pooling · Convolution · Global Average Pooling · Kaiming Initialization · Contrastive Learning · Max Pooling · FT-Transformer
