End-to-End Compression for Tabular Foundation Models
Guri Zabërgja, Rafiq Kamel, Arlind Kadra, Christian M. M. Frey, Josif Grabocka

TL;DR
TACO is a novel dataset compression method for tabular foundation models that significantly reduces inference time and memory usage while maintaining performance, enabling scalable and efficient in-context learning.
Contribution
TACO introduces an end-to-end dataset compression approach for tabular foundation models: it compresses the training dataset in a latent space, improving scalability and efficiency over existing transformer-based methods.
Findings
Up to 94x faster inference
Up to 97% less memory consumption
Maintains performance without significant degradation
Abstract
The long-standing dominance of gradient-boosted decision trees for tabular data has recently been challenged by in-context learning tabular foundation models. In-context learning methods fit and predict in one forward pass without parameter updates by leveraging the training data as context for predicting on query test points. While recent tabular foundation models achieve state-of-the-art performance, their transformer architecture based on the attention mechanism has quadratic complexity regarding dataset size, which in turn increases the overhead on training and inference time, and limits the capacity of the models to handle large-scale datasets. In this work, we propose TACO, an end-to-end tabular compression model that compresses the training dataset in a latent space. We test our method on the TabArena benchmark, where our proposed method is up to 94x faster in inference time,…
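To make the quadratic-cost argument concrete, here is a minimal sketch that counts attention scores with and without a compressed context. It assumes a generic latent bottleneck in which a small set of latent vectors cross-attends to the training rows; this is an illustrative stand-in, not necessarily the architecture TACO uses. The sizes (`n_train`, `n_latent`, `d`) and the random `latents` array are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(queries, keys, values):
    # Scaled dot-product attention. Also returns the shape of the
    # score matrix, which is (len(queries), len(keys)).
    scores = queries @ keys.T / np.sqrt(queries.shape[-1])
    return softmax(scores) @ values, scores.shape

rng = np.random.default_rng(0)
n_train, n_latent, d = 1000, 64, 16          # hypothetical sizes
train = rng.normal(size=(n_train, d))        # embedded training rows
latents = rng.normal(size=(n_latent, d))     # stand-in for learned latent vectors

# Self-attention over the raw context: n_train x n_train scores.
_, full_shape = attention(train, train, train)

# Assumed compression step: latents attend to the training rows,
# producing n_latent summary vectors (n_latent x n_train scores).
compressed, comp_shape = attention(latents, train, train)

# Self-attention over the compressed context: n_latent x n_latent scores.
_, latent_shape = attention(compressed, compressed, compressed)

full_cost = full_shape[0] * full_shape[1]
compressed_cost = comp_shape[0] * comp_shape[1] + latent_shape[0] * latent_shape[1]
print(full_cost, compressed_cost)
```

With these sizes the raw context needs 1,000,000 attention scores, while the compressed path needs 68,096. The compression step is linear in the number of training rows, and only the small latent set pays the quadratic cost.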
Taxonomy
Topics: Data Quality and Management · Machine Learning and Data Classification · Advanced Neural Network Applications
