Comparing Task-Agnostic Embedding Models for Tabular Data
Frederik Hoppe, Lars Kleinemeier, Astrid Franz, Udo Göbel

TL;DR
This paper evaluates task-agnostic embeddings extracted from tabular foundation models and finds that simple feature engineering often matches or outperforms them on outlier detection and supervised learning tasks at lower computational cost.
Contribution
It systematically compares learned embeddings from foundation models with classical methods across multiple tasks, highlighting the efficiency of simple feature engineering.
Findings
Simple feature engineering matches or exceeds foundation models in performance.
Classical methods require fewer computational resources.
Foundation models do not significantly outperform traditional approaches.
Abstract
Recent foundation models for tabular data achieve strong task-specific performance via in-context learning. Nevertheless, they focus on direct prediction by encapsulating both representation learning and task-specific inference inside a single, resource-intensive network. This work specifically focuses on representation learning, i.e., on transferable, task-agnostic embeddings. We systematically evaluate task-agnostic representations extracted from tabular foundation models (TabPFN, TabICL and TabSTAR) alongside classical feature engineering (TableVectorizer and a sphere model) across a variety of application tasks such as outlier detection (ADBench) and supervised learning (TabArena Lite). We find that simple feature engineering methods achieve comparable or superior performance while requiring significantly fewer computational resources than tabular foundation models.
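To make the idea of a task-agnostic embedding concrete, here is a minimal sketch in plain Python. It is not the paper's pipeline: standardization plus one-hot encoding stands in as an assumed, simplified form of classical feature engineering, and the toy rows, column names, and function names (`fit_embedder`, `embed`, `knn_outlier_score`, `nn_predict`) are all hypothetical. The point it illustrates is that one embedding, computed without looking at labels, can be reused for both outlier detection and supervised prediction.

```python
import math
from statistics import mean, pstdev


def fit_embedder(rows, numeric_cols, categorical_cols):
    """Learn per-column statistics from rows (a list of dicts); no labels are used."""
    stats = {}
    for col in numeric_cols:
        values = [row[col] for row in rows]
        std = pstdev(values)
        # Guard against constant columns to avoid division by zero.
        stats[col] = (mean(values), std if std > 0 else 1.0)
    vocab = {col: sorted({row[col] for row in rows}) for col in categorical_cols}
    return stats, vocab


def embed(row, stats, vocab):
    """Standardize numeric columns and one-hot encode categorical columns."""
    vec = [(row[col] - mu) / sd for col, (mu, sd) in stats.items()]
    for col, categories in vocab.items():
        # Categories unseen during fitting map to an all-zero block.
        vec.extend(1.0 if row[col] == cat else 0.0 for cat in categories)
    return vec


def knn_outlier_score(x, reference, k=2):
    """Mean distance to the k nearest reference embeddings; higher means more anomalous."""
    distances = sorted(math.dist(x, ref) for ref in reference)
    return sum(distances[:k]) / k


def nn_predict(x, reference, labels):
    """Label of the single nearest reference embedding (1-nearest-neighbor)."""
    nearest = min(range(len(reference)), key=lambda i: math.dist(x, reference[i]))
    return labels[nearest]


# Hypothetical toy table, invented purely for illustration.
train = [
    {"age": 25, "income": 30.0, "city": "A"},
    {"age": 27, "income": 32.0, "city": "A"},
    {"age": 26, "income": 31.0, "city": "A"},
    {"age": 52, "income": 80.0, "city": "B"},
    {"age": 55, "income": 85.0, "city": "B"},
    {"age": 53, "income": 82.0, "city": "B"},
]
labels = ["low", "low", "low", "high", "high", "high"]

stats, vocab = fit_embedder(train, ["age", "income"], ["city"])
train_emb = [embed(row, stats, vocab) for row in train]

typical = embed({"age": 26, "income": 31.5, "city": "A"}, stats, vocab)
unusual = embed({"age": 90, "income": 5.0, "city": "B"}, stats, vocab)

# The same embedding serves two different downstream tasks.
typical_score = knn_outlier_score(typical, train_emb)
unusual_score = knn_outlier_score(unusual, train_emb)
predicted = nn_predict(typical, train_emb, labels)
```

In this sketch the embedder is fitted once and the downstream heads (a k-nearest-neighbor distance score and a 1-nearest-neighbor classifier) are deliberately lightweight, so any difference in downstream quality would come from the representation rather than from the task-specific model.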
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning · Explainable Artificial Intelligence (XAI)
