Zero-shot Meta-learning for Tabular Prediction Tasks with Adversarially Pre-trained Transformer

Yulun Wu; Doron L. Bergman

arXiv:2502.04573·cs.LG·June 11, 2025

TL;DR

This paper introduces an Adversarially Pre-trained Transformer (APT) for zero-shot tabular prediction. APT is pre-trained only on synthetic data, handles classification tasks with an arbitrary number of classes, and is reported to match or improve on existing methods while averaging under one second of runtime.

Contribution

The paper proposes a novel adversarial pre-training approach and a mixture block architecture for zero-shot tabular prediction, addressing class size limitations and enhancing generalization.

Findings

1. Matches state-of-the-art on small classification tasks
2. Enhances performance on benchmark datasets in classification and regression
3. Maintains under one second runtime on average

Abstract

We present an Adversarially Pre-trained Transformer (APT) that is able to perform zero-shot meta-learning on tabular prediction tasks without pre-training on any real-world dataset, building on the recent development of Prior-Data Fitted Networks (PFNs) and TabPFN. Specifically, APT is pre-trained with adversarial synthetic data agents, which continually shift their underlying data-generating distribution and deliberately challenge the model with different synthetic datasets. In addition, we propose a mixture block architecture that is able to handle classification tasks with an arbitrary number of classes, addressing the class size limitation, a crucial weakness of prior deep tabular zero-shot learners. In experiments, we show that our framework matches state-of-the-art performance on small classification tasks without filtering on dataset characteristics such as number of classes and…
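
The adversarial pre-training idea in the abstract can be illustrated with a toy min-max loop. The sketch below is not APT's actual procedure, which the abstract does not specify. It assumes a hypothetical adversary that keeps a softmax distribution over a few synthetic data generators and upweights whichever generator currently produces high loss, while a small logistic model takes gradient steps on the sampled datasets. All names here (`make_dataset`, `run_toy_adversarial_loop`, the learning rates) are illustrative assumptions.

```python
import numpy as np


def make_dataset(rng, angle, n=64):
    """Hypothetical synthetic task: 2-D Gaussian points labelled by a linear boundary."""
    X = rng.normal(size=(n, 2))
    direction = np.array([np.cos(angle), np.sin(angle)])
    y = (X @ direction > 0).astype(float)
    return X, y


def loss_and_grad(w, X, y):
    """Mean logistic loss of a linear model and its gradient with respect to w."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    eps = 1e-12  # avoids log(0)
    loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    grad = X.T @ (p - y) / len(y)
    return loss, grad


def softmax(logits):
    z = np.exp(logits - logits.max())
    return z / z.sum()


def run_toy_adversarial_loop(steps=200, seed=0, model_lr=0.5, agent_lr=0.1):
    """Toy min-max loop: the model descends its loss, the adversary shifts
    its sampling distribution toward generators the model handles poorly."""
    rng = np.random.default_rng(seed)
    angles = np.linspace(0.0, np.pi, 8, endpoint=False)  # candidate generators
    agent_logits = np.zeros(len(angles))
    w = np.zeros(2)

    for _ in range(steps):
        probs = softmax(agent_logits)
        k = rng.choice(len(angles), p=probs)   # adversary picks a generator
        X, y = make_dataset(rng, angles[k])
        loss, grad = loss_and_grad(w, X, y)
        w -= model_lr * grad                   # model: gradient descent on loss
        agent_logits[k] += agent_lr * loss     # adversary: upweight hard generator

    return w, softmax(agent_logits)


if __name__ == "__main__":
    w, probs = run_toy_adversarial_loop()
    print("model weights:", w)
    print("adversary's generator distribution:", probs)
```

A single linear model cannot fit all boundary angles at once, so the adversary always has hard generators to favour. That limitation is deliberate in this toy: it only shows the distribution shifting in response to the model, not a full meta-learner.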

Videos

Zero-shot Meta-learning for Tabular Prediction Tasks with Adversarially Pre-trained Transformer (SlidesLive)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Domain Adaptation and Few-Shot Learning

Methods: Attention Is All You Need · Label Smoothing · Byte Pair Encoding · Layer Normalization · Residual Connection · Dense Connections · Linear Layer · Multi-Head Attention · Position-Wise Feed-Forward Layer · Adam