TabDistill: Distilling Transformers into Neural Nets for Few-Shot Tabular Classification

Pasan Dissanayake; Sanghamitra Dutta

arXiv:2511.05704 · cs.LG · November 21, 2025

Open Access

TL;DR

TabDistill distills the pre-trained knowledge of complex transformer-based models into simpler neural networks, keeping strong few-shot tabular classification performance while using fewer parameters.

Contribution

The paper introduces TabDistill, a framework for distilling transformer knowledge into neural networks, balancing parameter efficiency and few-shot learning performance.
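
As a rough illustration of what distilling a large model into a small one can look like, the sketch below computes a standard soft-label distillation loss: a weighted sum of cross-entropy on the true label and the KL divergence between temperature-softened teacher and student predictions. This is a generic textbook technique swapped in for illustration; the summary here does not specify TabDistill's actual training objective, and it may differ substantially. The names `softmax`, `distillation_loss`, `temperature`, and `alpha` are assumptions, not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Generic soft-label distillation loss (hypothetical helper).

    alpha weights the hard-label cross-entropy; (1 - alpha) weights the
    KL divergence from the softened teacher to the softened student.
    """
    # Hard-label cross-entropy, computed at temperature 1.
    student_probs = softmax(student_logits)
    hard_loss = -math.log(student_probs[label])

    # KL(teacher || student) on temperature-softened distributions.
    teacher_soft = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    soft_loss = sum(
        t * math.log(t / s)
        for t, s in zip(teacher_soft, student_soft)
        if t > 0
    )

    # The temperature**2 factor keeps the soft term's gradient scale
    # comparable across temperatures.
    return alpha * hard_loss + (1 - alpha) * temperature ** 2 * soft_loss
```

In a few-shot setting, one might call `distillation_loss(student_logits, teacher_logits, label)` per labeled example and average the result; when the student already matches the teacher, the KL term vanishes and only the hard-label term remains.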

Findings

01. Distilled neural networks outperform classical baselines with limited data.

02. Distilled models sometimes surpass original transformer models.

03. Framework achieves parameter efficiency and strong few-shot performance.

Abstract

Transformer-based models have shown promising performance on tabular data compared to their classical counterparts such as neural networks and Gradient Boosted Decision Trees (GBDTs) in scenarios with limited training data. They utilize their pre-trained knowledge to adapt to new domains, achieving commendable performance with only a few training examples, also called the few-shot regime. However, the performance gain in the few-shot regime comes at the expense of significantly increased complexity and number of parameters. To circumvent this trade-off, we introduce TabDistill, a new strategy to distill the pre-trained knowledge in complex transformer-based models into simpler neural networks for effectively classifying tabular data. Our framework yields the best of both worlds: being parameter-efficient while performing well with limited training data. The distilled neural networks…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Domain Adaptation and Few-Shot Learning