UniPredict: Large Language Models are Universal Tabular Classifiers

Ruiyu Wang; Zifeng Wang; Jimeng Sun

arXiv:2310.03266·cs.LG·January 18, 2024

Open Access

TL;DR

This paper introduces UniPredict, an LLM-based universal predictor for tabular data: a single generative model reads diverse tables and predicts whichever target column the instruction names, without re-training for each new task.

Contribution

The paper presents a generative modeling approach to universal tabular prediction: a single LLM is trained on an aggregation of 169 tabular datasets with diverse targets and compared against baselines trained on each dataset separately.

Findings

01. UniPredict outperforms tree-boosting and neural network baselines by 5.4% to 13.4%.

02. It demonstrates strong few-shot learning capabilities with over 100% performance gains in low-resource settings.

03. The model effectively adapts to new tabular prediction tasks with minimal data.

Abstract

Tabular data prediction is a fundamental machine learning task for many applications. Existing methods predominantly employ discriminative modeling and operate under the assumption of a fixed target column, necessitating re-training for every new predictive task. Inspired by the generative power of large language models (LLMs), this paper exploits the idea of building universal tabular data predictors based on generative modeling, namely UniPredict. Here, we demonstrate the scalability of an LLM to extensive tabular datasets, enabling it to comprehend diverse tabular inputs and predict target variables following the provided instructions. Specifically, we train a single LLM on an aggregation of 169 tabular datasets with diverse targets and compare its performance against baselines that are trained on each dataset separately. We observe this versatile UniPredict model demonstrates an…
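
To make the idea of instruction-following prediction concrete, here is a minimal sketch of how a table row might be turned into a text prompt that names the target column. The function name `build_prompt`, the prompt layout, and the example columns are all hypothetical illustrations; they are not the serialization or template used in the paper.

```python
from typing import Any, Mapping, Sequence


def build_prompt(
    row: Mapping[str, Any],
    target: str,
    classes: Sequence[str],
    description: str = "",
) -> str:
    """Serialize one table row plus an instruction naming the target column.

    Hypothetical format for illustration only; not the paper's template.
    """
    if target in row:
        raise ValueError(f"target column {target!r} must not appear among the features")
    # Flatten the feature columns into a single readable sentence fragment.
    features = "; ".join(f"{name} is {value}" for name, value in row.items())
    options = ", ".join(classes)
    parts = []
    if description:
        parts.append(f"Dataset: {description}")
    parts.append(f"Features: {features}")
    parts.append(f"Instruction: predict the value of '{target}'. Choose one of: {options}.")
    parts.append("Answer:")
    return "\n".join(parts)


# Made-up example row; the column names are assumptions.
row = {"age": 42, "education": "Bachelors", "hours_per_week": 50}

# The same row can be paired with different target columns by changing
# only the instruction, which is the point of a generative formulation.
income_prompt = build_prompt(row, "income", [">50K", "<=50K"], "census records")
marital_prompt = build_prompt(row, "marital_status", ["married", "single"])
print(income_prompt)
```

In a setup like this, the prompt would be passed to a fine-tuned LLM and the generated text mapped back to one of the listed classes. Because the target is part of the input text rather than a fixed output head, switching tasks requires no change to the model.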

Taxonomy

Topics: Topic Modeling