OmniPred: Language Models as Universal Regressors

Xingyou Song; Oscar Li; Chansoo Lee; Bangding Yang; Daiyi Peng; Sagi Perel; Yutian Chen

arXiv:2402.14547 · cs.LG · February 3, 2025 · 2 cites

TL;DR

OmniPred introduces a framework that trains language models as universal regressors capable of precise numerical predictions across diverse tasks, outperforming traditional models when trained at scale.

Contribution

The paper presents a novel approach to using language models as universal regressors for arbitrary data formats, leveraging large-scale multi-task training.

Findings

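01 Language models can perform very precise numerical regression using only textual representations of parameters and values.
02 Training at scale over multiple tasks improves regression performance.
03 Given such multi-task training, the language models can significantly outperform traditional regression models in the paper's experiments.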

Abstract

Regression is a powerful tool to accurately predict the outcome metric of a system given a set of parameters, but has traditionally been restricted to methods which are only applicable to a specific task. In this paper, we propose OmniPred, a framework for training language models as universal end-to-end regressors over $(x, y)$ data from arbitrary formats. Using data sourced from Google Vizier, one of the largest proprietary blackbox optimization databases in the world, our extensive experiments demonstrate that language models are capable of very precise numerical regression using only textual representations of mathematical parameters and values, and if given the opportunity to train at scale over multiple tasks, can significantly outperform traditional regression models.
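To make "textual representations of mathematical parameters and values" concrete, the following is a minimal sketch of how an $(x, y)$ pair might be turned into text and tokens for a language model. The `key:value` serialization and the sign/digit/exponent token scheme are illustrative assumptions, and the function names `serialize_x`, `encode_y`, and `decode_y` are hypothetical; this page does not specify the exact format used by the authors.

```python
from decimal import Decimal


def serialize_x(params: dict) -> str:
    """Render a parameter dict as a flat text string (assumed format)."""
    # Sort keys so the same configuration always yields the same string.
    return ",".join(f"{k}:{params[k]}" for k in sorted(params))


def encode_y(y: float, digits: int = 4) -> list[str]:
    """Encode a float as sign, `digits` mantissa digits, and a base-10 exponent token."""
    sign = "<->" if y < 0 else "<+>"
    if y == 0:
        return [sign] + ["<0>"] * digits + ["<E0>"]
    # Scientific notation with `digits` significant digits, e.g. "1.234e+00".
    mantissa, exp = f"{abs(y):.{digits - 1}e}".split("e")
    mantissa_digits = mantissa.replace(".", "")
    # Shift the exponent so the mantissa is read as an integer.
    exponent = int(exp) - (digits - 1)
    return [sign] + [f"<{d}>" for d in mantissa_digits] + [f"<E{exponent}>"]


def decode_y(tokens: list[str]) -> float:
    """Invert encode_y: rebuild the float from its tokens."""
    sign = -1 if tokens[0] == "<->" else 1
    mantissa = int("".join(t[1:-1] for t in tokens[1:-1]))
    exponent = int(tokens[-1][2:-1])
    # Decimal.scaleb avoids accumulating binary rounding error while shifting.
    return sign * float(Decimal(mantissa).scaleb(exponent))


if __name__ == "__main__":
    x_text = serialize_x({"lr": 0.001, "layers": 3})
    y_tokens = encode_y(1.234)
    print(x_text)
    print(y_tokens)
    print(decode_y(y_tokens))
```

Under this scheme the model only ever sees and emits strings, so the same network can in principle be trained on tasks with different parameter names and counts; the fixed number of mantissa digits bounds the precision of any prediction.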

Code & Models

Repositories

google-research/optformer (JAX, official)

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Sparse Evolutionary Training