OmniPred: Language Models as Universal Regressors
Xingyou Song, Oscar Li, Chansoo Lee, Bangding Yang, Daiyi Peng, Sagi Perel, Yutian Chen

TL;DR
OmniPred introduces a framework that trains language models as universal regressors capable of precise numerical predictions across diverse tasks, outperforming traditional models when trained at scale.
Contribution
The paper presents a novel approach to using language models as universal regressors for arbitrary data formats, leveraging large-scale multi-task training.
Findings
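Language models can perform very precise numerical regression using only textual representations of parameters and values.
Training at scale over multiple tasks significantly improves performance.
With multi-task training, language models can significantly outperform traditional regression models in the paper's experiments.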
Abstract
Regression is a powerful tool to accurately predict the outcome metric of a system given a set of parameters, but has traditionally been restricted to methods which are only applicable to a specific task. In this paper, we propose OmniPred, a framework for training language models as universal end-to-end regressors over data from arbitrary formats. Using data sourced from Google Vizier, one of the largest proprietary blackbox optimization databases in the world, our extensive experiments demonstrate that language models are capable of very precise numerical regression using only textual representations of mathematical parameters and values, and if given the opportunity to train at scale over multiple tasks, can significantly outperform traditional regression models.
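To make "textual representations of mathematical parameters and values" concrete, here is a minimal Python sketch of one way a regression example could be turned into text. It is an illustration under stated assumptions, not the paper's confirmed format: the `key:value` serialization, the sign/digit/exponent token layout, and the helper names `serialize_params`, `float_to_tokens`, and `tokens_to_float` are all hypothetical.

```python
def serialize_params(params):
    """Render a parameter dict as a flat 'key:value' string (assumed format)."""
    return ",".join(f"{key}:{params[key]}" for key in sorted(params))


def float_to_tokens(y, digits=3):
    """Encode a float as sign, mantissa-digit, and exponent tokens (assumed scheme).

    The value is rounded to `digits` significant digits, so the encoding is lossy.
    """
    if y == 0:
        return ["<+>"] + ["<0>"] * digits + ["<E0>"]
    sign = "<+>" if y > 0 else "<->"
    # Scientific notation with `digits` significant digits, e.g. '1.23e+00'.
    mantissa, exponent = f"{abs(y):.{digits - 1}e}".split("e")
    digit_chars = mantissa.replace(".", "")
    # Shift the exponent so the mantissa can be read as an integer.
    shifted_exponent = int(exponent) - (digits - 1)
    return [sign] + [f"<{d}>" for d in digit_chars] + [f"<E{shifted_exponent}>"]


def tokens_to_float(tokens):
    """Decode tokens produced by float_to_tokens back into a float."""
    sign = 1.0 if tokens[0] == "<+>" else -1.0
    mantissa = "".join(token[1:-1] for token in tokens[1:-1])
    exponent = tokens[-1][2:-1]
    return sign * float(f"{mantissa}e{exponent}")


# Hypothetical example: the model input is the serialized parameters,
# and the training target is the token sequence for the metric value.
x_text = serialize_params({"lr": 0.01, "batch_size": 32})
y_tokens = float_to_tokens(1.23)
```

In this sketch the model would only ever see strings and tokens, so no task-specific numeric featurization or normalization is needed, which is the property that lets a single model be trained across tasks with different parameter spaces.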
Taxonomy
Topics: Natural Language Processing Techniques
Methods: Sparse Evolutionary Training
