From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples
Robert Vacareanu, Vlad-Andrei Negru, Vasile Suciu, Mihai Surdeanu

TL;DR
This paper demonstrates that large language models can perform regression tasks effectively using in-context examples, rivaling traditional supervised methods without additional training.
Contribution
It reveals that LLMs can act as capable regressors with in-context learning, outperforming some supervised algorithms on certain datasets.
Findings
GPT-4 and Claude 3 rival, and on some benchmarks outperform, traditional supervised methods such as Random Forest, Bagging, or Gradient Boosting.
LLMs can empirically achieve sub-linear regret on regression tasks as the number of in-context examples grows.
Large language models can perform regression without gradient updates or additional training.
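The regret finding can be made concrete with a small sketch. It assumes absolute error as the loss and an arbitrary comparator sequence standing in for the best predictor in hindsight; the summary does not state the paper's exact loss or comparator, so both choices, and the function names `cumulative_regret` and `average_regret`, are illustrative. Regret is sub-linear when the average regret per step shrinks toward zero as more examples arrive.

```python
import math


def cumulative_regret(y_true, y_model, y_comparator):
    """Running sum of (model loss - comparator loss), using absolute error.

    Assumption: absolute error and an externally supplied comparator;
    the paper's own setup may differ.
    """
    total = 0.0
    regret = []
    for y, m, c in zip(y_true, y_model, y_comparator):
        total += abs(y - m) - abs(y - c)
        regret.append(total)
    return regret


def average_regret(regret):
    """Regret divided by the number of steps seen so far."""
    return [r / (t + 1) for t, r in enumerate(regret)]


# Toy data (fabricated for illustration only): the model's error decays
# like 1/sqrt(t), so cumulative regret grows roughly like 2*sqrt(T).
T = 200
y_true = [0.0] * T
y_comparator = [0.0] * T
y_model = [1.0 / math.sqrt(t + 1) for t in range(T)]

regret = cumulative_regret(y_true, y_model, y_comparator)
avg = average_regret(regret)
print(f"average regret at t=1: {avg[0]:.3f}, at t={T}: {avg[-1]:.3f}")
```

In this toy run the cumulative regret keeps growing while the average regret falls, which is the signature of sub-linear regret.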
Abstract
We analyze how well pre-trained large language models (e.g., Llama2, GPT-4, Claude 3, etc.) can do linear and non-linear regression when given in-context examples, without any additional training or gradient updates. Our findings reveal that several large language models (e.g., GPT-4, Claude 3) are able to perform regression tasks with a performance rivaling (or even outperforming) that of traditional supervised methods such as Random Forest, Bagging, or Gradient Boosting. For example, on the challenging Friedman #2 regression dataset, Claude 3 outperforms many supervised methods such as AdaBoost, SVM, Random Forest, KNN, or Gradient Boosting. We then investigate how well the performance of large language models scales with the number of in-context exemplars. We borrow from the notion of regret from online learning and empirically show that LLMs are capable of obtaining a sub-linear…
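As a rough illustration of the in-context regression setup described in the abstract, the sketch below draws noise-free Friedman #2 samples using the standard definition (the same formula and input ranges as scikit-learn's `make_friedman2`) and serializes them into a text prompt. The `Feature i: ... / Output: ...` template and the helper names `friedman2_sample` and `build_prompt` are assumptions for illustration; this summary does not specify the paper's exact prompt format.

```python
import math
import random


def friedman2_sample(rng):
    """Draw one noise-free Friedman #2 example (standard definition)."""
    x1 = rng.uniform(0, 100)
    x2 = rng.uniform(40 * math.pi, 560 * math.pi)
    x3 = rng.uniform(0, 1)
    x4 = rng.uniform(1, 11)
    y = math.sqrt(x1 ** 2 + (x2 * x3 - 1 / (x2 * x4)) ** 2)
    return [x1, x2, x3, x4], y


def format_features(x):
    """Render a feature vector as one 'Feature i: value' line per feature."""
    return [f"Feature {i}: {v:.2f}" for i, v in enumerate(x)]


def build_prompt(examples, query):
    """Serialize (x, y) exemplars plus a query whose output is left blank.

    Assumption: this template is hypothetical, not taken from the paper.
    """
    blocks = []
    for x, y in examples:
        blocks.append("\n".join(format_features(x) + [f"Output: {y:.2f}"]))
    blocks.append("\n".join(format_features(query) + ["Output:"]))
    return "\n\n".join(blocks)


rng = random.Random(0)
examples = [friedman2_sample(rng) for _ in range(3)]
query, _ = friedman2_sample(rng)
prompt = build_prompt(examples, query)
print(prompt)
```

The model would be asked to complete the final `Output:` line; no gradient updates are involved, since the exemplars appear only in the prompt text.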
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Natural Language Processing Techniques
Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Dropout · Dense Connections · Label Smoothing · Residual Connection · Multi-Head Attention · Adam · Softmax
