Predicting Emergent Capabilities by Finetuning
Charlie Snell, Eric Wallace, Dan Klein, Sergey Levine

TL;DR
This paper introduces a method for predicting whether emergent capabilities will appear in future large language models: finetune current models on the task with varying amounts of data, then fit a parametric function to how the point of emergence shifts.
Contribution
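It poses the task of emergence prediction and proposes forecasting emergence points in LLMs by finetuning current, less capable models and fitting a parametric function to the results, so that predictions can be made from smaller models and limited compute.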
Findings
Finetuning shifts emergence points to less capable models.
Using only smaller-scale models, the method can in some cases accurately predict emergence in models trained with up to 4x more compute.
The approach is validated on four standard NLP benchmarks.
Abstract
A fundamental open challenge in modern LLM scaling is the lack of understanding around emergent capabilities. In particular, language model pretraining loss is known to be highly predictable as a function of compute. However, downstream capabilities are far less predictable -- sometimes even exhibiting emergent jumps -- which makes it challenging to anticipate the capabilities of future models. In this work, we first pose the task of emergence prediction: given access to current LLMs that have random few-shot accuracy on a task, can we predict whether future models (GPT-N+1) will have non-trivial accuracy on that task? We then discover a simple insight for this problem: finetuning LLMs on a given task can shift the point in scaling at which emergence occurs towards less capable models. To operationalize this insight, we can finetune LLMs with varying amounts of data and fit a parametric…
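The idea of finetuning with varying amounts of data and fitting a function to the resulting emergence points can be illustrated with a toy sketch. Everything here is an assumption made for illustration, not the paper's functional form: the hinge-shaped accuracy curve, the emergence point shifting linearly in log10 of the finetuning data size, the synthetic numbers, and the hypothetical `FEW_SHOT_EQUIV` data amount used to extrapolate back to the few-shot setting.

```python
import numpy as np

CHANCE = 0.25  # assumed random-guess accuracy (e.g. 4-way multiple choice)

def fit_emergence_point(log_compute, accuracy, chance=CHANCE):
    """Grid-search x0 in the hinge model acc = chance + slope * max(0, x - x0)."""
    best_err, best_x0 = np.inf, None
    for x0 in np.linspace(log_compute.min(), log_compute.max(), 301):
        hinge = np.maximum(0.0, log_compute - x0)
        denom = hinge @ hinge
        if denom == 0.0:
            continue  # no data point lies past this candidate breakpoint
        # Least-squares slope for this breakpoint, clipped at zero.
        slope = max(0.0, (hinge @ (accuracy - chance)) / denom)
        err = np.sum((chance + slope * hinge - accuracy) ** 2)
        if err < best_err:
            best_err, best_x0 = err, x0
    return best_x0

# Synthetic data, made up for illustration: more finetuning data moves the
# emergence point toward lower log-compute (less capable models).
log_compute = np.linspace(19.0, 22.0, 13)
finetune_sizes = np.array([1e2, 1e3, 1e4])
true_points = 22.0 - 0.5 * np.log10(finetune_sizes)

# Step 1: recover one emergence point per finetuning data size.
fitted_points = np.array([
    fit_emergence_point(log_compute, CHANCE + 0.2 * np.maximum(0.0, log_compute - e))
    for e in true_points
])

# Step 2: fit how the emergence point shifts with log10(data size).
slope, intercept = np.polyfit(np.log10(finetune_sizes), fitted_points, 1)

# Step 3: extrapolate to a hypothetical few-shot-equivalent data amount.
FEW_SHOT_EQUIV = 10.0
predicted = intercept + slope * np.log10(FEW_SHOT_EQUIV)
print(f"fitted emergence points: {fitted_points}")
print(f"predicted few-shot emergence point (log-compute): {predicted:.2f}")
```

With real measurements, the synthetic accuracy curves would be replaced by finetuned-model accuracies at each model scale, and the choice of functional form would need to be checked against held-out scales.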
Taxonomy
Topics: Complex Systems and Decision Making
