Predicting Emergent Capabilities by Finetuning

Charlie Snell; Eric Wallace; Dan Klein; Sergey Levine

arXiv:2411.16035 · cs.LG · November 26, 2024

TL;DR

This paper introduces a method for predicting whether emergent capabilities will appear in future large language models: finetune current models on the task with varying amounts of data, then fit a parametric function to the results.

Contribution

The paper poses the task of emergence prediction and proposes forecasting emergence points by finetuning current LLMs and fitting a parametric function, so that predictions can be made using smaller models and limited compute.

Findings

01

Finetuning shifts emergence points to less capable models.

02

Using only smaller models, the method can in some cases accurately predict emergence in models trained with up to 4x more compute.

03

The approach is validated on four standard NLP benchmarks.

Abstract

A fundamental open challenge in modern LLM scaling is the lack of understanding around emergent capabilities. In particular, language model pretraining loss is known to be highly predictable as a function of compute. However, downstream capabilities are far less predictable -- sometimes even exhibiting emergent jumps -- which makes it challenging to anticipate the capabilities of future models. In this work, we first pose the task of emergence prediction: given access to current LLMs that have random few-shot accuracy on a task, can we predict whether future models (GPT-N+1) will have non-trivial accuracy on that task? We then discover a simple insight for this problem: finetuning LLMs on a given task can shift the point in scaling at which emergence occurs towards less capable models. To operationalize this insight, we can finetune LLMs with varying amounts of data and fit a parametric…
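The finetune-then-fit idea described above can be illustrated with a small sketch. Everything here is an assumption made for illustration: the abstract is truncated before it states the parametric function, so the hinge-shaped accuracy curve, the log-linear trend in finetuning data, the function names `fit_hinge` and `extrapolate_emergence`, and the synthetic numbers are hypothetical and are not the authors' emergence law. The sketch estimates an emergence point for each finetuning data amount, then extrapolates that point toward a smaller data amount.

```python
import numpy as np

def fit_hinge(x, acc, chance):
    """Fit acc ~ chance + slope * max(0, x - e) and return (e, slope).

    x is a capability axis (here: log10 pretraining compute, an assumption).
    e is the estimated emergence point: where accuracy leaves chance level.
    Grid search over e, closed-form least squares for the slope.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(acc, dtype=float) - chance
    best_err, best_e, best_slope = np.inf, x.min(), 0.0
    for e in np.linspace(x.min(), x.max(), 401):
        h = np.maximum(0.0, x - e)
        denom = h @ h
        slope = (h @ y) / denom if denom > 0 else 0.0
        err = np.sum((y - slope * h) ** 2)
        if err < best_err:
            best_err, best_e, best_slope = err, e, slope
    return best_e, best_slope

def extrapolate_emergence(data_sizes, emergence_points, target_size):
    """Assume e(D) = a + b * log10(D) (hypothetical form) and predict e at target_size."""
    b, a = np.polyfit(np.log10(data_sizes), emergence_points, 1)
    return a + b * np.log10(target_size), a, b

# Synthetic, noiseless data (not results from the paper): more finetuning
# data moves the emergence point toward lower-compute models.
log_compute = np.linspace(20.0, 24.0, 9)
data_sizes = np.array([100.0, 1000.0, 10000.0])
chance = 0.25
emergence_points = []
for d in data_sizes:
    true_e = 23.0 - 0.5 * np.log10(d)
    acc = chance + 0.2 * np.maximum(0.0, log_compute - true_e)
    e_hat, _ = fit_hinge(log_compute, acc, chance)
    emergence_points.append(e_hat)

# Extrapolate to a smaller data amount than any used for finetuning.
predicted, a, b = extrapolate_emergence(data_sizes, emergence_points, 10.0)
print(f"estimated emergence points: {np.round(emergence_points, 2)}")
print(f"extrapolated emergence point at D=10: {predicted:.2f}")
```

The grid search keeps the sketch dependency-light and deterministic; with real, noisy accuracy measurements a proper nonlinear fit with uncertainty estimates would likely be a better choice.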

Taxonomy

Topics: Complex Systems and Decision Making