Are Emergent Abilities of Large Language Models a Mirage?

Rylan Schaeffer; Brando Miranda; Sanmi Koyejo

arXiv:2304.15004 · cs.AI · May 23, 2023 · 132 cites

TL;DR

This paper argues that apparent emergent abilities in large language models may be artifacts of the metrics used to evaluate them, rather than genuinely new capabilities arising from scale.

Contribution

The authors propose an alternative explanation for emergent abilities, showing they can result from metric choices rather than fundamental model changes, supported by mathematical modeling and empirical tests.

Findings

01 Whether an ability appears emergent depends on the choice of evaluation metric.

02 Switching from nonlinear or discontinuous metrics to linear or continuous ones can remove the appearance of emergence.

03 Emergent abilities may not be intrinsic properties of scaled models.

Abstract

Recent work claims that large language models display emergent abilities, abilities not present in smaller-scale models that are present in larger-scale models. What makes emergent abilities intriguing is two-fold: their sharpness, transitioning seemingly instantaneously from not present to present, and their unpredictability, appearing at seemingly unforeseeable model scales. Here, we present an alternative explanation for emergent abilities: that for a particular task and model family, when analyzing fixed model outputs, emergent abilities appear due to the researcher's choice of metric rather than due to fundamental changes in model behavior with scale. Specifically, nonlinear or discontinuous metrics produce apparent emergent abilities, whereas linear or continuous metrics produce smooth, continuous predictable changes in model performance. We present our alternative explanation in…
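The metric effect described in the abstract can be illustrated with a small numerical sketch. It assumes, purely for illustration, that per-token cross-entropy falls as a power law in parameter count and that per-token accuracy is the exponential of the negative cross-entropy. The constants `SCALE_CONSTANT` and `EXPONENT`, the target length of 30 tokens, and all function names are hypothetical choices, not values taken from the paper. The same smooth per-token curve is scored two ways: with exact match (nonlinear in per-token accuracy) and with expected token errors (linear in it).

```python
import math

# Hypothetical constants for the assumed power law; not taken from the paper.
SCALE_CONSTANT = 1e7
EXPONENT = -0.3


def per_token_accuracy(n_params: float) -> float:
    """Assumed smooth scaling: cross-entropy is a power law in model size."""
    cross_entropy = (n_params / SCALE_CONSTANT) ** EXPONENT
    return math.exp(-cross_entropy)


def exact_match(n_params: float, seq_len: int) -> float:
    """Nonlinear metric: every one of seq_len tokens must be correct."""
    return per_token_accuracy(n_params) ** seq_len


def expected_token_errors(n_params: float, seq_len: int) -> float:
    """Linear metric: expected number of wrong tokens out of seq_len."""
    return seq_len * (1.0 - per_token_accuracy(n_params))


if __name__ == "__main__":
    seq_len = 30  # assumed target length
    for exponent in range(8, 13):
        n = 10.0 ** exponent
        print(
            f"N=1e{exponent}  per-token={per_token_accuracy(n):.3f}  "
            f"exact-match={exact_match(n, seq_len):.4f}  "
            f"token-errors={expected_token_errors(n, seq_len):.2f}"
        )
```

Under these assumptions, exact match stays near zero for the smaller models and then climbs steeply, while expected token errors shrinks gradually over the same range, even though the underlying per-token accuracy is identical in both columns.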

Videos

ROBERT MILES - "There is a good chance this kills everyone" · youtube

Are Emergent Abilities of Large Language Models a Mirage? · slideslive

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)