Predictability and Surprise in Large Generative Models

Deep Ganguli; Danny Hernandez; Liane Lovitt; Nova DasSarma; Tom Henighan; Andy Jones; Nicholas Joseph; Jackson Kernion; Ben Mann; Amanda Askell; Yuntao Bai; Anna Chen; Tom Conerly; Dawn Drain; Nelson Elhage; Sheer El Showk; Stanislav Fort; Zac Hatfield-Dodds; Scott Johnston; Shauna Kravec; Neel Nanda; Kamal Ndousse; Catherine Olsson; Daniela Amodei; Dario Amodei; Tom Brown; Jared Kaplan; Sam McCandlish; Chris Olah; Jack Clark

arXiv:2202.07785·cs.CY·October 5, 2022·23 cites

Open Access

TL;DR

Large generative models pair predictable training loss with unpredictable specific capabilities. The predictability encourages rapid development, while the unpredictability makes the social harms of deployment hard to anticipate, so policy and deployment decisions need care.

Contribution

This paper highlights a counterintuitive property of large generative models: their loss on a broad training distribution is predictable, while their specific capabilities, inputs, and outputs are not. It then discusses what this means for deployment and policy.

Findings

01. Models show predictable loss on broad distributions.

02. Specific capabilities and outputs are highly unpredictable.

03. Unpredictability can lead to socially harmful behaviors.

Abstract

Large-scale pre-training has recently emerged as a technique for creating capable, general purpose, generative models such as GPT-3, Megatron-Turing NLG, Gopher, and many others. In this paper, we highlight a counterintuitive property of such models and discuss the policy implications of this property. Namely, these generative models have an unusual combination of predictable loss on a broad training distribution (as embodied in their "scaling laws"), and unpredictable specific capabilities, inputs, and outputs. We believe that the high-level predictability and appearance of useful capabilities drives rapid development of such models, while the unpredictable qualities make it difficult to anticipate the consequences of model deployment. We go through examples of how this combination can lead to socially harmful behavior with examples from the literature and real world observations, and…

Taxonomy

Topics: Scientific Computing and Data Management · Advanced Data Storage Technologies · Explainable Artificial Intelligence (XAI)