Inverse scaling can become U-shaped
Jason Wei, Najoung Kim, Yi Tay, Quoc V. Le

TL;DR
This paper revisits the eleven Inverse Scaling Prize tasks with larger models and finds that six of them exhibit U-shaped scaling: performance falls with model size up to a point, then improves again at the largest scales. Prompting techniques such as 1-shot and chain-of-thought further mitigate the undesirable scaling.
Contribution
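It extends the analysis of inverse scaling to models of up to 540B parameters, trained on five times more compute than those evaluated in the Inverse Scaling Prize, identifies U-shaped scaling patterns, and shows that prompting techniques (1-shot, chain-of-thought) mitigate inverse scaling.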
Findings
Only 4 of the 11 inverse scaling tasks remain inverse scaling at larger scale.
6 of the 11 tasks show U-shaped scaling, with performance recovering at larger sizes.
Prompting techniques like 1-shot and chain-of-thought further mitigate undesirable scaling.
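To make the three scaling patterns concrete, here is a minimal Python sketch that labels a performance-versus-scale curve as inverse, U-shaped, or positive. It is an illustration only and not the paper's methodology: the function name `classify_scaling`, the tolerance `tol`, and the accuracy values in the usage lines are all hypothetical.

```python
from typing import Sequence


def classify_scaling(scores: Sequence[float], tol: float = 0.02) -> str:
    """Label a curve of scores ordered from smallest to largest model.

    Heuristic (an assumption for illustration, not the paper's criterion):
    a curve is "u-shaped" if its minimum lies strictly between the smallest
    and largest model and the score both drops into and rises out of that
    minimum by more than `tol`.
    """
    if len(scores) < 2:
        raise ValueError("need scores for at least two model sizes")
    scores = list(scores)

    # Index of the worst-performing model size.
    low_idx = min(range(len(scores)), key=scores.__getitem__)
    drop = scores[0] - scores[low_idx]   # decline from the smallest model
    rise = scores[-1] - scores[low_idx]  # recovery at the largest model
    interior = 0 < low_idx < len(scores) - 1

    if interior and drop > tol and rise > tol:
        return "u-shaped"

    # Otherwise compare the endpoints only.
    change = scores[-1] - scores[0]
    if change < -tol:
        return "inverse"
    if change > tol:
        return "positive"
    return "flat"


# Made-up accuracies for four increasing model sizes.
print(classify_scaling([0.60, 0.52, 0.45, 0.40]))  # inverse
print(classify_scaling([0.60, 0.50, 0.42, 0.55]))  # u-shaped
print(classify_scaling([0.40, 0.50, 0.62, 0.70]))  # positive
```

Under this heuristic a curve counts as U-shaped even if the largest model does not fully recover to the smallest model's score, and a curve that looks inverse over a limited range of sizes can be relabeled U-shaped once larger models are added.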
Abstract
Scaling up language models has been empirically shown to improve performance on a wide range of downstream tasks. However, if we were to observe worse performance as a function of scale ("inverse scaling") on certain tasks, this would indicate that scaling can also encourage behaviors that are misaligned with human preferences. The Inverse Scaling Prize (McKenzie et al. 2022) identified eleven such inverse scaling tasks, evaluated on models of up to 280B parameters and up to 500 zettaFLOPs of training compute. This paper takes a closer look at these inverse scaling tasks. We evaluate models of up to 540B parameters, trained on five times more compute than those evaluated in the Inverse Scaling Prize. With this increased range of model sizes and training compute, only four out of the eleven tasks remain inverse scaling. Six out of the eleven tasks exhibit "U-shaped scaling", where…
Taxonomy
Topics: Topic Modeling · Software Engineering Research · Explainable Artificial Intelligence (XAI)
