Practical Scaling Laws: Converting Compute into Performance in a Data-Constrained World

Christopher M. Bryant; Hao Liu

arXiv:2605.09189 · cs.LG · May 12, 2026

TL;DR

This paper introduces a scaling law that predicts loss in data-rich, data-scarce, and multi-epoch regimes, addressing three structural limitations of the Chinchilla form and enabling cost-aware training strategies.

Contribution

It proposes a closed-form extension of existing scaling laws that accounts for overfitting, data scarcity, and multiple epochs, validated across diverse architectures and domains.

Findings

01. The new model outperforms previous laws in extrapolation accuracy.

02. It fits multiple published LLM scaling-law datasets well.

03. The model enables cost-aware training optimization.

Abstract

The scaling laws guiding modern model training were calibrated for a single regime: data-rich, single-epoch pretraining. The dominant such scaling law form, Chinchilla's $L = E + A / N^{\alpha} + B / D^{\beta}$, has three structural limitations outside that regime: it diverges as unique data shrinks instead of saturating at the uninformed baseline; it cannot represent overfitting when capacity exceeds the data; and it conflates total examples seen with unique examples available. We propose a closed-form extension, $L(N, D, T) = E + (L_{0} - E) h / (1 + h)$ with $h = a / N^{\alpha} + b / T^{\beta} + c N^{\gamma} / D^{\delta}$, that decomposes loss into undercapacity, undertraining, and overfitting terms. It saturates between the irreducible loss $E$ and an uninformed baseline $L_{0}$ fixed by the loss type, and reduces to Chinchilla in the data-rich, single-epoch limit. We validate it on four multi-epoch…
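
The closed form quoted in the abstract can be evaluated directly. The following is a minimal Python sketch, not the authors' code. It assumes that $N$ is the parameter count, $D$ the number of unique examples, and $T$ the total number of examples seen, and that the three terms of $h$ map to undercapacity, undertraining, and overfitting in the order they are written. Every coefficient value is a hypothetical placeholder rather than a fitted value from the paper, and taking $L_{0} = \ln V$ (the cross-entropy of a uniform guess over a vocabulary of size $V$) is likewise an assumption. The function names and the `COEFFS` dictionary are illustrative only.

```python
import math


def saturating_loss(N, D, T, *, E, L0, a, alpha, b, beta, c, gamma, delta):
    """Evaluate L(N, D, T) = E + (L0 - E) * h / (1 + h) from the abstract."""
    undercapacity = a / N**alpha           # assumed meaning: too few parameters
    undertraining = b / T**beta            # assumed meaning: too few examples seen
    overfitting = c * N**gamma / D**delta  # assumed meaning: capacity vs unique data
    h = undercapacity + undertraining + overfitting
    return E + (L0 - E) * h / (1.0 + h)


def chinchilla_loss(N, D, *, E, A, alpha, B, beta):
    """Evaluate the Chinchilla form L = E + A / N^alpha + B / D^beta."""
    return E + A / N**alpha + B / D**beta


# Hypothetical coefficients for illustration only, NOT values from the paper.
COEFFS = dict(
    E=1.7,
    L0=math.log(50_000),  # assumed: uniform guess over a 50k-token vocabulary
    a=40.0, alpha=0.34,
    b=45.0, beta=0.28,
    c=1e-3, gamma=1.0, delta=1.0,
)

# Data-rich, single-epoch setting: every example is seen once, so T == D.
data_rich = saturating_loss(1e9, 1e13, 1e13, **COEFFS)

# For small h, h / (1 + h) is close to h, which suggests the matching
# Chinchilla coefficients A = (L0 - E) * a and B = (L0 - E) * b.
scale = COEFFS["L0"] - COEFFS["E"]
reference = chinchilla_loss(
    1e9, 1e13,
    E=COEFFS["E"],
    A=scale * COEFFS["a"], alpha=COEFFS["alpha"],
    B=scale * COEFFS["b"], beta=COEFFS["beta"],
)

# Data-starved setting: with almost no unique data the overfitting term
# dominates h, and the loss approaches L0 instead of diverging.
data_starved = saturating_loss(1e9, 1.0, 1e13, **COEFFS)
```

With these placeholder values, `data_rich` stays close to `reference`, while `data_starved` sits just below `L0`. Under the assumptions above, this illustrates the saturation between $E$ and $L_{0}$ and the Chinchilla limit that the abstract describes.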
