On the Invariance and Generality of Neural Scaling Laws

Xing Han; Ziyin Liu; Suchi Saria; Paul Pu Liang

arXiv:2605.07546·cs.LG·May 11, 2026

TL;DR

This paper investigates how neural scaling laws can be generalized across different domains by identifying invariants and transformations that preserve or predict their behavior, enabling resource-efficient transfer of scaling insights.

Contribution

It introduces a theoretical framework based on invariants and information-theoretic transformations to develop generalizable scaling laws across diverse data modalities.

Findings

01. Scaling laws are preserved under bijective data transformations.

02. Non-bijective transformations predictably modify scaling laws based on information resolution.

03. Cross-domain predictions of model scaling are accurate within 3% error.
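The distinction between bijective and non-bijective transformations in the findings above can be illustrated with a toy example. This is a minimal sketch, not the paper's method: it uses empirical Shannon entropy as a stand-in for "information resolution" (an assumption, since the measure the authors use is not stated here), and the symbol sequence and the `permute` and `coarsen` maps are hypothetical.

```python
import math
from collections import Counter


def empirical_entropy(symbols):
    """Shannon entropy (in bits) of the empirical symbol distribution."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical toy data over a four-symbol alphabet.
data = [0, 0, 0, 0, 1, 1, 2, 3]

# A bijective map: a relabeling (permutation) of the alphabet.
permute = {0: 3, 1: 2, 2: 0, 3: 1}

# A non-bijective map: distinct symbols are merged, so detail is lost.
coarsen = {0: 0, 1: 0, 2: 1, 3: 1}

h_original = empirical_entropy(data)
h_bijective = empirical_entropy([permute[s] for s in data])
h_coarse = empirical_entropy([coarsen[s] for s in data])

print(f"original:  {h_original:.4f} bits")
print(f"bijective: {h_bijective:.4f} bits")
print(f"coarsened: {h_coarse:.4f} bits")
```

The relabeling leaves the entropy unchanged, while the merge lowers it. Under the stated assumption, that is the sense in which only the non-bijective map changes the information available to a model.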

Abstract

Neural scaling laws establish a predictable relationship between model performance and data or compute, offering crucial guidance for resource allocation in new domains and tasks. Yet such laws are most needed precisely where they are hardest to obtain: fitting one for a new model-task pair demands expensive sweeps that typically exhaust the very compute budget the law is meant to economize. This paper asks how to develop generalizable scaling laws: laws fit once on a well-resourced source domain and reliably transported to new domains where running a full sweep is infeasible. Doing so requires a fundamental understanding of when and why scaling properties change. We address this by identifying the right invariants: scaling laws are preserved under bijective (information-preserving) transformations of the data and modified in predictable, information-theoretically…
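To make "fitting a scaling law" concrete, the following is a minimal sketch that fits a pure power law, loss = a * size^(-alpha), by least squares in log-log space. The functional form (with no irreducible-loss term), the function names `fit_power_law` and `predict_loss`, and the synthetic data are all assumptions for illustration. The abstract does not state the form the authors fit.

```python
import math


def fit_power_law(sizes, losses):
    """Fit log(loss) = log(a) - alpha * log(size) by ordinary least squares."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(v) for v in losses]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (a, alpha)


def predict_loss(a, alpha, size):
    """Extrapolate the fitted law to a new data or model size."""
    return a * size ** (-alpha)


# Synthetic sweep generated from a known law (a = 5.0, alpha = 0.3).
sizes = [1e3, 1e4, 1e5, 1e6]
losses = [5.0 * n ** -0.3 for n in sizes]

a, alpha = fit_power_law(sizes, losses)
print(f"a = {a:.3f}, alpha = {alpha:.3f}")
print(f"predicted loss at 1e7: {predict_loss(a, alpha, 1e7):.5f}")
```

Each point in `sizes` stands for one training run, which is why a full sweep is costly. A law fit once on a source domain would be reused through something like `predict_loss`.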
