On the origin of neural scaling laws: from random graphs to natural language

Maissam Barkeshli; Alberto Alfarano; Andrey Gromov

arXiv:2601.10684·cs.LG·January 16, 2026

Open Access

TL;DR

This paper investigates the origin of neural scaling laws in two simplified settings: transformers trained on random walks on graphs, and transformers trained on text sampled from progressively simpler language models. It finds that scaling laws can emerge even when the data correlations have no power-law structure.

Contribution

The paper demonstrates that neural scaling laws can arise in simplified settings whose data correlations have no inherent power-law structure, and it analyzes how the scaling behavior evolves as the complexity of the language is dialed down.

Findings

01

Neural scaling laws appear even in transformers trained on random walks on graphs and on simplified language.

02

Scaling exponents evolve monotonically as language models are simplified.

03

Minimal transformer architectures reproduce key scaling behaviors.
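As a rough illustration of what a "scaling exponent" means in the findings above, the sketch below fits a pure power law, loss ≈ a · N^(−alpha), by linear regression in log-log space. The function name `fit_power_law`, the synthetic data, and the omission of any irreducible-loss offset are assumptions made for illustration; this is not the paper's fitting procedure.

```python
import numpy as np


def fit_power_law(sizes, losses):
    """Fit losses ~ a * sizes**(-alpha) by least squares on log-log data.

    Hypothetical helper for illustration; it assumes a pure power law
    with no additive irreducible-loss term.
    """
    log_n = np.log(np.asarray(sizes, dtype=float))
    log_l = np.log(np.asarray(losses, dtype=float))
    # A power law is a straight line in log-log space:
    # log L = log a - alpha * log N
    slope, intercept = np.polyfit(log_n, log_l, 1)
    return -slope, float(np.exp(intercept))


# Synthetic data generated from a known power law (not results from the paper).
sizes = np.array([1e4, 1e5, 1e6, 1e7])
losses = 5.0 * sizes ** -0.3

alpha, a = fit_power_law(sizes, losses)
```

On real loss curves a constant offset is often included in the fit as well, in which case a nonlinear least-squares routine would be the more natural choice.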

Abstract

Scaling laws have played a major role in the modern AI revolution, giving practitioners the power to predict how model performance will improve with increasing data, compute, and number of model parameters. This has spurred intense interest in the origin of neural scaling laws, with a common suggestion being that they arise from power-law structure already present in the data. In this paper we study scaling laws for transformers trained to predict random walks (bigrams) on graphs with tunable complexity. We demonstrate that this simplified setting already gives rise to neural scaling laws even in the absence of power-law structure in the data correlations. We further consider dialing down the complexity of natural language systematically, by training on sequences sampled from increasingly simplified generative language models, from 4-, 2-, and 1-layer transformer language models down…
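To make the random-walk setting concrete, here is a minimal sketch of how training sequences of this kind could be generated. The choice of an Erdős–Rényi graph, uniform transitions to neighbors, the self-loop fix for isolated nodes, and every name and size below are assumptions for illustration; the abstract does not specify the graph ensemble or the sampling details.

```python
import numpy as np


def random_graph(num_nodes, edge_prob, rng):
    """Undirected Erdos-Renyi adjacency matrix (an assumed graph ensemble)."""
    upper = np.triu(rng.random((num_nodes, num_nodes)) < edge_prob, k=1)
    adj = upper | upper.T
    # Give isolated nodes a self-loop so every node has an outgoing edge.
    idx = np.flatnonzero(~adj.any(axis=1))
    adj[idx, idx] = True
    return adj


def sample_walk(adj, length, rng):
    """Sample a walk: each step moves to a uniformly chosen neighbor.

    The next node depends only on the current one, so the sequence
    has purely bigram (first-order Markov) statistics.
    """
    node = int(rng.integers(adj.shape[0]))
    walk = [node]
    for _ in range(length - 1):
        neighbors = np.flatnonzero(adj[node])
        node = int(rng.choice(neighbors))
        walk.append(node)
    return walk


rng = np.random.default_rng(0)
adj = random_graph(num_nodes=50, edge_prob=0.1, rng=rng)
# Each walk is a token sequence a model could be trained to continue.
walks = [sample_walk(adj, length=32, rng=rng) for _ in range(4)]
```

In this sketch the node count and edge probability act as the complexity knobs: they control how many distinct bigrams a model would have to learn.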

Taxonomy

Topics: Topic Modeling · Advanced Graph Neural Networks · Big Data and Digital Economy