Neural Redshift: Random Networks are not Random Functions

Damien Teney; Armand Nicolicioiu; Valentin Hartmann; Ehsan Abbasnejad

arXiv:2403.02241 · cs.LG · May 1, 2025 · 1 citation

Open Access

TL;DR

This paper examines the inductive biases of untrained, random-weight neural networks. It shows that whether a network favors simple or complex functions depends on its architectural components rather than being inherent to neural networks, which points to a source of generalization that does not rely on gradient descent.

Contribution

The paper demonstrates that architectural components such as ReLUs, residual connections, and layer normalizations determine the complexity bias of untrained networks, complementing explanations based on the implicit biases of gradient descent.

Findings

01 Untrained random networks exhibit strong inductive biases.

02 Architectural features like ReLUs and residual connections influence complexity bias.

03 Transformers inherit these properties from their components.

Abstract

Our understanding of the generalization capabilities of neural networks (NNs) is still incomplete. Prevailing explanations are based on implicit biases of gradient descent (GD) but they cannot account for the capabilities of models from gradient-free methods nor the simplicity bias recently observed in untrained networks. This paper seeks other sources of generalization in NNs. Findings. To understand the inductive biases provided by architectures independently from GD, we examine untrained, random-weight networks. Even simple MLPs show strong inductive biases: uniform sampling in weight space yields a very biased distribution of functions in terms of complexity. But unlike common wisdom, NNs do not have an inherent "simplicity bias". This property depends on components such as ReLUs, residual connections, and layer normalizations. Alternative architectures can be built with a bias…
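The abstract's central observation, that uniform sampling in weight space gives a biased distribution over function complexity, can be illustrated with a small experiment. The sketch below is not the authors' code or their complexity measure. It assumes a 1-D input grid, weights drawn uniformly from [-1, 1], and a crude complexity proxy (the magnitude-weighted mean frequency of the network's output spectrum). The names `random_mlp`, `mean_frequency`, and `complexity_samples`, and all default sizes, are hypothetical choices made for illustration.

```python
import numpy as np


def random_mlp(x, activation, rng, depth=3, width=64, scale=1.0):
    """Evaluate one random-weight MLP on 1-D inputs x of shape [n]."""
    h = x[:, None]  # shape [n, 1]
    for _ in range(depth):
        fan_in = h.shape[1]
        # Uniform weights and biases; the range [-scale, scale] is an assumption.
        w = rng.uniform(-scale, scale, size=(fan_in, width))
        b = rng.uniform(-scale, scale, size=width)
        h = activation(h @ w + b)
    w_out = rng.uniform(-scale, scale, size=(h.shape[1], 1))
    return (h @ w_out)[:, 0]


def mean_frequency(y):
    """Crude complexity proxy: magnitude-weighted mean DFT frequency index."""
    y = y - y.mean()                # remove the constant component
    y = y * np.hanning(y.size)      # taper to reduce edge leakage
    mag = np.abs(np.fft.rfft(y))
    mag[0] = 0.0                    # ignore any residual DC term
    total = mag.sum()
    if total == 0.0:                # constant function: lowest complexity
        return 0.0
    freqs = np.arange(mag.size)
    return float((freqs * mag).sum() / total)


def complexity_samples(activation, n_networks=200, n_points=256, seed=0):
    """Sample random networks and return one complexity score per network."""
    rng = np.random.default_rng(seed)
    x = np.linspace(-1.0, 1.0, n_points)
    return np.array(
        [mean_frequency(random_mlp(x, activation, rng)) for _ in range(n_networks)]
    )


def relu(z):
    return np.maximum(z, 0.0)


if __name__ == "__main__":
    for name, act in [("relu", relu), ("sin", np.sin)]:
        scores = complexity_samples(act)
        print(f"{name}: median={np.median(scores):.2f}, "
              f"min={scores.min():.2f}, max={scores.max():.2f}")
```

Comparing the score distributions for different activations gives a rough sense of how an architectural choice shifts the complexity of randomly sampled functions. The proxy is sensitive to the grid size, weight range, and depth, so the numbers are only meaningful relative to each other under the same settings.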

Taxonomy

Topics: Neural Networks and Applications