Analyzing Neural Scaling Laws in Two-Layer Networks with Power-Law Data Spectra

Roman Worschech; Bernd Rosenow

arXiv:2410.09005·stat.ML·October 14, 2024

TL;DR

This paper gives a theoretical analysis of neural scaling laws in two-layer networks trained on data whose covariance matrix has a power-law spectrum, showing how that data structure shapes the learning dynamics and the generalization error.

Contribution

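It uses techniques from statistical mechanics to analyze the generalization error of two-layer student-teacher networks trained with one-pass stochastic gradient descent on data with power-law covariance spectra, supplying theory for scaling laws that have so far been known mainly from empirical observation.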

Findings

01 Derives analytical expressions for the generalization error with linear activations.

02 Identifies conditions for power-law scaling in learning curves.

03 Shows a transition from exponential to power-law convergence in certain regimes.
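
As a rough illustration of how exponential and power-law decay can both appear, consider a standard gradient-flow calculation for a linear model. This is a textbook-style sketch, not necessarily the paper's derivation. The symbols are assumptions introduced here: \lambda_k are the eigenvalues of the data covariance, L is their number, a_k is the initial student-teacher mismatch along eigendirection k, \eta is the learning rate, and \beta > 0 is the spectral exponent.

```latex
\epsilon_g(t) = \frac{1}{2} \sum_{k=1}^{L} \lambda_k \, a_k^2 \, e^{-2 \eta \lambda_k t},
\qquad \lambda_k \propto k^{-(1+\beta)} .

% Intermediate times, 1/\lambda_1 \ll \eta t \ll 1/\lambda_L, with a_k^2 roughly constant:
% replacing the sum by an integral and substituting u = \eta t \, k^{-(1+\beta)} gives
\epsilon_g(t) \propto t^{-\beta/(1+\beta)} .

% Late times, \eta t \gg 1/\lambda_L: the slowest mode dominates
\epsilon_g(t) \sim e^{-2 \eta \lambda_L t} .
```

Under these assumptions, a spectrum with only a few distinct eigenvalues produces essentially exponential convergence, while a long power-law spectrum opens a window of times in which the error decays as a power law.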

Abstract

Neural scaling laws describe how the performance of deep neural networks scales with key factors such as training data size, model complexity, and training time, often following power-law behaviors over multiple orders of magnitude. Despite their empirical observation, the theoretical understanding of these scaling laws remains limited. In this work, we employ techniques from statistical mechanics to analyze one-pass stochastic gradient descent within a student-teacher framework, where both the student and teacher are two-layer neural networks. Our study primarily focuses on the generalization error and its behavior in response to data covariance matrices that exhibit power-law spectra. For linear activation functions, we derive analytical expressions for the generalization error, exploring different learning regimes and identifying conditions under which power-law scaling emerges.…
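
The setting described in the abstract can be sketched numerically. The example below is a deliberately simplified stand-in, assuming a single linear student and a single linear teacher rather than the paper's two-layer networks, with a diagonal covariance whose eigenvalues follow k^-(1+beta). The function names, the normalization of the spectrum, and all default parameters are hypothetical choices made for illustration.

```python
import numpy as np


def power_law_spectrum(n_dims, beta):
    """Eigenvalues proportional to k^-(1+beta), normalized to sum to 1 (assumed normalization)."""
    k = np.arange(1, n_dims + 1, dtype=float)
    lam = k ** (-(1.0 + beta))
    return lam / lam.sum()


def online_sgd_linear(n_dims=200, beta=1.0, lr=0.2, n_steps=5000, seed=0):
    """One-pass SGD for a linear student learning a noiseless linear teacher.

    Each step draws a fresh Gaussian input with diagonal covariance diag(lam),
    so no sample is ever reused. Returns the population generalization error
    measured before each update.
    """
    rng = np.random.default_rng(seed)
    lam = power_law_spectrum(n_dims, beta)
    teacher = rng.standard_normal(n_dims)
    student = np.zeros(n_dims)
    errors = np.empty(n_steps)
    for t in range(n_steps):
        diff = student - teacher
        # Exact population error for a diagonal covariance: 0.5 * sum_k lam_k * diff_k^2
        errors[t] = 0.5 * np.sum(lam * diff ** 2)
        # Fresh sample x ~ N(0, diag(lam)) and a squared-loss gradient step
        x = np.sqrt(lam) * rng.standard_normal(n_dims)
        student -= lr * (x @ diff) * x
    return errors


if __name__ == "__main__":
    errs = online_sgd_linear()
    for step in (0, 10, 100, 1000, len(errs) - 1):
        print(step, errs[step])
```

Plotting the returned errors on log-log axes is one way to look for an approximately straight segment, which would indicate power-law decay over that range of training steps.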

Videos

Analyzing Neural Scaling Laws in Two-Layer Networks with Power-Law Data Spectra · SlidesLive

Taxonomy

Topics: Neural Networks and Applications