TL;DR
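This paper introduces the hidden manifold model (HMM), a generative model for structured data sets, to study how data structure influences learning in neural networks. The model admits an analytical treatment of training dynamics and of how performance depends on network size, learning rate, and the dimension of the data manifold.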
Contribution
The paper presents the hidden manifold model and proves the Gaussian Equivalence Property, enabling detailed analysis of neural network training on structured data.
Findings
Learning dynamics are captured by integro-differential equations.
Network performance depends on the size of the network, the learning rate, and the dimension of the data manifold.
The model explains how neural networks learn functions of increasing complexity.
Abstract
Understanding the reasons for the success of deep neural networks trained using stochastic gradient-based methods is a key open problem for the nascent theory of deep learning. The types of data where these networks are most successful, such as images or sequences of speech, are characterised by intricate correlations. Yet, most theoretical work on neural networks does not explicitly model training data, or assumes that elements of each data sample are drawn independently from some factorised probability distribution. These approaches are thus by construction blind to the correlation structure of real-world data sets and their impact on learning in neural networks. Here, we introduce a generative model for structured data sets that we call the hidden manifold model (HMM). The idea is to construct high-dimensional inputs that lie on a lower-dimensional manifold, with labels that depend…
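The construction described in the abstract can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `sample_hidden_manifold`, the `tanh` nonlinearity, the Gaussian feature matrix, the `1/sqrt(latent_dim)` scaling, and the sign-based teacher that produces labels are all hypothetical choices. In particular, the abstract is truncated before it says what the labels depend on, so the sketch simply assumes they are a function of the latent coordinates.

```python
import math
import random


def sample_hidden_manifold(n_samples, input_dim, latent_dim, seed=0):
    """Toy generator: high-dimensional inputs built from low-dimensional latents.

    Assumptions (not taken from the paper): Gaussian latents and features,
    a tanh nonlinearity, and labels given by the sign of a random linear
    function of the latent coordinates.
    """
    rng = random.Random(seed)
    # Fixed feature matrix (latent_dim x input_dim) shared by all samples.
    features = [[rng.gauss(0.0, 1.0) for _ in range(input_dim)]
                for _ in range(latent_dim)]
    # Hypothetical teacher weights that act on the latent coordinates only.
    teacher = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
    scale = 1.0 / math.sqrt(latent_dim)

    inputs, labels, latents = [], [], []
    for _ in range(n_samples):
        # Latent coordinates: the sample's position on the hidden manifold.
        c = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        # Each input component is a nonlinear function of the latent vector.
        x = [math.tanh(scale * sum(c[r] * features[r][i]
                                   for r in range(latent_dim)))
             for i in range(input_dim)]
        # Assumed labelling rule: sign of a linear readout of the latents.
        readout = sum(c[r] * teacher[r] for r in range(latent_dim))
        y = 1.0 if readout >= 0.0 else -1.0
        inputs.append(x)
        labels.append(y)
        latents.append(c)
    return inputs, labels, latents
```

Calling `sample_hidden_manifold(1000, 500, 10)` would give 500-dimensional inputs whose variation is controlled by only 10 latent coordinates, which is the kind of correlation structure that a factorised input distribution cannot represent.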
