Learning with Shallow Neural Networks on Cluster-Structured Features
Elisabetta Cornacchia, Laurent Massoulié

TL;DR
This paper investigates how cluster-structured features in high-dimensional data affect the sample complexity of training shallow neural networks with gradient descent, and identifies conditions under which that complexity depends on the number of latent variables rather than on the input dimension.
Contribution
It introduces a tractable model for analyzing how spatial correlations among input features affect sample complexity, and shows that this complexity is independent of the input dimension when the signal-to-noise ratio is high.
Findings
Sample complexity scales with the number of latent variables.
When the signal-to-noise ratio is high, sample complexity is independent of the input dimension.
Empirical results confirm theoretical predictions on synthetic and real data.
Abstract
The success of deep learning in high-dimensional settings is often attributed to the presence of low-dimensional structure in real-world data. While standard theoretical models typically assume that this structure lies in the target function, projecting unstructured inputs onto a low-dimensional subspace, data such as images, text or genomic sequences exhibit strong spatial correlations within the input space itself. In this paper, we propose a tractable model to study how these correlations affect the sample complexity of learning with gradient descent on shallow neural networks. Specifically, we consider targets that depend on a small number of latent Boolean variables, and input features grouped into clusters and correlated with the latent variables. Under an identifiability assumption, we show that for a layerwise gradient-descent variant, the sample complexity scales with the…
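The data model described in the abstract can be illustrated with a toy generator. This is a minimal sketch under stated assumptions, not the authors' construction: the ±1 encoding, equal-sized clusters, independent flip noise (`flip_prob`), the product-of-two-latents target, and the function names `sample_clustered` and `cluster_majority` are all hypothetical choices made for illustration.

```python
import random

def sample_clustered(n, k, cluster_size, flip_prob, seed=0):
    """Draw n samples with k latent signs, each copied into one cluster of noisy features.

    Assumes k >= 2 because the illustrative target uses the first two latents.
    """
    rng = random.Random(seed)
    xs, ys, zs = [], [], []
    for _ in range(n):
        # Latent Boolean variables, encoded as +1 / -1 (assumed encoding).
        z = [rng.choice((-1, 1)) for _ in range(k)]
        x = []
        for j in range(k):
            for _ in range(cluster_size):
                # Each feature copies its cluster's latent, flipped with probability flip_prob.
                sign = -1 if rng.random() < flip_prob else 1
                x.append(sign * z[j])
        # Hypothetical target: product (parity) of the first two latents.
        y = z[0] * z[1]
        xs.append(x)
        ys.append(y)
        zs.append(z)
    return xs, ys, zs

def cluster_majority(x, k, cluster_size):
    """Estimate each latent by the sign of the sum over its cluster (ties go to +1)."""
    est = []
    for j in range(k):
        s = sum(x[j * cluster_size:(j + 1) * cluster_size])
        est.append(1 if s >= 0 else -1)
    return est
```

In this sketch the input dimension is `k * cluster_size`, while the target depends only on the `k` latents. When `flip_prob` is small (a high signal-to-noise regime in this toy setting), the per-cluster majority vote tends to recover the latents, which gives some intuition for why learning difficulty might track the number of latent variables rather than the input dimension.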
