The Computational Advantage of Depth: Learning High-Dimensional Hierarchical Functions with Gradient Descent
Yatin Dandi, Luca Pesce, Lenka Zdeborová, Florent Krzakala

TL;DR
This paper demonstrates that deep neural networks trained with gradient descent can efficiently learn high-dimensional hierarchical functions by successively reducing effective dimensionality, outperforming shallow models in sample efficiency.
Contribution
The paper introduces a theoretical framework for understanding how depth enables neural networks to learn hierarchical functions more efficiently than shallow models.
Findings
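Feature learning with gradient descent successively reduces the effective dimensionality, turning one high-dimensional problem into a sequence of lower-dimensional ones.
Deep models need far fewer samples than shallow networks to learn hierarchical target functions.
The results are proven in a controlled training setting; the paper also discusses more common training procedures.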
Abstract
Understanding the advantages of deep neural networks trained by gradient descent (GD) compared to shallow models remains an open theoretical challenge. In this paper, we introduce a class of target functions (single- and multi-index Gaussian hierarchical targets) that incorporate a hierarchy of latent subspace dimensionalities. This framework enables us to analytically study the learning dynamics and generalization performance of deep networks compared to shallow ones in the high-dimensional limit. Specifically, our main theorem shows that feature learning with GD successively reduces the effective dimensionality, transforming a high-dimensional problem into a sequence of lower-dimensional ones. This enables learning the target function with drastically fewer samples than with shallow networks. While the results are proven in a controlled training setting, we also discuss more common…
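To make "a hierarchy of latent subspace dimensionalities" more concrete, the toy construction below may help. It is only an illustrative sketch and not the paper's definition of its targets: the function name `make_hierarchical_target`, the dimensions `d` and `d1`, the squared inner nonlinearity, and the `tanh` outer link are all assumptions chosen for illustration.

```python
import numpy as np

def make_hierarchical_target(d, d1, seed=0):
    """Toy two-level target: x in R^d -> h in R^{d1} -> scalar, with d1 << d.

    Hypothetical construction for illustration only; not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    # First level: d1 orthonormal directions in R^d (rows of W1).
    Q, _ = np.linalg.qr(rng.standard_normal((d, d1)))
    W1 = Q.T  # shape (d1, d)
    # Second level: a single unit direction inside the d1-dimensional feature space.
    a = rng.standard_normal(d1)
    a /= np.linalg.norm(a)

    def target(X):
        # Level-1 features: a nonlinearity applied to the low-dimensional projection.
        H = (X @ W1.T) ** 2 - 1.0      # shape (n, d1)
        # Level-2 output: a scalar function of a projection of those features.
        z = H @ a / np.sqrt(d1)        # shape (n,)
        return np.tanh(z)

    return target, W1, a

# Usage: Gaussian inputs in high dimension, labels from the toy target.
d, d1, n = 200, 5, 1000
target, W1, a = make_hierarchical_target(d, d1)
X = np.random.default_rng(1).standard_normal((n, d))
y = target(X)
```

In this sketch the label depends on the `d`-dimensional input only through a `d1`-dimensional projection, and then only through one direction of the resulting features. A learner that first recovers the subspace spanned by `W1` is left with a `d1`-dimensional problem, which is the kind of successive dimensionality reduction the abstract describes.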
Taxonomy
Topics: 3D Shape Modeling and Analysis · Industrial Vision Systems and Defect Detection
