Learning Multi-Index Models with Neural Networks via Mean-Field Langevin Dynamics
Alireza Mousavi-Hosseini, Denny Wu, Murat A. Erdogdu

TL;DR
This paper analyzes how neural networks can efficiently learn multi-index models in high dimensions by leveraging mean-field Langevin dynamics, highlighting conditions for polynomial-time convergence and the role of low-dimensional structures.
Contribution
It introduces a framework for understanding the sample and computational complexities of neural networks learning multi-index models, and explores conditions for polynomial-time convergence of the mean-field Langevin algorithm.
Findings
Sample complexity scales almost linearly with the effective dimension.
Neural networks adapt to latent low-dimensional structures.
Polynomial-time convergence is possible on manifolds with positive Ricci curvature.
Abstract
We study the problem of learning multi-index models in high dimensions using a two-layer neural network trained with the mean-field Langevin algorithm. Under mild distributional assumptions on the data, we characterize the effective dimension $d_{\mathrm{eff}}$ that controls both sample and computational complexity by utilizing the adaptivity of neural networks to latent low-dimensional structures. When the data exhibit such a structure, $d_{\mathrm{eff}}$ can be significantly smaller than the ambient dimension. We prove that the sample complexity grows almost linearly with $d_{\mathrm{eff}}$, bypassing the limitations of the information and generative exponents that appeared in recent analyses of gradient-based feature learning. On the other hand, the computational complexity may inevitably grow exponentially with $d_{\mathrm{eff}}$ in the worst-case scenario. Motivated by improving…
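For orientation, the training procedure named in the abstract can be sketched as noisy gradient descent on the neurons of a mean-field two-layer network. The sketch below is illustrative only and is not taken from the paper: the tanh activation, squared loss, fixed unit second-layer weights, the hyperparameter values, and the names `mfld_train` and `predict` are all assumptions chosen for brevity.

```python
import numpy as np

def mfld_train(X, y, n_particles=64, steps=200, lr=0.05,
               weight_decay=0.01, temperature=1e-3, seed=0):
    """Discretized mean-field Langevin dynamics (illustrative sketch).

    Network (assumed form): f(x) = mean_i tanh(<w_i, x>).
    Each neuron w_i follows a gradient step on the squared loss,
    plus weight decay, plus injected Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.standard_normal((n_particles, d)) / np.sqrt(d)
    for _ in range(steps):
        act = np.tanh(X @ W.T)                 # (n, n_particles)
        resid = act.mean(axis=1) - y           # (n,)
        # Mean-field scaling: per-neuron gradient of the first variation
        # of 0.5 * mean(resid**2); tanh'(z) = 1 - tanh(z)**2.
        grad = ((resid[:, None] * (1.0 - act ** 2)).T @ X) / n
        noise = rng.standard_normal(W.shape)
        W = (W - lr * (grad + weight_decay * W)
             + np.sqrt(2.0 * lr * temperature) * noise)
    return W

def predict(W, X):
    """Evaluate the mean-field network on the rows of X."""
    return np.tanh(X @ W.T).mean(axis=1)
```

Under these assumptions, `weight_decay` and `temperature` play the roles of the L2 and entropic regularization strengths; setting `temperature=0` reduces the update to plain gradient descent with weight decay.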
Taxonomy
Topics: Neural Networks and Applications · Gaussian Processes and Bayesian Inference
