Learning Multi-Index Models with Neural Networks via Mean-Field Langevin Dynamics

Alireza Mousavi-Hosseini; Denny Wu; Murat A. Erdogdu

arXiv:2408.07254·stat.ML·March 28, 2025

Open Access

TL;DR

This paper analyzes how neural networks can efficiently learn multi-index models in high dimensions by leveraging mean-field Langevin dynamics, highlighting conditions for polynomial-time convergence and the role of low-dimensional structures.

Contribution

It introduces a framework for understanding the sample and computational complexities of neural networks learning multi-index models, and explores conditions for polynomial-time convergence of the mean-field Langevin algorithm.

Findings

01. Sample complexity scales almost linearly with the effective dimension.

02. Neural networks adapt to latent low-dimensional structures in the data.

03. Polynomial-time convergence is possible when the network weights are constrained to a manifold with positive Ricci curvature.

Abstract

We study the problem of learning multi-index models in high dimensions using a two-layer neural network trained with the mean-field Langevin algorithm. Under mild distributional assumptions on the data, we characterize the effective dimension $d_{\mathrm{eff}}$ that controls both sample and computational complexity by utilizing the adaptivity of neural networks to latent low-dimensional structures. When the data exhibit such a structure, $d_{\mathrm{eff}}$ can be significantly smaller than the ambient dimension. We prove that the sample complexity grows almost linearly with $d_{\mathrm{eff}}$, bypassing the limitations of the information and generative exponents that appeared in recent analyses of gradient-based feature learning. On the other hand, the computational complexity may inevitably grow exponentially with $d_{\mathrm{eff}}$ in the worst-case scenario. Motivated by improving…
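To make the setup in the abstract concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes the common discretization of mean-field Langevin dynamics as noisy gradient descent with weight decay on the neurons of a mean-field two-layer network. The toy target depends on the input only through a k-dimensional projection, which is what makes it a multi-index model. Everything specific here is an illustrative assumption: the Gaussian inputs, the `tanh` link and activation, the function names `make_data` and `mfld`, and the values of the step size `lr`, the weight decay `lam`, and the inverse temperature `beta`.

```python
import numpy as np

def make_data(n, d, k, rng):
    """Toy multi-index data: y depends on x only through the projection U x."""
    # Hypothetical choices: Gaussian inputs and link g(z) = sum_i tanh(z_i).
    U = np.linalg.qr(rng.standard_normal((d, k)))[0].T  # (k, d), orthonormal rows
    X = rng.standard_normal((n, d))
    y = np.tanh(X @ U.T).sum(axis=1)
    return X, y, U

def mfld(X, y, m=200, steps=300, lr=0.05, lam=1e-3, beta=1e4, seed=0):
    """Noisy gradient descent on the neurons (a_j, w_j) of the network
    f(x) = (1/m) * sum_j a_j * tanh(<w_j, x>)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    a = rng.standard_normal(m)
    W = rng.standard_normal((m, d)) / np.sqrt(d)
    noise_scale = np.sqrt(2.0 * lr / beta)  # Langevin noise at temperature 1/beta
    losses = []
    for _ in range(steps):
        H = np.tanh(X @ W.T)                 # (n, m) hidden activations
        resid = H @ a / m - y                # (n,) prediction error
        losses.append(0.5 * np.mean(resid ** 2))
        # Per-neuron gradients in mean-field scaling (m times the loss gradient).
        grad_a = H.T @ resid / n
        grad_W = (resid[:, None] * (1.0 - H ** 2) * a[None, :]).T @ X / n
        # Gradient step + weight decay (lam) + Gaussian noise (entropy term).
        a = a - lr * (grad_a + lam * a) + noise_scale * rng.standard_normal(m)
        W = W - lr * (grad_W + lam * W) + noise_scale * rng.standard_normal((m, d))
    return a, W, np.array(losses)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X, y, U = make_data(n=500, d=10, k=2, rng=rng)
    a, W, losses = mfld(X, y)
    print("initial loss:", losses[0], "final loss:", losses[-1])
```

The injected noise and weight decay are what separate this from plain gradient descent: together they correspond to an entropy-regularized objective over the distribution of neurons. The sketch does not reproduce the paper's assumptions or its complexity guarantees.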

Taxonomy

Topics: Neural Networks and Applications · Gaussian Processes and Bayesian Inference