Learning Gaussian Multi-Index Models with Gradient Flow: Time Complexity and Directional Convergence

Berfin Şimşek; Amire Bendjeddou; Daniel Hsu

arXiv:2411.08798·cs.LG·March 12, 2025

TL;DR

This paper analyzes the gradient flow dynamics of neural networks approximating multi-index functions on Gaussian data, revealing conditions for convergence, time complexity, and phase transitions based on the geometry of index vectors.

Contribution

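The paper generalizes single-index results on the search phase to multi-index functions, characterizes the fixed points when the index vectors are orthogonal, and identifies thresholds, tied to how far the index vectors are from orthogonal, that affect convergence under correlation loss.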

Findings

01. Neurons converge to index vectors when the index vectors are orthogonal.

02. Overcoming the search phase takes polynomial time in multi-index models, as in the single-index case.

03. The effectiveness of correlation loss depends on how close the index vectors are to orthogonal.

Abstract

This work focuses on the gradient flow dynamics of a neural network model that uses correlation loss to approximate a multi-index function on high-dimensional standard Gaussian data. Specifically, the multi-index function we consider is a sum of neurons $f^{*}(x) = \sum_{j = 1}^{k} \sigma^{*}(v_{j}^{T} x)$ where $v_{1}, \dots, v_{k}$ are unit vectors, and $\sigma^{*}$ lacks the first and second Hermite polynomials in its Hermite expansion. It is known that, for the single-index case ($k = 1$), overcoming the search phase requires polynomial time complexity. We first generalize this result to multi-index functions characterized by vectors in arbitrary directions. After the search phase, it is not clear whether the network neurons converge to the index vectors, or get stuck at a sub-optimal solution. When the index vectors are orthogonal, we give a complete characterization of the fixed points and…
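To make the setup in the abstract concrete, here is a minimal NumPy sketch of the target $f^{*}$ and of a correlation loss for a single neuron. Everything beyond the formula for $f^{*}$ is an assumption made for illustration and not taken from the paper: the choice $\sigma^{*} = \mathrm{He}_3$ (the third probabilists' Hermite polynomial, which has no first or second Hermite component), the use of the same activation for the student neuron, the form of the loss as a negative empirical correlation, and all function names (`he3`, `orthonormal_rows`, `multi_index_target`, `correlation_loss`).

```python
import numpy as np


def he3(z):
    """Third probabilists' Hermite polynomial, He_3(z) = z^3 - 3z."""
    return z**3 - 3.0 * z


def orthonormal_rows(d, m, rng):
    """Return m orthonormal vectors in R^d as the rows of an (m, d) array."""
    q, _ = np.linalg.qr(rng.standard_normal((d, m)))
    return q.T


def multi_index_target(x, V, sigma_star=he3):
    """f*(x) = sum_j sigma*(v_j^T x); x is (n, d), V is (k, d) with unit rows."""
    return sigma_star(x @ V.T).sum(axis=1)


def correlation_loss(w, x, y, sigma=he3):
    """Empirical correlation loss of one neuron: -mean(y * sigma(w^T x)).

    w is normalized to the unit sphere first (an assumption of this sketch).
    """
    w = w / np.linalg.norm(w)
    return -np.mean(y * sigma(x @ w))


# Hypothetical problem sizes, chosen only to keep the example fast.
rng = np.random.default_rng(0)
d, k, n = 20, 3, 200_000

# k orthonormal index vectors plus one extra direction orthogonal to all of them.
basis = orthonormal_rows(d, k + 1, rng)
V, w_perp = basis[:k], basis[k]

x = rng.standard_normal((n, d))  # standard Gaussian inputs
y = multi_index_target(x, V)

loss_aligned = correlation_loss(V[0], x, y)      # neuron sitting on an index vector
loss_orthogonal = correlation_loss(w_perp, x, y)  # neuron orthogonal to all index vectors
print(loss_aligned, loss_orthogonal)
```

For unit vectors $u$ and $v$ and standard Gaussian $x$, $\mathbb{E}[\mathrm{He}_3(u^{T} x)\,\mathrm{He}_3(v^{T} x)] = 3!\,(u^{T} v)^{3}$. Under the assumptions above, the population loss is therefore $-6$ at an index vector and $0$ in a direction orthogonal to all of them, and the two printed values should be close to these up to Monte Carlo error.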

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Air Quality Monitoring and Forecasting · Machine Learning and Data Classification

Methods: Sparse Evolutionary Training