Behind the Scenes of Gradient Descent: A Trajectory Analysis via Basis Function Decomposition
Jianhao Ma, Lingjun Guo, Salar Fattahi

TL;DR
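This paper introduces a basis function decomposition for analyzing gradient descent trajectories. It shows that the trajectory's coefficients in an appropriate orthonormal function basis evolve almost monotonically, improves the convergence result for symmetric matrix factorization, gives a new convergence result for orthogonal symmetric tensor decomposition, and validates the framework empirically on deep neural networks.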
Contribution
It proposes a novel basis function decomposition framework for analyzing gradient descent trajectories, leading to new convergence results and insights into neural network training.
Findings
Gradient descent trajectories are nearly monotonic when projected onto an appropriate basis.
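The framework improves the convergence result for gradient descent on symmetric matrix factorization and yields a new convergence result for orthogonal symmetric tensor decomposition.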
Empirical results show monotonic learning of basis coefficients in deep neural networks.
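The findings above can be illustrated on a toy symmetric matrix factorization problem. The sketch below is not the authors' code: it assumes the objective f(U) = 1/4 ||U U^T - M||_F^2, takes the eigenvectors of M as the orthonormal basis, and tracks the coefficients v_i^T U U^T v_i along the gradient descent path. The function name, step size, initialization scale, and problem dimensions are all hypothetical choices made for illustration.

```python
import numpy as np


def gd_coefficient_trajectory(lam, rank, steps=4000, lr=0.02,
                              init_scale=1e-3, seed=0):
    """Run GD on f(U) = 1/4 * ||U U^T - M||_F^2 and record basis coefficients.

    M is built as V diag(lam) V^T with a random orthonormal V (an assumption
    made for this toy example). The coefficient of basis direction v_i at
    step t is v_i^T U_t U_t^T v_i.
    """
    rng = np.random.default_rng(seed)
    d = len(lam)
    # Random orthonormal basis from the QR factorization of a Gaussian matrix.
    V, _ = np.linalg.qr(rng.standard_normal((d, d)))
    M = V @ np.diag(lam) @ V.T
    # Small random initialization.
    U = init_scale * rng.standard_normal((d, rank))

    coeffs = np.empty((steps + 1, d))
    for t in range(steps + 1):
        W = V.T @ U                       # iterate expressed in the basis V
        coeffs[t] = np.sum(W * W, axis=1)  # diagonal of V^T U U^T V
        U = U - lr * (U @ U.T - M) @ U     # gradient step
    return coeffs


lam = np.array([4.0, 2.0, 1.0, 0.0, 0.0])  # eigenvalues of the rank-3 target
coeffs = gd_coefficient_trajectory(lam, rank=3)

# Largest single-step decrease of each coefficient along the trajectory.
largest_drop = np.max(np.maximum(coeffs[:-1] - coeffs[1:], 0.0), axis=0)
print("final coefficients:", np.round(coeffs[-1], 4))
print("largest one-step decrease per coefficient:", largest_drop)
```

Plotting each column of `coeffs` against the step index gives a quick visual check of how close to monotonic each coefficient is in this toy setting. The printed one-step decreases quantify any deviation from monotonicity.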
Abstract
This work analyzes the solution trajectory of gradient-based algorithms via a novel basis function decomposition. We show that, although solution trajectories of gradient-based algorithms may vary depending on the learning task, they behave almost monotonically when projected onto an appropriate orthonormal function basis. Such projection gives rise to a basis function decomposition of the solution trajectory. Theoretically, we use our proposed basis function decomposition to establish the convergence of gradient descent (GD) on several representative learning tasks. In particular, we improve the convergence of GD on symmetric matrix factorization and provide a completely new convergence result for the orthogonal symmetric tensor decomposition. Empirically, we illustrate the promise of our proposed framework on realistic deep neural networks (DNNs) across different architectures,…
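In generic notation, the projection described above can be written as follows. The symbols are assumptions introduced here for illustration and may differ from the paper's: f_{\theta_t} is the model at iteration t, \{\phi_i\} is an orthonormal function basis, and \beta_i(\theta_t) are the coefficients whose evolution the paper studies.

```latex
\[
  f_{\theta_t} \;=\; \sum_{i} \beta_i(\theta_t)\,\phi_i,
  \qquad
  \beta_i(\theta_t) \;=\; \langle f_{\theta_t}, \phi_i \rangle,
  \qquad
  \langle \phi_i, \phi_j \rangle \;=\; \delta_{ij}.
\]
```

Under this reading, the claim of near-monotonicity concerns each scalar sequence t \mapsto \beta_i(\theta_t) rather than the parameters \theta_t themselves.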
Taxonomy
Topics: Tensor decomposition and applications · Model Reduction and Neural Networks · Advanced Neuroimaging Techniques and Applications
