Predictable Gradient Manifolds in Deep Learning: Temporal Path-Length and Intrinsic Rank as a Complexity Regime
Anherutowa Calvo

TL;DR
This paper introduces a measurable framework for describing the structure of gradient trajectories in deep learning. It finds that gradients along a training run are temporally predictable and evolve within a low-dimensional subspace, which has implications for optimization analysis and algorithm design.
Contribution
The paper formalizes predictable gradient manifolds through two measurable quantities, path length and intrinsic rank, restates optimization guarantees in terms of these quantities, and provides tools for diagnosing them empirically.
Findings
Gradients are locally predictable across architectures.
Gradient trajectories exhibit low-rank structure over time.
Both properties are stable and can be diagnosed from logged gradients.
Abstract
Deep learning optimization exhibits structure that is not captured by worst-case gradient bounds. Empirically, gradients along training trajectories are often temporally predictable and evolve within a low-dimensional subspace. In this work we formalize this observation through a measurable framework for predictable gradient manifolds. We introduce two computable quantities: a prediction-based path length that measures how well gradients can be forecast from past information, and a predictable rank that quantifies the intrinsic temporal dimension of gradient increments. We show how classical online and nonconvex optimization guarantees can be restated so that convergence and regret depend explicitly on these quantities, rather than on worst-case variation. Across convolutional networks, vision transformers, language models, and synthetic control tasks, we find that gradient…
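The two quantities named in the abstract can be illustrated from a matrix of logged gradients. The sketch below is not the paper's definition, which the truncated abstract does not give. It assumes a simple last-value predictor (the forecast for step t is the gradient at step t-1) and estimates rank as the number of singular directions needed to capture a chosen fraction of the energy of the gradient increments. The function names and the `energy` threshold are hypothetical choices made for illustration.

```python
import numpy as np


def prediction_path_length(grads):
    """Sum of squared forecast errors along a logged gradient trajectory.

    Assumption (not from the paper): the forecast for step t is simply the
    gradient observed at step t-1.
    grads: array of shape (T, d), one flattened gradient per step.
    """
    grads = np.asarray(grads, dtype=float)
    errors = grads[1:] - grads[:-1]  # g_t minus its last-value forecast
    return float(np.sum(errors ** 2))


def predictable_rank(grads, energy=0.99):
    """Number of singular directions of the gradient increments needed to
    capture an `energy` fraction of their total squared norm.

    Assumption (not from the paper): rank is measured by an energy
    threshold on the singular values of the increment matrix.
    """
    grads = np.asarray(grads, dtype=float)
    increments = np.diff(grads, axis=0)  # shape (T-1, d)
    s = np.linalg.svd(increments, compute_uv=False)
    total = np.sum(s ** 2)
    if total == 0.0:
        return 0  # constant trajectory: no increments to explain
    cumulative = np.cumsum(s ** 2) / total
    return int(np.searchsorted(cumulative, energy) + 1)


# Synthetic check: a random walk confined to a 2-dimensional subspace of R^50.
rng = np.random.default_rng(0)
basis, _ = np.linalg.qr(rng.standard_normal((50, 2)))  # orthonormal columns
coeffs = np.cumsum(rng.standard_normal((200, 2)), axis=0)
logged = coeffs @ basis.T  # shape (200, 50)

print("path length:", prediction_path_length(logged))
print("estimated rank:", predictable_rank(logged))
```

On real training runs the rows of `grads` would be flattened gradients saved at each step (or a random projection of them, to keep the matrix small). A better predictor, such as an exponential moving average, would change only the forecast line in `prediction_path_length`.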
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Advanced Bandit Algorithms Research · Sparse and Compressive Sensing Techniques
