Gradient Descent as Loss Landscape Navigation: a Normative Framework for Deriving Learning Rules
John J. Vastola, Samuel J. Gershman, Kanaka Rajan

TL;DR
This paper introduces a theoretical framework that models learning rules as policies for navigating loss landscapes. It unifies gradient descent, momentum, natural gradients, and adaptive optimizers such as Adam, and provides a foundation for designing adaptive learning methods.
Contribution
It presents a normative framework that derives learning rules as solutions to an optimal control problem, connecting many existing algorithms under a single theoretical perspective.
Findings
Gradient descent emerges from short-horizon optimization.
Momentum corresponds to longer-horizon planning.
Adaptive optimizers like Adam emerge from online Bayesian inference of the loss landscape's shape.
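The first two findings can be illustrated with a minimal sketch. It assumes a textbook reading of "short-horizon optimization": choose the step that minimizes the linearized loss change plus a quadratic movement cost, which has the closed-form solution delta = -eta * grad, i.e. a gradient descent step. This is not necessarily the paper's exact control formulation. The quadratic test loss, the helper names (`local_objective`, `greedy_step`, `heavy_ball_step`), and the hyperparameters are all hypothetical choices made for illustration, and the momentum rule shown is the classical heavy-ball update rather than a rule derived from the paper.

```python
import numpy as np

# Hypothetical test problem: an ill-conditioned quadratic loss 0.5 * theta^T A theta.
A = np.diag([1.0, 100.0])

def loss(theta):
    return 0.5 * theta @ A @ theta

def grad(theta):
    return A @ theta

def local_objective(delta, theta, eta):
    """Linearized loss change plus a quadratic movement cost (assumed form)."""
    return grad(theta) @ delta + (delta @ delta) / (2.0 * eta)

def greedy_step(theta, eta):
    """Closed-form minimizer of local_objective: delta = -eta * grad."""
    return theta - eta * grad(theta)

def heavy_ball_step(theta, velocity, eta, beta):
    """Classical heavy-ball momentum: the velocity carries past gradients forward."""
    velocity = beta * velocity - eta * grad(theta)
    return theta + velocity, velocity

# Illustrative hyperparameters, not taken from the paper.
eta, beta, steps = 0.009, 0.9, 100
theta0 = np.array([1.0, 1.0])

theta_gd = theta0.copy()
theta_mom, velocity = theta0.copy(), np.zeros_like(theta0)
for _ in range(steps):
    theta_gd = greedy_step(theta_gd, eta)
    theta_mom, velocity = heavy_ball_step(theta_mom, velocity, eta, beta)

print("initial loss:", loss(theta0))
print("gradient descent:", loss(theta_gd))
print("heavy-ball momentum:", loss(theta_mom))
```

On this particular problem the greedy rule makes slow progress along the shallow direction, while the velocity term lets the momentum iterate keep moving along it. Other losses or hyperparameters may behave differently.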
Abstract
Learning rules -- prescriptions for updating model parameters to improve performance -- are typically assumed rather than derived. Why do some learning rules work better than others, and under what assumptions can a given rule be considered optimal? We propose a theoretical framework that casts learning rules as policies for navigating (partially observable) loss landscapes, and identifies optimal rules as solutions to an associated optimal control problem. A range of well-known rules emerge naturally within this framework under different assumptions: gradient descent from short-horizon optimization, momentum from longer-horizon planning, natural gradients from accounting for parameter space geometry, non-gradient rules from partial controllability, and adaptive optimizers like Adam from online Bayesian inference of loss landscape shape. We further show that continual learning…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Reinforcement Learning in Robotics · AI-based Problem Solving and Planning
