Function-Space Learning Rates
Edward Milsom, Ben Anson, Laurence Aitchison

TL;DR
This paper introduces a method to measure and set layerwise function-space learning rates in neural networks, which supports the analysis of optimizer dynamics in function space and hyperparameter transfer across model scales.
Contribution
The paper develops efficient techniques for measuring and setting function-space learning rates and proposes FLeRM (Function-space Learning Rate Matching), a method for hyperparameter transfer across model sizes.
Findings
FLeRM effectively transfers hyperparameters across model scales.
Function-space learning rates provide new insights into optimizer dynamics.
The method adds only minimal computational overhead: a few additional backward passes, performed at the start of training or periodically during it.
Abstract
We consider layerwise function-space learning rates, which measure the magnitude of the change in a neural network's output function in response to an update to a parameter tensor. This contrasts with traditional learning rates, which describe the magnitude of changes in parameter space. We develop efficient methods to measure and set function-space learning rates in arbitrary neural networks, requiring only minimal computational overhead through a few additional backward passes that can be performed at the start of, or periodically during, training. We demonstrate two key applications: (1) analysing the dynamics of standard neural network optimisers in function space, rather than parameter space, and (2) introducing FLeRM (Function-space Learning Rate Matching), a novel approach to hyperparameter transfer across model scales. FLeRM records function-space learning rates while training a…
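The quantity described in the abstract can be illustrated with a brute-force sketch: apply an update to one parameter tensor at a time and measure how much the network's outputs move on a batch of inputs. This is not the paper's estimator, which relies on a few additional backward passes; it is a direct comparison of forward passes. The two-layer network, the RMS normalisation, the random updates, and the names `mlp_forward` and `function_space_change` are all assumptions made for the example.

```python
import numpy as np

def mlp_forward(params, x):
    """Toy two-layer tanh MLP; params maps tensor names to weight arrays."""
    hidden = np.tanh(x @ params["W1"])
    return hidden @ params["W2"]

def function_space_change(params, updates, x, name):
    """RMS change in the outputs on batch x when only tensor `name` is updated.

    Assumption: RMS over batch and output dimensions is used as the
    "magnitude"; the paper's exact normalisation may differ.
    """
    perturbed = dict(params)  # shallow copy, so other tensors stay untouched
    perturbed[name] = params[name] + updates[name]
    delta = mlp_forward(perturbed, x) - mlp_forward(params, x)
    return float(np.sqrt(np.mean(delta ** 2)))

rng = np.random.default_rng(0)
params = {
    "W1": rng.normal(size=(8, 16)) / np.sqrt(8),
    "W2": rng.normal(size=(16, 4)) / np.sqrt(16),
}

# Hypothetical updates with the same parameter-space scale for every tensor.
lr = 1e-2
updates = {k: -lr * rng.normal(size=v.shape) for k, v in params.items()}
x = rng.normal(size=(32, 8))

# Parameter-space step size per tensor (RMS of the update itself).
param_space = {k: float(np.sqrt(np.mean(u ** 2))) for k, u in updates.items()}
# Function-space step size per tensor (RMS change in the outputs).
layerwise = {k: function_space_change(params, updates, x, k) for k in params}

for k in params:
    print(k, "parameter-space:", param_space[k], "function-space:", layerwise[k])
```

Running the sketch prints, for each tensor, the size of the update in parameter space next to the size of the resulting change in the outputs. The two need not agree across layers, which is the contrast with traditional learning rates that the abstract draws.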
Taxonomy
Topics: Neural Networks and Applications
Methods: Sparse Evolutionary Training · Balanced Selection
