Navigating Potholes with Geometry-Aware Sharpness Minimization
Simon Dufort-Labbé, Mehrab Hamidi, Razvan Pascanu, Ioannis Mitliagkas, Damien Scieur, Aristide Baratin

TL;DR
This paper introduces LLQR+SAM, an optimization method that combines a learned, geometry-aware preconditioner from the LLQR framework with sharpness-aware minimization (SAM), so that SAM's perturbations follow the geometry of the loss landscape rather than treating all parameter directions uniformly.
Contribution
The paper proposes LLQR+SAM, which applies the SAM perturbation on top of a preconditioner learned by LLQR and maintained as a slow exponential moving average, so that the perturbation accounts for loss landscape geometry.
Findings
LLQR+SAM outperforms both SAM alone and LLQR alone on vision and sequence benchmarks.
The method effectively identifies and escapes sharp potholes in the loss landscape.
The two-timescale approach enhances optimization stability and convergence.
Abstract
Sharpness-aware minimization (SAM) encourages flat minima by perturbing parameters along directions of high loss curvature, but treats all parameter directions uniformly, ignoring the underlying loss geometry. We introduce LLQR+SAM, which combines SAM with a learned preconditioner obtained from the recently proposed LLQR framework, a second-order method that recasts steepest descent as a layerwise linear-quadratic regulator problem. The preconditioner is updated sparsely and maintained as a slow exponential moving average, so it captures a smoothed, low-resolution picture of the loss landscape geometry. The SAM perturbation then operates on top of this learned geometry, probing curvature at a faster timescale. We show that this two-timescale structure is not merely a computational convenience: theoretically, the preconditioner amplifies the SAM escape signal in directions that are flat…
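The two-timescale structure described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the LLQR preconditioner is not specified here, so the code swaps in a diagonal, RMSProp-style exponential moving average of squared gradients as a stand-in. The function names, the toy quadratic loss, and all hyperparameters (`lr`, `rho`, `beta`, `update_every`) are assumptions chosen for illustration only.

```python
import math

# Toy quadratic loss 0.5 * sum(h_i * w_i^2): one sharp and one flat direction.
# The curvatures H are an illustrative assumption, not taken from the paper.
H = (10.0, 0.1)

def toy_loss(w):
    return 0.5 * sum(h * x * x for h, x in zip(H, w))

def toy_grad(w):
    return [h * x for h, x in zip(H, w)]

def preconditioned_sam_step(w, v, step, grad_fn, lr=0.05, rho=0.05,
                            beta=0.99, update_every=10, eps=1e-8):
    """One step of SAM on top of a slowly updated diagonal preconditioner.

    NOTE: the diagonal EMA of squared gradients below is a stand-in for the
    learned LLQR preconditioner, which this sketch does not implement.
    """
    g = grad_fn(w)

    # Slow timescale: refresh the EMA statistics only every `update_every` steps.
    if step % update_every == 0:
        v = [beta * vi + (1.0 - beta) * gi * gi for vi, gi in zip(v, g)]
    p = [1.0 / (math.sqrt(vi) + eps) for vi in v]  # diagonal preconditioner

    # Fast timescale: SAM ascent step taken along the preconditioned gradient.
    pg = [pi * gi for pi, gi in zip(p, g)]
    norm = math.sqrt(sum(x * x for x in pg)) + eps
    w_adv = [wi + rho * x / norm for wi, x in zip(w, pg)]

    # Descent step using the gradient evaluated at the perturbed point.
    g_adv = grad_fn(w_adv)
    w_new = [wi - lr * pi * gi for wi, pi, gi in zip(w, p, g_adv)]
    return w_new, v

def run(num_steps=200):
    w = [1.0, 1.0]
    v = [1.0, 1.0]  # start from an identity-like preconditioner
    for step in range(num_steps):
        w, v = preconditioned_sam_step(w, v, step, toy_grad)
    return w, v

if __name__ == "__main__":
    w_final, _ = run()
    print("initial loss:", toy_loss([1.0, 1.0]))
    print("final loss:", toy_loss(w_final))
```

In this sketch the preconditioner statistics change only once every `update_every` steps and with a large `beta`, while the SAM perturbation is recomputed at every step. That mirrors the slow-geometry, fast-probing split in spirit only; the actual LLQR update and the theoretical escape behaviour are as described in the paper.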
