Navigating Potholes with Geometry-Aware Sharpness Minimization

Simon Dufort-Labbé; Mehrab Hamidi; Razvan Pascanu; Ioannis Mitliagkas; Damien Scieur; Aristide Baratin

arXiv:2605.16134·cs.LG·May 18, 2026

TL;DR

This paper introduces LLQR+SAM, an optimization method that applies sharpness-aware minimization (SAM) on top of a learned, geometry-aware preconditioner from the LLQR framework, so that the SAM perturbation accounts for the geometry of the loss landscape.

Contribution

The paper proposes LLQR+SAM, which integrates the learned LLQR preconditioner with SAM. The preconditioner is updated sparsely and kept as a slow exponential moving average, while the SAM perturbation probes curvature on a faster timescale.

Findings

01

LLQR+SAM outperforms both SAM alone and LLQR alone on vision and sequence benchmarks.

02

The method effectively identifies and escapes sharp potholes in the loss landscape.

03

The two-timescale approach enhances optimization stability and convergence.

Abstract

Sharpness-aware minimization (SAM) encourages flat minima by perturbing parameters along directions of high loss curvature, but treats all parameter directions uniformly, ignoring the underlying loss geometry. We introduce LLQR+SAM, which combines SAM with a learned preconditioner obtained from the recently proposed LLQR framework, a second-order method that recasts steepest descent as a layerwise linear-quadratic regulator problem. The preconditioner is updated sparsely and maintained as a slow exponential moving average, so it captures a smoothed, low-resolution picture of the loss landscape geometry. The SAM perturbation then operates on top of this learned geometry, probing curvature at a faster timescale. We show that this two-timescale structure is not merely a computational convenience: theoretically, the preconditioner amplifies the SAM escape signal in directions that are flat…
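
The two-timescale idea described in the abstract can be illustrated with a minimal sketch on a toy quadratic. This is not the authors' algorithm: the abstract does not specify the form of the LLQR preconditioner, so the code substitutes a diagonal preconditioner built from an exponential moving average of squared gradients, and the way the preconditioner shapes the SAM perturbation is likewise an assumption. The names `precond_sam`, `precond_every`, `beta`, and `rho` are hypothetical and chosen only for this illustration.

```python
import math

def grad(w, curv):
    """Gradient of the toy loss 0.5 * sum(c_i * w_i**2)."""
    return [c * x for c, x in zip(curv, w)]

def loss(w, curv):
    """Toy quadratic loss with per-coordinate curvature curv."""
    return 0.5 * sum(c * x * x for c, x in zip(curv, w))

def precond_sam(w, curv, steps=200, lr=0.05, rho=0.05,
                beta=0.99, precond_every=10, eps=1e-8):
    """Preconditioned SAM on a toy quadratic (illustrative stand-in only)."""
    w = list(w)
    # Slow state: EMA of squared gradients (stand-in for a learned
    # preconditioner; NOT the LLQR construction).
    v = [g * g for g in grad(w, curv)]
    for t in range(steps):
        g = grad(w, curv)
        # Slow timescale: refresh the preconditioner only occasionally.
        if t % precond_every == 0:
            v = [beta * vi + (1 - beta) * gi * gi for vi, gi in zip(v, g)]
        p = [1.0 / (math.sqrt(vi) + eps) for vi in v]
        # Fast timescale: SAM perturbation along the preconditioned
        # gradient, rescaled to length rho (assumed form).
        d = [pi * gi for pi, gi in zip(p, g)]
        norm = math.sqrt(sum(x * x for x in d)) + eps
        w_adv = [x + rho * di / norm for x, di in zip(w, d)]
        # Descend using the preconditioned gradient at the perturbed point.
        g_adv = grad(w_adv, curv)
        w = [x - lr * pi * gi for x, pi, gi in zip(w, p, g_adv)]
    return w

curv = [10.0, 0.1]  # one sharp and one flat direction
w_final = precond_sam([1.0, 1.0], curv)
print(loss([1.0, 1.0], curv), "->", loss(w_final, curv))
```

In this toy setting the preconditioner roughly equalizes the effective step size across the sharp and flat coordinates, while the perturbation is recomputed at every step. Because the perturbation has fixed length `rho`, the iterates settle into a small neighborhood of the minimum rather than converging exactly.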
