A Scalable Measure of Loss Landscape Curvature for Analyzing the Training Dynamics of LLMs
Dayal Singh Kalra, Jean-Christophe Gagnon-Audet, Andrey Gromov, Ishita Mediratta, Kelvin Niu, Alexander H Miller, Michael Shvartsman

TL;DR
This paper introduces a scalable, efficient measure called critical sharpness to analyze the loss landscape curvature of large language models, revealing key training phenomena and guiding data strategies at scale.
Contribution
The authors propose a novel, computationally efficient curvature measure for large models, enabling large-scale analysis of training dynamics and landscape phenomena.
Findings
Critical sharpness captures known Hessian sharpness phenomena, including progressive sharpening and Edge of Stability.
These phenomena are demonstrated on models of up to 7B parameters.
Curvature analysis with critical sharpness guides data mixing strategies.
Abstract
Understanding the curvature evolution of the loss landscape is fundamental to analyzing the training dynamics of neural networks. The most commonly studied measure, Hessian sharpness (the largest eigenvalue of the loss Hessian), determines local training stability and interacts with the learning rate throughout training. Despite its significance in analyzing training dynamics, direct measurement of Hessian sharpness remains prohibitive for Large Language Models (LLMs) due to high computational cost. We analyze critical sharpness, a computationally efficient measure requiring only a few forward passes given the update direction. Critically, this measure captures well-documented Hessian sharpness phenomena, including progressive sharpening and Edge of Stability. Using this measure, we provide the first…
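To make the two quantities in the abstract concrete, here is a minimal Python sketch on a toy quadratic loss. The abstract does not spell out how critical sharpness is defined, so the sketch assumes one plausible reading: find the smallest step size at which the loss along the update direction stops decreasing, and report 2 divided by that step size. The function names, the toy Hessian `A`, and the doubling-plus-bisection search are all illustrative assumptions, not the authors' implementation, and the search is not tuned to minimize the number of forward passes.

```python
import numpy as np


def hessian_sharpness(hvp, dim, iters=200, seed=0):
    """Estimate the largest Hessian eigenvalue by power iteration.

    `hvp(v)` must return the Hessian-vector product H @ v. Power iteration
    finds the eigenvalue of largest magnitude, which equals the largest
    eigenvalue when the Hessian is positive semi-definite (assumed here).
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)
    eig = 0.0
    for _ in range(iters):
        hv = hvp(v)
        eig = float(v @ hv)  # Rayleigh quotient for the current unit vector
        norm = np.linalg.norm(hv)
        if norm == 0.0:
            break
        v = hv / norm
    return eig


def critical_sharpness(loss, theta, direction, eta0=1e-3,
                       max_doublings=60, bisections=30):
    """Forward-pass-only curvature estimate along `direction` (assumed definition).

    Searches for the smallest step eta with loss(theta - eta * direction)
    greater than loss(theta), then returns 2 / eta.
    """
    base = loss(theta)
    lo, hi = 0.0, eta0
    for _ in range(max_doublings):
        if loss(theta - hi * direction) > base:
            break
        lo, hi = hi, 2.0 * hi
    else:
        return 0.0  # loss never increased: no measurable curvature found
    for _ in range(bisections):
        mid = 0.5 * (lo + hi)
        if loss(theta - mid * direction) > base:
            hi = mid
        else:
            lo = mid
    return 2.0 / hi


# Toy quadratic loss L(theta) = 0.5 * theta^T A theta (hypothetical example).
A = np.diag([1.0, 4.0, 10.0])
theta = np.ones(3)


def loss(t):
    return 0.5 * float(t @ A @ t)


grad = A @ theta  # exact gradient, used as the update direction
lam_max = hessian_sharpness(lambda v: A @ v, dim=3)
lam_c = critical_sharpness(loss, theta, grad)
```

On a quadratic loss with the gradient as the update direction, this estimate reduces to the directional curvature gᵀHg / gᵀg, which can never exceed the largest Hessian eigenvalue. That is why `lam_c` stays below `lam_max` here. The power iteration needs Hessian-vector products, while the second routine only evaluates the loss.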
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Generative Adversarial Networks and Image Synthesis · Machine Learning in Materials Science
