SASSHA: Sharpness-aware Adaptive Second-order Optimization with Stable Hessian Approximation

Dahun Shin; Dongyeop Lee; Jinseok Chung; Namhoon Lee

arXiv:2502.18153 · cs.LG · June 25, 2025

TL;DR

SASSHA is a second-order optimization method that aims to improve generalization in deep learning by explicitly reducing the sharpness of the solution and stabilizing the computation of approximate Hessians. In the authors' experiments, its generalization performance is comparable to, and mostly better than, that of other methods.

Contribution

This work introduces SASSHA, a second-order optimizer designed to enhance generalization by explicitly reducing sharpness and stabilizing approximate Hessian computation along the optimization trajectory, while accommodating lazy Hessian updates for efficiency.

Findings

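1. SASSHA's generalization performance is comparable to, and mostly better than, other methods across a wide range of standard deep learning experiments.

2. SASSHA demonstrates comparable or superior convergence and robustness.

3. The sharpness minimization scheme accommodates lazy Hessian updates, which keeps SASSHA efficient.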

Abstract

Approximate second-order optimization methods often exhibit poorer generalization compared to first-order approaches. In this work, we look into this issue through the lens of the loss landscape and find that existing second-order methods tend to converge to sharper minima compared to SGD. In response, we propose Sassha, a novel second-order method designed to enhance generalization by explicitly reducing sharpness of the solution, while stabilizing the computation of approximate Hessians along the optimization trajectory. In fact, this sharpness minimization scheme is crafted also to accommodate lazy Hessian updates, so as to secure efficiency besides flatness. To validate its effectiveness, we conduct a wide range of standard deep learning experiments where Sassha demonstrates its outstanding generalization performance that is comparable to, and mostly better than, other methods. We…
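The three ingredients named in the abstract (a sharpness-aware step, a stabilized approximate Hessian, and lazy Hessian updates) can be illustrated on a toy problem. The following is a minimal sketch under stated assumptions, not the authors' algorithm: the quadratic loss, the SAM-style perturbation radius `rho`, the Hutchinson diagonal estimator, the absolute-value-plus-`eps` stabilization, and every function and parameter name here are illustrative choices that do not come from the paper or its repository.

```python
import math
import random


def grad(w, A):
    # Gradient of the toy loss f(w) = 0.5 * w^T A w, i.e. A @ w.
    n = len(w)
    return [sum(A[i][j] * w[j] for j in range(n)) for i in range(n)]


def loss(w, A):
    # f(w) = 0.5 * w^T A w, computed via the gradient A @ w.
    return 0.5 * sum(wi * gi for wi, gi in zip(w, grad(w, A)))


def hutchinson_diag(A, rng, samples=1):
    # Hutchinson-style estimate of diag(H) as the average of z * (H z)
    # over Rademacher vectors z. For this toy loss H = A everywhere.
    n = len(A)
    est = [0.0] * n
    for _ in range(samples):
        z = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        Hz = [sum(A[i][j] * z[j] for j in range(n)) for i in range(n)]
        for i in range(n):
            est[i] += z[i] * Hz[i] / samples
    return est


def sharpness_aware_second_order_demo(w0, A, steps=200, lr=0.1, rho=0.05,
                                      hessian_every=10, eps=1e-8, seed=0):
    # Hypothetical illustration only; all hyperparameters are assumptions.
    rng = random.Random(seed)
    w = list(w0)
    d = None
    for t in range(steps):
        g = grad(w, A)
        norm = math.sqrt(sum(x * x for x in g))
        if norm == 0.0:
            break
        # Sharpness-aware part: evaluate the gradient at a perturbed point
        # w + rho * g / ||g|| (a SAM-style ascent step).
        w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
        g_adv = grad(w_adv, A)
        # Lazy Hessian update: refresh the diagonal estimate only every
        # `hessian_every` steps and reuse it in between.
        if t % hessian_every == 0:
            # Assumed stabilization: absolute value plus a small damping term
            # so the preconditioner stays positive.
            d = [abs(h) + eps for h in hutchinson_diag(A, rng)]
        # Preconditioned descent step using the perturbed gradient.
        w = [wi - lr * gi / di for wi, gi, di in zip(w, g_adv, d)]
    return w
```

Calling `sharpness_aware_second_order_demo([1.0, 1.0], [[4.0, 0.0], [0.0, 1.0]])` drives the toy loss close to zero. Because the Hessian of a quadratic is constant, this example only shows the control flow of lazy updates, not their effect on a real network, where the curvature changes along the trajectory.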

Code & Models

Repositories

LOG-postech/Sassha (PyTorch, official)

Videos

Sassha: Sharpness-aware Adaptive Second-order Optimization with Stable Hessian Approximation · SlidesLive

Taxonomy

Topics: Advanced Adaptive Filtering Techniques · Stochastic Gradient Optimization Techniques · Advanced Bandit Algorithms Research

Methods: Sparse Evolutionary Training · Stochastic Gradient Descent