Sharpness-Aware Minimization with Z-Score Gradient Filtering

Vincent-Daniel Yun

arXiv:2505.02369 · cs.LG · April 24, 2026

TL;DR

This paper introduces Z-Score Filtered Sharpness-Aware Minimization, a variant of SAM that filters each layer's gradient by Z-score so that the ascent step uses only the components that stand out most from the layer average. The paper reports improved generalization and test accuracy.

Contribution

It proposes a novel Z-score based gradient filtering method for SAM, improving its ability to find flatter minima and enhance generalization performance.

Findings

01 Consistently improves test accuracy across datasets and architectures.

02 Reduces the influence of noisy or small gradient components.

03 Effective across neural network models such as ResNet, VGG, and Vision Transformers.

Abstract

Deep neural networks achieve high performance across many domains but can still face challenges in generalization when optimization is influenced by small or noisy gradient components. Sharpness-Aware Minimization improves generalization by perturbing parameters toward directions of high curvature, but it uses the entire gradient vector, which means that small or noisy components may affect the ascent step and cause the optimizer to miss optimal solutions. We propose Z-Score Filtered Sharpness-Aware Minimization, which applies Z-score based filtering to gradients in each layer. Instead of using all gradient components, a mask is constructed to retain only the top percentile with the largest absolute Z-scores. The percentile threshold $Q_{p}$ determines how many components are kept, so that the ascent step focuses on directions that stand out most compared to the average of the layer. This…
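The filtering step described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the author's implementation: the function names `zscore_filter` and `sam_perturbation`, the default values of `percentile` and `rho`, and the reading of $Q_{p}$ as "keep components whose absolute Z-score is at or above the p-th percentile within the layer" are all hypothetical choices made for this example. The ascent step uses the standard SAM form, a perturbation of radius `rho` along the normalized (here, filtered) gradient.

```python
import numpy as np


def zscore_filter(grad, percentile=95.0, eps=1e-12):
    """Keep only the components of one layer's gradient with the largest |Z-score|.

    Assumption: `percentile` plays the role of Q_p, so roughly the top
    (100 - percentile)% of components by absolute Z-score are retained.
    """
    g = np.asarray(grad, dtype=np.float64)
    # Standardize against the layer's own mean and standard deviation.
    z = (g - g.mean()) / (g.std() + eps)
    abs_z = np.abs(z)
    # Percentile threshold on |Z| within this layer.
    threshold = np.percentile(abs_z, percentile)
    mask = abs_z >= threshold
    # Masked-out components are set to zero.
    return g * mask, mask


def sam_perturbation(layer_grads, rho=0.05, percentile=95.0, eps=1e-12):
    """Build a SAM-style ascent perturbation from per-layer filtered gradients."""
    filtered = [zscore_filter(g, percentile, eps)[0] for g in layer_grads]
    # Global L2 norm over all layers, as in the usual SAM normalization.
    total_norm = np.sqrt(sum(float(np.sum(f ** 2)) for f in filtered))
    return [rho * f / (total_norm + eps) for f in filtered]
```

In a training loop, the returned perturbation would be added to the weights before the second forward and backward pass and removed again before the optimizer update, as in ordinary SAM. Filtering per layer rather than over the whole parameter vector keeps a layer with uniformly small gradients from being masked out entirely.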

Code & Models

Repositories

YUNBLAK/Sharpness-Aware-Minimization-with-Z-Score-Gradient-Filtering (GitHub)
