Understanding SAM's Robustness to Noisy Labels through Gradient Down-weighting
Hoang-Chau Luong, Quang-Thuc Nguyen, Dat Ba Tran, and Minh-Triet Tran

TL;DR
This paper analyzes how SAM's element-wise gradient behavior, in which clean gradients are amplified more strongly than noisy ones, contributes to robustness against noisy labels. It also introduces SANER, a SAM variant that strengthens this effect and improves generalization under label noise.
Contribution
The paper provides a new element-wise explanation for SAM's robustness to label noise and proposes SANER, a simple gradient reweighting method that further reduces memorization of noisy labels.
Findings
SANER significantly reduces noisy-label memorization.
SANER improves generalization over SAM and SGD on noisy datasets.
SANER can be integrated into other SAM-like methods for enhanced robustness.
Abstract
Sharpness-Aware Minimization (SAM) was introduced to improve generalization by seeking flat minima, yet it also exhibits robustness to label noise, a phenomenon that remains only partially understood. Prior work has mainly attributed this effect to SAM's tendency to prolong the learning of clean samples. In this work, we provide a complementary explanation by analyzing SAM at the element-wise level. We show that when noisy gradients dominate a parameter direction, their influence is reduced by the stronger amplification of clean gradients. This slows the memorization of noisy labels while sustaining clean learning, offering a more complete account of SAM's robustness. Building on this insight, we propose SANER (Sharpness-Aware Noise-Explicit Reweighting), a simple variant of SAM that explicitly magnifies this down-weighting effect. Experiments on benchmark image classification tasks…
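The element-wise view described in the abstract can be illustrated with a small NumPy sketch. It computes the plain gradient and the SAM gradient (the gradient taken after an ascent step of radius rho) for a toy logistic-regression problem, then compares them component by component. The reweighting rule in `reweight_downweighted`, the factor `alpha`, the toy data, and all function names are assumptions made for illustration; the abstract does not state SANER's exact update, so this should be read as one plausible reading rather than the authors' method.

```python
import numpy as np

def logistic_grad(w, X, y):
    """Gradient of the mean logistic loss; y holds labels in {0, 1}."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return X.T @ (p - y) / len(y)

def sam_gradients(w, X, y, rho=0.05):
    """Return (plain gradient at w, SAM gradient at the perturbed point)."""
    g = logistic_grad(w, X, y)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # ascent step of radius rho
    return g, logistic_grad(w + eps, X, y)

def reweight_downweighted(g, g_sam, alpha=0.5):
    """Hypothetical rule (not from the source): scale by alpha the
    components whose magnitude SAM already reduces relative to g."""
    shrunk = np.abs(g_sam) < np.abs(g)
    return np.where(shrunk, alpha * g_sam, g_sam)

# Toy data: 32 samples, 4 features, with a few labels flipped as noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))
y = (X[:, 0] > 0).astype(float)
y[:4] = 1.0 - y[:4]

w = np.zeros(4)
g, g_sam = sam_gradients(w, X, y)
w_next = w - 0.1 * reweight_downweighted(g, g_sam)  # one descent step
```

With `alpha=1.0` the sketch reduces to an ordinary SAM step, so `alpha` controls only how strongly the already down-weighted components are suppressed further.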
