Forget Sharpness: Perturbed Forgetting of Model Biases Within SAM Dynamics

Ankit Vani; Frederick Tung; Gabriel L. Oliveira; Hossein Sharifi-Noghabi

arXiv:2406.06700·cs.LG·June 12, 2024

TL;DR

This paper introduces a new perspective on SAM's training dynamics, proposing that perturbed forgetting of model biases explains its generalization benefits better than sharpness minimization.

Contribution

The paper presents a novel view of SAM's mechanism that links perturbed forgetting to improved generalization, and it proposes perturbations targeting output bias that outperform standard SAM and its variants.

Findings

01

Perturbed forgetting correlates more strongly with generalization than sharpness.

02

Output bias targeting perturbations outperform standard SAM and variants on benchmarks.

03

SAM benefits can be explained without relying on loss surface flatness.

Abstract

Despite attaining high empirical generalization, the sharpness of models trained with sharpness-aware minimization (SAM) does not always correlate with generalization error. Instead of viewing SAM as minimizing sharpness to improve generalization, our paper considers a new perspective based on SAM's training dynamics. We propose that perturbations in SAM perform perturbed forgetting, where they discard undesirable model biases to exhibit learning signals that generalize better. We relate our notion of forgetting to the information bottleneck principle, use it to explain observations like the better generalization of smaller perturbation batches, and show that perturbed forgetting can exhibit a stronger correlation with generalization than flatness. While standard SAM targets model biases exposed by the steepest ascent directions, we propose a new perturbation that targets biases exposed…
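
For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of one standard SAM update: perturb the weights along the normalized steepest ascent direction, then descend using the gradient measured at the perturbed point. It is not the authors' implementation and does not reproduce the perturbation proposed in the paper. The toy quadratic loss, the names `CURVATURE`, `loss`, `grad`, and `sam_step`, and the values of `lr` and `rho` are all assumptions chosen for illustration.

```python
import math

# Hypothetical toy loss for illustration: L(w) = 0.5 * sum(c_i * w_i**2)
CURVATURE = [10.0, 1.0]


def loss(w):
    """Toy quadratic loss (an assumption, not from the paper)."""
    return 0.5 * sum(c * x * x for c, x in zip(CURVATURE, w))


def grad(w):
    """Analytic gradient of the toy loss."""
    return [c * x for c, x in zip(CURVATURE, w)]


def sam_step(w, lr=0.05, rho=0.05):
    """One standard SAM update on the toy loss.

    1. Compute the gradient at the current weights.
    2. Perturb the weights by rho along the normalized gradient
       (the steepest ascent direction).
    3. Update the ORIGINAL weights with the gradient taken at the
       perturbed weights.
    """
    g = grad(w)
    norm = math.sqrt(sum(x * x for x in g)) + 1e-12  # avoid division by zero
    eps = [rho * x / norm for x in g]
    w_perturbed = [x + e for x, e in zip(w, eps)]
    g_perturbed = grad(w_perturbed)
    return [x - lr * gp for x, gp in zip(w, g_perturbed)]


w = [1.0, 1.0]
for _ in range(100):
    w = sam_step(w)
```

In this sketch the only difference from plain gradient descent is where the gradient is evaluated. The paper's argument concerns what that perturbation does to the learning signal, which a toy quadratic cannot demonstrate; the code only fixes the mechanics of the update.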

Code & Models

Repositories

borealisai/perturbed-forgetting
PyTorch · Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Simulation Techniques and Applications · Reservoir Engineering and Simulation Methods

Methods: Sharpness-Aware Minimization