Sharpness-Aware Minimization Leads to Low-Rank Features

Maksym Andriushchenko; Dara Bahri; Hossein Mobahi; Nicolas Flammarion

arXiv:2305.16292 · cs.LG · October 31, 2023 · 1 citation

TL;DR

This paper reveals that Sharpness-Aware Minimization (SAM) not only improves generalization but also reduces feature rank across various neural network architectures and tasks, with a mechanistic explanation provided for simple models.

Contribution

The paper shows that SAM reduces the rank of learned features across diverse models and tasks, and offers a mechanistic explanation of this effect.

Findings

01. SAM reduces feature rank in neural networks.

02. Low-rank effect occurs broadly across architectures and objectives.

03. Activation pruning by SAM contributes to rank reduction.
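To make "feature rank" and "activation pruning" concrete, here is a minimal NumPy sketch of how one might measure both on a matrix of intermediate features (rows are examples, columns are units). It assumes feature rank is taken as the number of principal components needed to explain 99% of the feature variance; the 99% threshold, the zero tolerance, and the names `feature_rank` and `inactive_fraction` are illustrative assumptions, not details stated on this page.

```python
import numpy as np


def feature_rank(features, variance_threshold=0.99):
    """Number of principal components needed to reach the variance threshold.

    Assumption: this PCA-based definition and the 0.99 default are
    illustrative choices, not taken from this page.
    """
    # Center each feature dimension before computing the spectrum.
    centered = features - features.mean(axis=0, keepdims=True)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    total = variances.sum()
    if total == 0:
        return 0  # constant features carry no variance
    cumulative = np.cumsum(variances) / total
    # Index of the first component at which the threshold is reached, plus one.
    index = np.searchsorted(cumulative, variance_threshold - 1e-12)
    return int(min(index + 1, len(cumulative)))


def inactive_fraction(activations, tol=0.0):
    """Fraction of units (columns) that never exceed tol on any example.

    For ReLU activations, such units are effectively pruned.
    """
    never_active = np.all(activations <= tol, axis=0)
    return float(np.mean(never_active))
```

Applied to the post-activation features of the same layer from a network trained with SAM and one trained without it, a lower `feature_rank` and a higher `inactive_fraction` for the SAM network would match the findings listed above.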

Abstract

Sharpness-aware minimization (SAM) is a recently proposed method that minimizes the sharpness of the training loss of a neural network. While its generalization improvement is well-known and is the primary motivation, we uncover an additional intriguing effect of SAM: reduction of the feature rank, which happens at different layers of a neural network. We show that this low-rank effect occurs very broadly: for different architectures such as fully-connected networks, convolutional networks, and vision transformers, and for different objectives such as regression, classification, and language-image contrastive training. To better understand this phenomenon, we provide a mechanistic understanding of how low-rank features arise in a simple two-layer network. We observe that a significant number of activations get entirely pruned by SAM, which directly contributes to the rank reduction. We confirm…
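For readers unfamiliar with SAM, the following is a minimal sketch of one SAM update as the method is commonly described: take a normalized gradient ascent step of radius `rho` to an approximate worst-case nearby point, then apply the gradient computed there to the original weights. The toy least-squares loss, the function names, and the values of `lr` and `rho` are assumptions chosen for illustration, not the paper's experimental setup.

```python
import numpy as np


def loss_and_grad(w, X, y):
    """Mean squared error of a linear model and its gradient (toy objective)."""
    residual = X @ w - y
    loss = 0.5 * np.mean(residual ** 2)
    grad = X.T @ residual / len(y)
    return loss, grad


def sam_step(w, X, y, lr=0.1, rho=0.05):
    """One SAM update; lr and rho are illustrative hyperparameters."""
    _, grad = loss_and_grad(w, X, y)
    # Ascent step: move to the approximate worst-case point in an L2 ball
    # of radius rho around the current weights.
    perturbation = rho * grad / (np.linalg.norm(grad) + 1e-12)
    # Descent step: use the gradient evaluated at the perturbed weights,
    # but apply it to the original weights.
    _, grad_at_perturbed = loss_and_grad(w + perturbation, X, y)
    return w - lr * grad_at_perturbed
```

Setting `rho=0` makes the perturbation vanish, so the update reduces to plain gradient descent; this is a convenient baseline when comparing feature rank with and without SAM.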

Videos

Sharpness-Aware Minimization Leads to Low-Rank Features · SlidesLive

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and ELM