The Asymmetric Maximum Margin Bias of Quasi-Homogeneous Neural Networks
Daniel Kunin, Atsushi Yamamura, Chao Ma, Surya Ganguli

TL;DR
This paper analyzes the maximum-margin bias of quasi-homogeneous neural networks trained with gradient flow. It shows that gradient flow implicitly favors a subset of the parameters, which can degrade robustness, and it uses the same analysis to explain Neural Collapse.
Contribution
The paper introduces the class of quasi-homogeneous models, generalizes the existing maximum-margin bias results for homogeneous networks to this class, and uncovers a universal mechanism behind Neural Collapse.
Findings
Gradient flow favors a subset of parameters, leading to asymmetric norm minimization.
This asymmetric norm minimization can degrade model robustness.
The analysis explains the Neural Collapse phenomenon in deep networks with normalization.
Abstract
In this work, we explore the maximum-margin bias of quasi-homogeneous neural networks trained with gradient flow on an exponential loss and past a point of separability. We introduce the class of quasi-homogeneous models, which is expressive enough to describe nearly all neural networks with homogeneous activations, even those with biases, residual connections, and normalization layers, while structured enough to enable geometric analysis of its gradient dynamics. Using this analysis, we generalize the existing results of maximum-margin bias for homogeneous networks to this richer class of models. We find that gradient flow implicitly favors a subset of the parameters, unlike in the case of a homogeneous model where all parameters are treated equally. We demonstrate through simple examples how this strong favoritism toward minimizing an asymmetric norm can degrade the robustness of…
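To make the notion of quasi-homogeneity concrete, here is a minimal sketch assuming the standard definition of a quasi-homogeneous function: scaling each parameter θ_i by e^{αλ_i} scales the output by e^{αd}. The two-layer scalar ReLU network, the parameter values, and the names `f`, `rescale`, `exponents`, and `degree` are all hypothetical choices for illustration and do not come from the paper.

```python
import math

def relu(z):
    return max(z, 0.0)

def f(x, w1, b1, w2, b2):
    """Hypothetical two-layer scalar ReLU network with a bias in each layer."""
    return w2 * relu(w1 * x + b1) + b2

def rescale(params, exponents, alpha):
    """Scale each parameter theta_i by exp(alpha * lambda_i)."""
    return tuple(p * math.exp(alpha * lam) for p, lam in zip(params, exponents))

# Illustrative values only (assumptions, not taken from the paper).
params = (0.7, -0.2, 1.5, 0.3)      # (w1, b1, w2, b2)
exponents = (1.0, 1.0, 1.0, 2.0)    # the output bias b2 scales twice as fast
degree = 2.0
alpha = 0.4
x = 1.3

base = f(x, *params)

# Non-uniform scaling: the output scales exactly by exp(alpha * degree).
scaled = f(x, *rescale(params, exponents, alpha))
quasi_ok = math.isclose(scaled, math.exp(alpha * degree) * base, rel_tol=1e-12)

# Uniform scaling of all parameters does not reproduce the same identity,
# because the output bias b2 only picks up a factor exp(alpha).
uniform = f(x, *rescale(params, (1.0, 1.0, 1.0, 1.0), alpha))
uniform_ok = math.isclose(uniform, math.exp(alpha * degree) * base, rel_tol=1e-12)

print(quasi_ok, uniform_ok)
```

In this toy model the biases prevent a single uniform rescaling from acting cleanly on the output, yet a rescaling with parameter-specific exponents still does. That is the kind of structure the abstract refers to when it says the class covers networks with biases.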
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Model Reduction and Neural Networks · Machine Learning in Materials Science
