The Empirical Impact of Neural Parameter Symmetries, or Lack Thereof

Derek Lim; Theo Moe Putterman; Robin Walters; Haggai Maron; Stefanie Jegelka

arXiv:2405.20231·cs.LG·October 16, 2024

TL;DR

This paper empirically studies how reducing neural parameter symmetries affects deep learning phenomena. Using new architectures with fewer symmetries, it reports linear mode connectivity without weight space alignment and faster Bayesian neural network training.

Contribution

Introduces two methods, with some provable guarantees, for modifying standard neural networks to reduce parameter space symmetries, and runs a comprehensive experimental study of how the reduced symmetries affect deep learning behavior.

Findings

01. Linear mode connectivity without weight space alignment

02. Faster Bayesian neural network training

03. Reduced parameter symmetries influence optimization landscapes

Abstract

Many algorithms and observed phenomena in deep learning appear to be affected by parameter symmetries -- transformations of neural network parameters that do not change the underlying neural network function. These include linear mode connectivity, model merging, Bayesian neural network inference, metanetworks, and several other characteristics of optimization or loss-landscapes. However, theoretical analysis of the relationship between parameter space symmetries and these phenomena is difficult. In this work, we empirically investigate the impact of neural parameter symmetries by introducing new neural network architectures that have reduced parameter space symmetries. We develop two methods, with some provable guarantees, of modifying standard neural networks to reduce parameter space symmetries. With these new methods, we conduct a comprehensive experimental study consisting of…
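The abstract's definition of a parameter symmetry can be made concrete with the best-known example, permutation of hidden units. The sketch below is illustrative only and is not taken from the paper or its repository: the two-layer tanh MLP, the helper names `mlp` and `permute_hidden`, and all weight values are assumptions chosen for the demonstration. It reorders the hidden units of a small network and checks that the parameters change while the computed function does not.

```python
import math

def mlp(x, W1, b1, W2, b2):
    """Two-layer MLP: W2 @ tanh(W1 @ x + b1) + b2, using plain lists."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(W2, b2)]

def permute_hidden(W1, b1, W2, perm):
    """Reorder hidden units: rows of W1, entries of b1, columns of W2."""
    W1_p = [W1[i] for i in perm]
    b1_p = [b1[i] for i in perm]
    W2_p = [[row[i] for i in perm] for row in W2]
    return W1_p, b1_p, W2_p

# Hypothetical weights: 2 inputs, 3 hidden units, 1 output.
W1 = [[0.5, -1.0], [1.5, 0.25], [-0.75, 2.0]]
b1 = [0.1, -0.2, 0.3]
W2 = [[1.0, -2.0, 0.5]]
b2 = [0.05]

perm = [2, 0, 1]  # new hidden unit k is old hidden unit perm[k]
W1_p, b1_p, W2_p = permute_hidden(W1, b1, W2, perm)

x = [0.3, -0.7]
y = mlp(x, W1, b1, W2, b2)
y_p = mlp(x, W1_p, b1_p, W2_p, b2)

# The parameter values differ, yet the network function is unchanged
# (up to floating-point summation order).
params_differ = W1 != W1_p
outputs_match = all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(y, y_p))
print(params_differ, outputs_match)
```

Because every such reordering yields a distinct point in parameter space with identical outputs, two independently trained networks can represent the same function at very different parameter values, which is one reason weight space alignment is commonly applied before comparing or merging standard networks.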

Code & Models

Repositories

cptq/asymmetric-networks (PyTorch, official)

Taxonomy

Topics: Neural Networks and Applications