Mode Combinability: Exploring Convex Combinations of Permutation Aligned Models
Adrián Csiszárik, Melinda F. Kiss, Péter Kőrösi-Szabó, Márton Muntag, Gergely Papp, Dániel Varga

TL;DR
This paper investigates the concept of mode combinability in neural networks, showing that convex combinations of permutation-aligned models often result in low-loss models, extending the idea of linear mode connectivity.
Contribution
It introduces the notion of mode combinability, demonstrating that convex combinations of permutation-aligned models can maintain low loss and revealing properties like transitivity and robustness.
Findings
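Broad regions of the hypercube of element-wise convex combinations form low-loss surfaces.
Mode combinability generalizes linear mode connectivity.
Combinations remain working models even when the neuron matchings are significantly perturbed, and the resulting models still differ functionally.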
Abstract
We explore element-wise convex combinations of two permutation-aligned neural network parameter vectors $\Theta_A$ and $\Theta_B$ of size $d$. We conduct extensive experiments by examining various distributions of such model combinations parametrized by elements of the hypercube $[0,1]^{d}$ and its vicinity. Our findings reveal that broad regions of the hypercube form surfaces of low loss values, indicating that the notion of linear mode connectivity extends to a more general phenomenon which we call mode combinability. We also make several novel observations regarding linear mode connectivity and model re-basin. We demonstrate a transitivity property: two models re-based to a common third model are also linear mode connected, and a robustness property: even with significant perturbations of the neuron matchings the resulting combinations continue to form a working model. Moreover, we…
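The element-wise combination described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function name `combine`, the random stand-in vectors, and the convention that a coefficient of 1 selects model A are all assumptions made here. The two parameter vectors are assumed to be flattened and already permutation-aligned, and the sketch covers only the hypercube itself, not its vicinity.

```python
import numpy as np


def combine(theta_a, theta_b, lam):
    """Element-wise convex combination lam * theta_a + (1 - lam) * theta_b.

    `lam` is a scalar or a vector of per-parameter coefficients in [0, 1].
    The convention that lam = 1 selects model A is an assumption of this sketch.
    """
    theta_a = np.asarray(theta_a, dtype=float)
    theta_b = np.asarray(theta_b, dtype=float)
    if theta_a.shape != theta_b.shape:
        raise ValueError("parameter vectors must have the same shape")
    lam = np.broadcast_to(np.asarray(lam, dtype=float), theta_a.shape)
    if np.any(lam < 0.0) or np.any(lam > 1.0):
        raise ValueError("every coefficient must lie in [0, 1]")
    return lam * theta_a + (1.0 - lam) * theta_b


rng = np.random.default_rng(0)
d = 6
theta_a = rng.normal(size=d)  # hypothetical stand-in for model A's parameters
theta_b = rng.normal(size=d)  # hypothetical stand-in for model B, assumed aligned to A

# Linear interpolation: one shared coefficient, i.e. a point on the
# diagonal of the hypercube.
midpoint = combine(theta_a, theta_b, 0.5)

# General case: an independent coefficient for every parameter.
lam = rng.uniform(0.0, 1.0, size=d)
mixed = combine(theta_a, theta_b, lam)

# A vertex of the hypercube: each parameter is copied from A or from B.
vertex = combine(theta_a, theta_b, rng.integers(0, 2, size=d))
```

Linear mode connectivity corresponds to the scalar case, where every parameter shares the same coefficient. Letting each parameter take its own coefficient gives the larger family of combinations that the abstract parametrizes by points of the hypercube.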
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning · Brain Tumor Detection and Classification
