Mode Combinability: Exploring Convex Combinations of Permutation Aligned Models

Adrián Csiszárik; Melinda F. Kiss; Péter Kőrösi-Szabó; Márton Muntag; Gergely Papp; Dániel Varga

arXiv:2308.11511 · cs.LG · August 23, 2023 · 1 citation

TL;DR

This paper investigates the concept of mode combinability in neural networks, showing that convex combinations of permutation-aligned models often result in low-loss models, extending the idea of linear mode connectivity.

Contribution

It introduces the notion of mode combinability, demonstrating that convex combinations of permutation-aligned models can maintain low loss and revealing properties like transitivity and robustness.
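As background for why permutation alignment comes before combining, the sketch below illustrates the well-known permutation symmetry of hidden units: reordering the hidden neurons of a small network changes its parameter vector but not the function it computes. The tiny two-layer ReLU network and the helper names `mlp` and `permute_hidden` are hypothetical, chosen only for illustration; this is not the matching procedure used in the paper.

```python
import random


def mlp(x, w1, b1, w2):
    """Two-layer ReLU network with scalar output: w2 . relu(w1 x + b1)."""
    hidden = [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w1, b1)
    ]
    return sum(v * h for v, h in zip(w2, hidden))


def permute_hidden(w1, b1, w2, perm):
    """Reorder hidden units so that new unit i is old unit perm[i]."""
    return [w1[p] for p in perm], [b1[p] for p in perm], [w2[p] for p in perm]


rng = random.Random(0)
n_in, n_hidden = 3, 4  # illustrative sizes, not taken from the paper
w1 = [[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [rng.gauss(0, 1) for _ in range(n_hidden)]
w2 = [rng.gauss(0, 1) for _ in range(n_hidden)]

perm = [2, 0, 3, 1]
pw1, pb1, pw2 = permute_hidden(w1, b1, w2, perm)

x = [rng.gauss(0, 1) for _ in range(n_in)]
out_original = mlp(x, w1, b1, w2)
out_permuted = mlp(x, pw1, pb1, pw2)

# The two parameter sets differ, yet the outputs agree up to
# floating-point summation order.
print(abs(out_original - out_permuted) < 1e-9)
```

Because two independently trained networks may differ by such a reordering, their raw parameter vectors are not directly comparable coordinate by coordinate, which is the motivation for aligning them first.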

Findings

01. Convex combinations of models often lie on low-loss surfaces.
02. Mode combinability extends linear mode connectivity.
03. Model combinations retain functional differences despite perturbations.

Abstract

We explore element-wise convex combinations of two permutation-aligned neural network parameter vectors $\Theta_A$ and $\Theta_B$ of size $d$. We conduct extensive experiments by examining various distributions of such model combinations parametrized by elements of the hypercube $[0, 1]^{d}$ and its vicinity. Our findings reveal that broad regions of the hypercube form surfaces of low loss values, indicating that the notion of linear mode connectivity extends to a more general phenomenon which we call mode combinability. We also make several novel observations regarding linear mode connectivity and model re-basin. We demonstrate a transitivity property: two models re-based to a common third model are also linear mode connected, and a robustness property: even with significant perturbations of the neuron matchings the resulting combinations continue to form a working model. Moreover, we…
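The element-wise convex combination described in the abstract can be sketched as follows. This is a minimal illustration, assuming the combined vector is $\lambda \odot \Theta_A + (1 - \lambda) \odot \Theta_B$ with $\lambda \in [0, 1]^d$; which model the coefficient weights, the function name `combine`, and the random stand-in parameter vectors are all assumptions made here, not details taken from the paper.

```python
import random


def combine(theta_a, theta_b, lam):
    """Element-wise convex combination: lam_i * a_i + (1 - lam_i) * b_i."""
    if not (len(theta_a) == len(theta_b) == len(lam)):
        raise ValueError("theta_a, theta_b and lam must share the same length d")
    if any(l < 0.0 or l > 1.0 for l in lam):
        raise ValueError("every coefficient must lie in [0, 1]")
    return [l * a + (1.0 - l) * b for a, b, l in zip(theta_a, theta_b, lam)]


rng = random.Random(0)
d = 6  # toy dimension; real parameter vectors are far larger
theta_a = [rng.gauss(0, 1) for _ in range(d)]  # stand-in for aligned model A
theta_b = [rng.gauss(0, 1) for _ in range(d)]  # stand-in for aligned model B

# Linear interpolation is the special case where every coordinate
# shares the same coefficient (a point on the hypercube's diagonal).
midpoint = combine(theta_a, theta_b, [0.5] * d)

# A generic point of the hypercube mixes each coordinate independently.
lam = [rng.random() for _ in range(d)]
mixed = combine(theta_a, theta_b, lam)

# A vertex of the hypercube takes each coordinate wholesale from A or B.
vertex_lam = [float(rng.random() < 0.5) for _ in range(d)]
vertex = combine(theta_a, theta_b, vertex_lam)
```

Under this parametrization, linear mode connectivity concerns only the diagonal of the hypercube, while mode combinability asks about the loss over much broader regions of it.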

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning · Brain Tumor Detection and Classification