When Are Solutions Connected in Deep Networks?

Quynh Nguyen; Pierre Brechet; Marco Mondelli

arXiv:2102.09671·cs.LG·October 22, 2021

TL;DR

This paper investigates the conditions under which solutions in deep networks are connected, improving theoretical understanding by relaxing previous assumptions and demonstrating that feature quality and linear separability suffice for connectivity.

Contribution

The paper introduces milder over-parameterization conditions and feature quality assumptions that guarantee solution connectivity in deep networks, extending prior theoretical results.

Findings

1. Connectivity holds under milder conditions than previous theories.
2. Feature quality at intermediate layers is crucial for solution connectivity.
3. Experimental results confirm theoretical predictions in practical settings.

Abstract

The question of how and why the phenomenon of mode connectivity occurs in training deep neural networks has gained remarkable attention in the research community. From a theoretical perspective, two possible explanations have been proposed: (i) the loss function has connected sublevel sets, and (ii) the solutions found by stochastic gradient descent are dropout stable. While these explanations provide insights into the phenomenon, their assumptions are not always satisfied in practice. In particular, the first approach requires the network to have one layer with order of $N$ neurons ($N$ being the number of training samples), while the second one requires the loss to be almost invariant after removing half of the neurons at each layer (up to some rescaling of the remaining ones). In this work, we improve both conditions by exploiting the quality of the features at every intermediate…
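The dropout-stability condition described in the abstract (the loss stays almost unchanged after removing half of the neurons at each layer, up to a rescaling of the remaining ones) can be illustrated on a toy model. The sketch below is not taken from the paper or its repository. It assumes a single hidden ReLU layer, a fixed choice of which half to keep (the first half), a rescaling factor of 2, and a mean squared error loss. All function names are hypothetical.

```python
def relu(x):
    return x if x > 0.0 else 0.0

def forward(x, W1, w2, keep=None, scale=1.0):
    """Toy two-layer ReLU network: sum_j scale * w2[j] * relu(<W1[j], x>).

    `keep` is the set of hidden-unit indices left active (None keeps all).
    """
    out = 0.0
    for j, row in enumerate(W1):
        if keep is not None and j not in keep:
            continue  # this hidden unit has been removed
        pre = sum(w * xi for w, xi in zip(row, x))
        out += scale * w2[j] * relu(pre)
    return out

def mse(data, W1, w2, keep=None, scale=1.0):
    """Mean squared error over (input, target) pairs."""
    errs = [(forward(x, W1, w2, keep, scale) - y) ** 2 for x, y in data]
    return sum(errs) / len(errs)

def dropout_gap(data, W1, w2):
    """|loss(full net) - loss(first half of hidden units, rescaled by 2)|.

    Assumption: keeping the first half and rescaling by 2 is one simple
    choice of subset and rescaling, picked here only for illustration.
    """
    keep = set(range(len(W1) // 2))
    return abs(mse(data, W1, w2) - mse(data, W1, w2, keep, 2.0))

data = [((1.0, 2.0), 3.0)]

# Redundant network: the second half of the hidden layer duplicates the first.
W1_red = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
w2_red = [0.5, 0.5, 0.5, 0.5]
print(dropout_gap(data, W1_red, w2_red))  # 0.0: loss is invariant

# Non-redundant network: each hidden unit carries distinct information.
W1_min = [[1.0, 0.0], [0.0, 1.0]]
w2_min = [1.0, 1.0]
print(dropout_gap(data, W1_min, w2_min))  # 1.0: loss changes
```

In this toy setting a small gap means the solution tolerates losing half of its hidden units, which is the kind of redundancy the dropout-stability explanation relies on. The second network fits the data equally well but has no such redundancy.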

Code & Models

Repositories

modeconnectivity/modeconnectivity (PyTorch, official)

Videos

When Are Solutions Connected in Deep Networks? · SlidesLive

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Machine Learning and ELM · Domain Adaptation and Few-Shot Learning

Methods: Dropout