When Are Solutions Connected in Deep Networks?
Quynh Nguyen, Pierre Brechet, Marco Mondelli

TL;DR
This paper investigates the conditions under which solutions in deep networks are connected. It relaxes the assumptions of earlier theory and shows that feature quality at intermediate layers, combined with milder over-parameterization or linear separability, suffices for connectivity.
Contribution
It introduces milder over-parameterization conditions and feature quality assumptions that guarantee solution connectivity in deep networks, extending prior theoretical results.
Findings
Connectivity holds under milder conditions than previous theories.
Feature quality at intermediate layers is crucial for solution connectivity.
Experimental results confirm theoretical predictions in practical settings.
Abstract
The question of how and why the phenomenon of mode connectivity occurs in training deep neural networks has gained remarkable attention in the research community. From a theoretical perspective, two possible explanations have been proposed: (i) the loss function has connected sublevel sets, and (ii) the solutions found by stochastic gradient descent are dropout stable. While these explanations provide insights into the phenomenon, their assumptions are not always satisfied in practice. In particular, the first approach requires the network to have one layer with order of N neurons (N being the number of training samples), while the second one requires the loss to be almost invariant after removing half of the neurons at each layer (up to some rescaling of the remaining ones). In this work, we improve both conditions by exploiting the quality of the features at every intermediate…
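The dropout-stability condition mentioned in the abstract can be made concrete with a small numerical check. The following is a minimal sketch under stated assumptions, not the paper's formal definition: it uses a hypothetical one-hidden-layer ReLU network, a mean-squared-error loss, always drops the second half of the hidden neurons, and rescales the surviving outgoing weights by a factor of 2. The function names (`mse_loss`, `drop_half`, `dropout_gap`) are illustrative only.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def mse_loss(W1, W2, X, Y):
    """Mean squared error of the network x -> W2 @ relu(W1 @ x).

    X has shape (d, n) and Y has shape (out, n); columns are samples.
    """
    pred = W2 @ relu(W1 @ X)
    return float(np.mean((pred - Y) ** 2))


def drop_half(W1, W2, scale=2.0):
    """Keep the first half of the hidden neurons (an assumed, fixed choice)
    and rescale the outgoing weights of the surviving neurons."""
    k = W1.shape[0] // 2
    return W1[:k], scale * W2[:, :k]


def dropout_gap(W1, W2, X, Y):
    """Absolute change in loss after removing half of the hidden neurons.

    A small gap means the network is (approximately) dropout stable
    in the simplified sense used in this sketch.
    """
    W1_d, W2_d = drop_half(W1, W2)
    return abs(mse_loss(W1_d, W2_d, X, Y) - mse_loss(W1, W2, X, Y))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 20))   # 20 samples with 5 features each
    Y = rng.normal(size=(1, 20))

    # A network whose hidden neurons come in duplicated pairs:
    # dropping one copy and doubling the other leaves the output unchanged.
    W1_half = rng.normal(size=(8, 5))
    W2_half = rng.normal(size=(1, 8))
    W1 = np.vstack([W1_half, W1_half])
    W2 = np.hstack([W2_half, W2_half])
    print("gap, duplicated neurons:", dropout_gap(W1, W2, X, Y))

    # A generic random network has no such redundancy.
    W1_r = rng.normal(size=(16, 5))
    W2_r = rng.normal(size=(1, 16))
    print("gap, random network:", dropout_gap(W1_r, W2_r, X, Y))
```

The duplicated-neuron network is an extreme case of redundancy in which the gap vanishes up to floating-point error. A trained network would typically fall somewhere between this case and the random one, which is why the abstract notes that the assumption is not always satisfied in practice.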
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Machine Learning and ELM · Domain Adaptation and Few-Shot Learning
Methods: Dropout
