Input Space Mode Connectivity in Deep Neural Networks
Jakub Vrabel, Ori Shem-Ur, Yaron Oz, David Krueger

TL;DR
This paper explores the existence of low-loss paths connecting input images with similar predictions in deep neural networks, extending the concept of mode connectivity from parameter space to input space, with implications for adversarial detection and interpretability.
Contribution
It introduces the concept of input space mode connectivity, providing theoretical and empirical evidence, and demonstrates its applications in adversarial detection and interpretability.
Findings
Different input images with similar predictions are generally connected by low-loss paths in trained models.
Paths between such images are often nearly linear with small deviations.
Input space mode connectivity exists even in untrained models, explained by percolation theory.
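The percolation picture mentioned above can be illustrated with a toy experiment. The sketch below is not the paper's construction: it assumes plain site percolation on a small hypercubic grid, where each site is independently "open" (standing in for a low-loss region) with probability p_open, and asks whether two opposite corners are joined by a path of open sites. The names connected and connection_rate, and all parameter values, are hypothetical and chosen for illustration only.

```python
from collections import deque

import numpy as np


def connected(open_sites, start, goal):
    """Breadth-first search over open sites; neighbours differ by +-1 on one axis."""
    if not (open_sites[start] and open_sites[goal]):
        return False
    seen = np.zeros(open_sites.shape, dtype=bool)
    seen[start] = True
    queue = deque([start])
    while queue:
        site = queue.popleft()
        if site == goal:
            return True
        for axis in range(open_sites.ndim):
            for step in (-1, 1):
                nxt = list(site)
                nxt[axis] += step
                if not 0 <= nxt[axis] < open_sites.shape[axis]:
                    continue  # neighbour falls outside the grid
                nxt = tuple(nxt)
                if open_sites[nxt] and not seen[nxt]:
                    seen[nxt] = True
                    queue.append(nxt)
    return False


def connection_rate(dim, side, p_open, trials, seed=0):
    """Fraction of random grids in which opposite corners are connected."""
    rng = np.random.default_rng(seed)
    start = (0,) * dim
    goal = (side - 1,) * dim
    hits = 0
    for _ in range(trials):
        open_sites = rng.random((side,) * dim) < p_open
        # Assumption: both endpoints are open, mirroring two inputs that
        # already have low loss.
        open_sites[start] = True
        open_sites[goal] = True
        hits += connected(open_sites, start, goal)
    return hits / trials


if __name__ == "__main__":
    # Same open probability, increasing dimension (toy values).
    for dim in (2, 3, 4):
        print(dim, connection_rate(dim, side=6, p_open=0.4, trials=50))
```

Each site on such a grid has 2 × dim neighbours, and the site-percolation threshold of hypercubic lattices is known to decrease as the dimension grows, so a fixed open fraction is more likely to yield a connecting path in higher dimensions. The grid here is far too small to say anything about image-sized inputs; it only makes the geometric intuition concrete.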
Abstract
We extend the concept of loss landscape mode connectivity to the input space of deep neural networks. Mode connectivity was originally studied within parameter space, where it describes the existence of low-loss paths between different solutions (loss minimizers) obtained through gradient descent. We present theoretical and empirical evidence of its presence in the input space of deep networks, thereby highlighting the broader nature of the phenomenon. We observe that different input images with similar predictions are generally connected, and for trained models, the path tends to be simple, with only a small deviation from being a linear path. Our methodology utilizes real, interpolated, and synthetic inputs created using the input optimization technique for feature visualization. We conjecture that input space mode connectivity in high-dimensional spaces is a geometric effect that…
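One way to make "low-loss path" and "deviation from a linear path" concrete is to evaluate the loss along the straight line between two inputs and measure how far it rises above the endpoints. The sketch below is a minimal illustration under stated assumptions, not the authors' procedure: the model is a hypothetical callable mapping an input vector to logits, the loss is softmax cross-entropy against a shared label, and the barrier is defined here as the maximum loss on the path minus the larger endpoint loss (other definitions exist).

```python
import numpy as np


def softmax_cross_entropy(logits, label):
    """Numerically stable cross-entropy of softmax(logits) against one label."""
    shifted = logits - logits.max()
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[label]


def path_losses(model, x_a, x_b, label, num_points=21):
    """Loss at evenly spaced points on the segment from x_a to x_b."""
    ts = np.linspace(0.0, 1.0, num_points)
    return np.array(
        [softmax_cross_entropy(model((1.0 - t) * x_a + t * x_b), label) for t in ts]
    )


def barrier(losses):
    """Assumed definition: peak loss on the path minus the worse endpoint loss."""
    return float(losses.max() - max(losses[0], losses[-1]))


if __name__ == "__main__":
    # Hypothetical stand-in model: a fixed random linear classifier.
    rng = np.random.default_rng(0)
    weights = rng.normal(size=(3, 8))
    toy_model = lambda x: weights @ x

    x_a, x_b = rng.normal(size=8), rng.normal(size=8)
    losses = path_losses(toy_model, x_a, x_b, label=0)
    print("barrier on the straight path:", barrier(losses))
```

For a linear classifier the cross-entropy is convex along any straight segment, so the barrier is zero up to rounding; a deep network can show a positive barrier on the straight path, which is the situation where a slightly bent path (for example one passing through an intermediate point) becomes relevant.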
Peer Reviews
Decision: ICLR 2025 Poster
1. This topic is interesting. Investigating mode connectivity in the input space could help us to shape the decision boundary of DNNs. 2. The insight that mode connectivity is an intrinsic property of high-dimensional geometry is important, as it might explain various phenomena in the field of mode connectivity, such as why wide neural networks more easily satisfy mode connectivity after accounting for permutation invariance. 3. The potential application towards adversar
1. The major issue of this work is that the investigation is not in-depth enough. For example, in Fig. 3, the paths A->B'->C and A->B->C look similar, but they differ significantly in terms of mode connectivity. - How should we quantify such differences? Or why is the small difference B'-B (as shown in the bottom right of Fig. 1) significant in terms of mode connectivity? - As another example, in the adversarial example part, why does a real-adversarial pair show a larger barrier than real-re
1. The role of the input when it comes to loss behaviour is somewhat understudied, and the authors develop new ideas in this direction while keeping things very analogous to the results observed for parameter loss landscapes. 2. The authors give further credibility to their results by mathematically proving them in an idealized setting assuming independence. While this is not realistic, I do find the argument of the authors convincing that correlations in this case will most likely help connectiv
1. The biggest weakness is the lack of motivation presented in this paper for input space connectivity. Why is this an interesting quantity to study? In case of the parameter loss landscape where this notion originated, the motivation was from an optimisation point of view; is SGD attracted to a convex region of the parameter space? Does it find isolated minima or are there entire regions of low loss? There might be good motivations for input space connectivity as well (I’m not an expert in thi
- This phenomenon is interesting and potentially valuable for understanding the behaviour of deep neural networks and the geometry of high-dimensional spaces. - This finding is practically useful: it can be used to detect adversarial attacks.
- The presentation is somewhat confusing. See Questions for concrete issues. - In my understanding, the finding in this paper is not really linear connectivity, as the path found by the proposed method is actually a piecewise linear path with 2 pieces. This makes the title and introduction somewhat misleading. - The experiments cover only a few image classification tasks and models. It is not clear if this phenomenon is general enough. - It seems that the theoretical explanation presented in Sec
Taxonomy
Topics: Neural Networks and Reservoir Computing · Seismology and Earthquake Studies · Earthquake Detection and Analysis
