How not to Stitch Representations to Measure Similarity: Task Loss Matching versus Direct Matching
András Balogh, Márk Jelasity

TL;DR
This paper critically evaluates model stitching methods for measuring neural network similarity, revealing that task loss matching can be misleading and proposing direct matching as a more reliable alternative.
Contribution
The paper identifies limitations of task loss matching and demonstrates that direct matching provides a more accurate measure of representation similarity in neural networks.
Findings
Task loss matching can indicate high similarity between distant layers.
Task loss matching may rank non-corresponding layers as more similar than corresponding ones.
Direct matching avoids out-of-distribution issues and aligns better with structural similarity.
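To make the contrast concrete, here is a minimal NumPy sketch of what direct matching could look like. It assumes direct matching means fitting the affine stitching map by least squares so that mapped source activations land as close as possible to the target activations; this summary does not spell out the exact objective, so treat that as an assumption. The function names `fit_affine_direct` and `relative_error`, and the synthetic activations, are hypothetical and for illustration only.

```python
import numpy as np

def fit_affine_direct(src, dst):
    """Least-squares affine map (W, b) such that src @ W + b approximates dst.

    src: (n_samples, d_src) activations from the first network's layer.
    dst: (n_samples, d_dst) activations from the second network's layer.
    """
    ones = np.ones((src.shape[0], 1))
    src_aug = np.hstack([src, ones])            # append a bias column
    sol, *_ = np.linalg.lstsq(src_aug, dst, rcond=None)
    return sol[:-1], sol[-1]                    # weights, bias

def relative_error(src, dst, W, b):
    """Frobenius-norm residual of the fit, relative to the target's norm."""
    return np.linalg.norm(src @ W + b - dst) / np.linalg.norm(dst)

# Synthetic activations (assumption: the two layers are exactly affinely related).
rng = np.random.default_rng(0)
src = rng.normal(size=(200, 8))
true_W = rng.normal(size=(8, 5))
true_b = rng.normal(size=5)
dst = src @ true_W + true_b

W, b = fit_affine_direct(src, dst)
err = relative_error(src, dst, W, b)
```

Because the map is fitted to the target representation itself rather than to the task labels, the mapped activations are pulled toward the distribution the second half-network normally sees, which is one way to read the out-of-distribution point above.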
Abstract
Measuring the similarity of the internal representations of deep neural networks is an important and challenging problem. Model stitching has been proposed as a possible approach, where two half-networks are connected by mapping the output of the first half-network to the input of the second one. The representations are considered functionally similar if the resulting stitched network achieves good task-specific performance. The mapping is normally created by training an affine stitching layer on the task at hand while freezing the two half-networks, a method called task loss matching. Here, we argue that task loss matching may be very misleading as a similarity index. For example, it can indicate very high similarity between very distant layers, whose representations are known to have different functional properties. Moreover, it can indicate very distant layers to be more similar than…
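The task loss matching setup described in the abstract can be sketched on a toy problem as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the two "half-networks" are single hypothetical layers with random weights `A` and `C`, the labels come from a made-up teacher, and only the affine stitching layer `(W, b)` is updated by gradient descent on the cross-entropy task loss.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_in, d_front, d_back, n_classes = 256, 5, 6, 4, 3

# Frozen half-networks (hypothetical toy weights, never updated below).
A = rng.normal(size=(d_in, d_front))            # front half: x -> tanh(x @ A)
C = 0.5 * rng.normal(size=(d_back, n_classes))  # back half: z -> logits z @ C
A_frozen, C_frozen = A.copy(), C.copy()

# Toy data with labels from a made-up linear teacher.
x = rng.normal(size=(n, d_in))
y = np.argmax(x @ rng.normal(size=(d_in, n_classes)), axis=1)
onehot = np.eye(n_classes)[y]

def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def task_loss(W, b):
    """Cross-entropy of the stitched network: front half -> (W, b) -> back half."""
    h = np.tanh(x @ A)
    p = softmax((h @ W + b) @ C)
    return -np.mean(np.log(p[np.arange(n), y]))

# Affine stitching layer, initialised at zero.
W = np.zeros((d_front, d_back))
b = np.zeros(d_back)
initial_loss = task_loss(W, b)

h = np.tanh(x @ A)  # front activations are fixed because A is frozen
for _ in range(300):
    p = softmax((h @ W + b) @ C)
    grad_logits = (p - onehot) / n          # gradient of mean cross-entropy
    grad_z = grad_logits @ C.T              # backpropagate through the frozen back half
    W -= 0.05 * (h.T @ grad_z)              # update only the stitching layer
    b -= 0.05 * grad_z.sum(axis=0)

final_loss = task_loss(W, b)
```

Note that nothing in this objective asks the stitched activations `h @ W + b` to resemble what the back half normally receives; the optimiser is free to use any input that lowers the task loss.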
Taxonomy
Topics: 3D Shape Modeling and Analysis · Human Pose and Action Recognition · Industrial Vision Systems and Defect Detection
