Entangled Residual Mappings
Mathias Lechner, Ramin Hasani, Zahra Babaiee, Radu Grosu, Daniela Rus, Thomas A. Henzinger, Sepp Hochreiter

TL;DR
This paper introduces entangled residual mappings, replacing identity skip connections with specialized matrices to explore their impact on deep neural network training and representation learning.
Contribution
The paper generalizes residual connections by replacing identity skips with entangled mappings such as orthogonal and sparse matrices, and analyzes their effects across CNNs, Vision Transformers, and recurrent networks.
Findings
Entangled sparse mappings improve CNN and Vision Transformer generalization.
Orthogonal mappings can hinder performance in CNNs and Transformers.
Orthogonal residuals serve as an inductive bias for recurrent networks.
Abstract
Residual mappings have been shown to perform representation learning in the first layers and iterative feature refinement in higher layers. This interplay, combined with their stabilizing effect on the gradient norms, enables them to train very deep networks. In this paper, we take a step further and introduce entangled residual mappings to generalize the structure of the residual connections and evaluate their role in iterative learning representations. An entangled residual mapping replaces the identity skip connections with specialized entangled mappings such as orthogonal, sparse, and structural correlation matrices that share key attributes (eigenvalues, structure, and Jacobian norm) with identity mappings. We show that while entangled mappings can preserve the iterative refinement of features across various deep models, they influence the representation learning process in…
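To make the definition in the abstract concrete, here is a minimal NumPy sketch of a residual block whose skip path multiplies the input by a fixed matrix instead of passing it through unchanged. It is an illustration under stated assumptions, not the authors' implementation: the helper names (`random_orthogonal`, `cyclic_shift`, `entangled_block`), the single tanh layer used as the residual branch, and the choice of a cyclic-shift permutation as the sparse example are all hypothetical. The paper's actual matrix constructions may differ.

```python
import numpy as np


def random_orthogonal(dim, rng):
    """Sample a fixed orthogonal matrix from the QR factors of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((dim, dim)))
    # Flip column signs so the result does not depend on the QR sign convention.
    return q * np.sign(np.diag(r))


def cyclic_shift(dim, shift=1):
    """Assumed sparse example: a permutation matrix with one nonzero per row."""
    return np.roll(np.eye(dim), shift, axis=1)


def residual_branch(x, weight):
    """Hypothetical learned branch f(x): one linear layer followed by tanh."""
    return np.tanh(weight @ x)


def entangled_block(x, weight, skip):
    """Compute y = skip @ x + f(x); skip = identity gives the usual residual block."""
    return skip @ x + residual_branch(x, weight)


rng = np.random.default_rng(0)
dim = 8
x = rng.standard_normal(dim)
weight = 0.1 * rng.standard_normal((dim, dim))

# The skip matrices are fixed (not trained) in this sketch.
skips = {
    "identity": np.eye(dim),
    "orthogonal": random_orthogonal(dim, rng),
    "sparse": cyclic_shift(dim),
}
outputs = {name: entangled_block(x, weight, skip) for name, skip in skips.items()}

for name, skip in skips.items():
    # All three skip matrices preserve the norm of x, as the identity does.
    print(name, np.linalg.norm(skip @ x) / np.linalg.norm(x))
```

Both non-identity choices in this sketch have unit singular values, so the skip path alone neither amplifies nor shrinks the signal. That is one way to read the abstract's statement that entangled mappings share key attributes with the identity mapping.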
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Neural Networks and Applications · Ferroelectric and Negative Capacitance Devices
