Tandem Blocks in Deep Convolutional Neural Networks
Chris Hettinger, Tanner Christensen, Jeffrey Humpherys, Tyler J. Jarvis

TL;DR
This paper investigates the role of linear shortcut connections in deep CNNs, showing that various linear connections can outperform identity shortcuts, with effectiveness depending on network architecture.
Contribution
It introduces and tests different linear shortcut variants in residual blocks, revealing their potential to improve CNN performance over standard identity shortcuts.
Findings
Linear shortcuts can outperform identity shortcuts in CNNs.
The effectiveness of linear connections depends on network width and depth.
Different linear connection types may be optimal for different architectures.
Abstract
Due to the success of residual networks (resnets) and related architectures, shortcut connections have quickly become standard tools for building convolutional neural networks. The explanations in the literature for the apparent effectiveness of shortcuts are varied and often contradictory. We hypothesize that shortcuts work primarily because they act as linear counterparts to nonlinear layers. We test this hypothesis by using several variations on the standard residual block, with different types of linear connections, to build small image classification networks. Our experiments show that other kinds of linear connections can be even more effective than the identity shortcuts. Our results also suggest that the best type of linear connection for a given application may depend on both network width and depth.
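To make the contrast in the abstract concrete, here is a minimal sketch of a block whose output is the sum of a linear path and a nonlinear path. It is an illustration under stated assumptions, not the authors' implementation: the input is a single position's channel vector, the nonlinear path is one weight matrix followed by ReLU, and the linear connection is a channel-mixing matrix (roughly what a 1x1 convolution does at each position). The names `identity_block`, `linear_shortcut_block`, `W`, and `S` are hypothetical and chosen only for this example.

```python
# Sketch only: compares an identity shortcut with a learnable linear shortcut.
# Assumption: vectors are plain lists of floats, matrices are lists of rows.

def relu(v):
    """Elementwise rectified linear unit."""
    return [max(0.0, a) for a in v]

def matvec(M, v):
    """Matrix-vector product M @ v."""
    return [sum(m * a for m, a in zip(row, v)) for row in M]

def add(u, v):
    """Elementwise sum of two equal-length vectors."""
    return [a + b for a, b in zip(u, v)]

def nonlinear_path(x, W):
    """F(x) = relu(W x): the nonlinear layer of the block (assumed form)."""
    return relu(matvec(W, x))

def identity_block(x, W):
    """Standard residual form: y = x + F(x)."""
    return add(x, nonlinear_path(x, W))

def linear_shortcut_block(x, W, S):
    """Generalized form: y = S x + F(x), where S is a learnable linear map."""
    return add(matvec(S, x), nonlinear_path(x, W))

if __name__ == "__main__":
    x = [1.0, -2.0]
    W = [[1.0, 0.0], [0.0, 1.0]]      # weights of the nonlinear path
    eye = [[1.0, 0.0], [0.0, 1.0]]    # S = I recovers the identity shortcut
    S = [[2.0, 0.0], [0.0, 0.5]]      # an arbitrary non-identity linear map

    print(identity_block(x, W))
    print(linear_shortcut_block(x, W, eye))
    print(linear_shortcut_block(x, W, S))
```

Setting `S` to the identity matrix reproduces the standard residual block, so the identity shortcut is one special case of the linear connection. In this sketch `S` has the same number of trainable weights as a layer of the same shape, which is why width and depth plausibly interact with the choice of connection.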
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Anomaly Detection Techniques and Applications
