Deep Convolutional Neural Networks with Merge-and-Run Mappings
Liming Zhao, Jingdong Wang, Xi Li, Zhuowen Tu, Wenjun Zeng

TL;DR
This paper introduces deep merge-and-run neural networks, built from a novel modular block that improves information flow and makes training easier, and reports better performance than residual networks on standard recognition benchmarks.
Contribution
The paper proposes a merge-and-run block that assembles residual branches in parallel, improving information flow and reducing training difficulty in deep neural networks, and outperforming residual networks on recognition tasks.
Findings
Achieves 3.57% error on CIFAR-10
Achieves 19.00% error on CIFAR-100
Achieves 1.51% error on SVHN
Abstract
A deep residual network, built by stacking a sequence of residual blocks, is easy to train, because identity mappings skip residual branches and thus improve information flow. To further reduce the training difficulty, we present a simple network architecture, deep merge-and-run neural networks. The novelty lies in a modularized building block, merge-and-run block, which assembles residual branches in parallel through a merge-and-run mapping: Average the inputs of these residual branches (Merge), and add the average to the output of each residual branch as the input of the subsequent residual branch (Run), respectively. We show that the merge-and-run mapping is a linear idempotent function in which the transformation matrix is idempotent, and thus improves information flow, making training easy. In comparison to residual networks, our networks enjoy compelling advantages: they contain…
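The merge-and-run mapping described in the abstract can be illustrated with a small sketch. It assumes a block with equally sized branch inputs, represented as plain Python lists; the function names (merge_and_run, merge_matrix, matmul) and the toy branch are hypothetical and are not taken from the authors' code. The second part builds the transformation matrix of the Merge step for stacked inputs and checks numerically that it is idempotent, as the abstract states.

```python
from typing import Callable, List

Vector = List[float]
Matrix = List[List[float]]


def merge_and_run(inputs: List[Vector],
                  branches: List[Callable[[Vector], Vector]]) -> List[Vector]:
    """One merge-and-run step (illustrative sketch, not the authors' code).

    inputs[i] is the input of residual branch i; branches[i] is that branch.
    """
    n = len(inputs)
    dim = len(inputs[0])
    # Merge: element-wise average of the inputs of all residual branches.
    avg = [sum(x[j] for x in inputs) / n for j in range(dim)]
    # Run: add the average to the output of each residual branch.
    return [[y_j + a_j for y_j, a_j in zip(f(x), avg)]
            for x, f in zip(inputs, branches)]


def merge_matrix(n_branches: int, dim: int) -> Matrix:
    """Matrix of the Merge step acting on the stacked branch inputs.

    It is an n-by-n grid of (1/n) * identity blocks, each of size dim.
    """
    size = n_branches * dim
    return [[1.0 / n_branches if r % dim == c % dim else 0.0
             for c in range(size)]
            for r in range(size)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product, used only for the idempotence check."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))]
            for r in range(len(a))]


# Toy residual branch that outputs zeros, so only the mapping is visible.
def zero_branch(x: Vector) -> Vector:
    return [0.0 for _ in x]


outputs = merge_and_run([[1.0, 3.0], [3.0, 5.0]], [zero_branch, zero_branch])

# Idempotence: applying the merge matrix twice equals applying it once.
M = merge_matrix(n_branches=2, dim=2)
is_idempotent = matmul(M, M) == M
```

With zero-output branches, both entries of outputs equal the average of the two inputs, which is the quantity every branch receives as its skip term. In a real network each branch would be a stack of convolutional layers rather than the placeholder used here.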
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning
