Rethink ReLU to Training Better CNNs

Gangming Zhao; Zhaoxiang Zhang; He Guan; Peng Tang; Jingdong Wang

arXiv:1709.06247·cs.CV·September 3, 2018·6 cites

TL;DR

This paper proposes a proportional module that adjusts the ratio of convolution to ReLU layers in CNNs, leading to improved generalization and performance across various architectures and benchmarks.

Contribution

It introduces a proportional module to optimize the convolution-to-ReLU ratio, enhancing CNN performance without extra computational cost.

Findings

01 Improved accuracy on multiple benchmarks

02 Better generalization ability in CNNs

03 Applicable to various network architectures

Abstract

Most convolutional neural networks share the same characteristic: each convolutional layer is followed by a nonlinear activation layer, of which the Rectified Linear Unit (ReLU) is the most widely used. In this paper, we argue that a structure with an equal ratio between these two layers may not be the best choice, since it could result in poor generalization ability. We therefore investigate a more suitable way of using ReLU to explore better network architectures. Specifically, we propose a proportional module that keeps the ratio between the number of convolution and ReLU layers at N:M (N>M). The proportional module can be applied to almost all networks at no extra computational cost to improve performance. Comprehensive experimental results indicate that the proposed method achieves better performance on different benchmarks with different network architectures, thus…
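As an illustration of the N:M idea in the abstract, the sketch below builds a layer plan with N convolutions and M ReLUs and compares it with the conventional 1:1 design. It is a minimal sketch, not the authors' implementation: the function names `proportional_module` and `baseline_module` are hypothetical, and the choice to place the ReLUs after the last M convolutions is an assumption, since the abstract does not say where in the module the activations go.

```python
from typing import List


def proportional_module(n_conv: int, n_relu: int) -> List[str]:
    """Layer plan with n_conv convolutions and n_relu ReLUs (n_conv > n_relu).

    Assumption (not stated in the abstract): the ReLUs follow the
    last n_relu convolutions of the module.
    """
    if not 0 <= n_relu < n_conv:
        raise ValueError("expected 0 <= n_relu < n_conv")
    plan: List[str] = []
    for i in range(n_conv):
        plan.append("conv")
        # Only the final n_relu convolutions get an activation.
        if i >= n_conv - n_relu:
            plan.append("relu")
    return plan


def baseline_module(n_conv: int) -> List[str]:
    """Conventional 1:1 design: every convolution is followed by a ReLU."""
    plan: List[str] = []
    for _ in range(n_conv):
        plan.extend(["conv", "relu"])
    return plan


if __name__ == "__main__":
    print(baseline_module(2))         # 1:1 ratio
    print(proportional_module(2, 1))  # 2:1 ratio
```

Both plans contain the same number of convolutions, so the proportional variant adds no parameters or multiply-adds; it only drops some activations, which is consistent with the "no extra computational cost" claim.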

Taxonomy

Topics: Advanced Neural Network Applications · Neural Networks and Applications · Domain Adaptation and Few-Shot Learning

Methods: Convolution