DC is all you need: describing ReLU from a signal processing standpoint
Christodoulos Kechris, Jonathan Dan, Jose Miranda, David Atienza

TL;DR
This paper analyzes ReLU activation functions in the frequency domain, revealing how they introduce high-frequency oscillations and a DC component that aids feature extraction and model convergence.
Contribution
It provides a novel spectral analysis of ReLU using Taylor expansion, linking its frequency behavior to practical benefits in CNN training.
Findings
ReLU introduces higher frequency oscillations and a DC component.
The DC component helps CNNs extract meaningful features.
ReLU's spectral behavior influences model convergence and weight initialization.
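The first two findings can be checked numerically on a single tone. The sketch below is not the paper's Taylor-expansion derivation. It passes a zero-mean sinusoid through ReLU and compares the FFT amplitudes with the classical Fourier series of a half-wave rectified sine: DC term 1/π, fundamental 1/2, and even harmonics 2/(π(4k² − 1)). The sampling rate, tone frequency, and all variable names are illustrative assumptions.

```python
import numpy as np

# Assumed illustration parameters (not taken from the paper).
fs = 1000            # sampling rate in Hz
f0 = 10              # input tone frequency in Hz
n = 1000             # 1 second of samples, so FFT bins are 1 Hz apart

t = np.arange(n) / fs
x = np.sin(2 * np.pi * f0 * t)   # zero-mean input: no DC, one spectral line
y = np.maximum(x, 0.0)           # ReLU applied sample by sample


def amplitude_spectrum(sig):
    """One-sided amplitude spectrum; bin k corresponds to k * fs / len(sig) Hz."""
    spec = np.abs(np.fft.rfft(sig)) / len(sig)
    # Double every bin except DC (and Nyquist when the length is even).
    last = -1 if len(sig) % 2 == 0 else None
    spec[1:last] *= 2
    return spec


in_spec = amplitude_spectrum(x)
out_spec = amplitude_spectrum(y)

for label, k in [("DC", 0), ("f0", f0), ("2*f0", 2 * f0), ("4*f0", 4 * f0)]:
    print(f"{label:>5}: input {in_spec[k]:.4f}  after ReLU {out_spec[k]:.4f}")
```

The input has energy only at f0. After ReLU, a constant offset appears at bin 0 and new lines appear at 2·f0, 4·f0, and so on, which matches the "DC component plus higher-frequency oscillations" description for this simple case.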
Abstract
Non-linear activation functions are crucial in Convolutional Neural Networks. However, until now they have not been well described in the frequency domain. In this work, we study the spectral behavior of ReLU, a popular activation function. We use the ReLU's Taylor expansion to derive its frequency domain behavior. We demonstrate that ReLU introduces higher frequency oscillations in the signal and a constant DC component. Furthermore, we investigate the importance of this DC component, where we demonstrate that it helps the model extract meaningful features related to the input frequency content. We accompany our theoretical derivations with experiments and real-world examples. First, we numerically validate our frequency response model. Then we observe ReLU's spectral behavior on two example models and a real-world one. Finally, we experimentally investigate the role of the DC…
Taxonomy
Topics: Sensor Technology and Measurement Systems
