Hysteresis Activation Function for Efficient Inference
Moshe Kimhi, Idan Kashani, Avi Mendelson, Chaim Baskin

TL;DR
This paper introduces HeLU, a hysteresis-based activation function that mitigates the dying ReLU problem, improves generalization, and maintains hardware efficiency for neural network inference.
Contribution
The paper proposes HeLU, a novel hysteresis activation function that refines backpropagation and enhances model performance without added complexity.
Findings
HeLU improves model generalization across datasets.
HeLU addresses the dying ReLU problem effectively.
HeLU maintains hardware efficiency during inference.
Abstract
The widely used ReLU is favored for its hardware efficiency, as the implementation at inference is a one-bit sign case, yet suffers from issues such as the "dying ReLU" problem, where during training, neurons fail to activate and constantly remain at zero, as highlighted by Lu et al. Traditional approaches to mitigate this issue often introduce more complex and less hardware-friendly activation functions. In this work, we propose a Hysteresis Rectified Linear Unit (HeLU), an efficient activation function designed to address the "dying ReLU" problem with minimal complexity. Unlike traditional activation functions with fixed thresholds for training and inference, HeLU employs a variable threshold that refines the backpropagation. This refined mechanism allows simpler activation functions to achieve competitive performance comparable to their more complex counterparts without…
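The abstract's idea of a threshold that differs between the forward pass and backpropagation can be illustrated with a minimal NumPy sketch. The excerpt above does not state the exact formulation, so the form used here is an assumption: the forward pass is plain ReLU, while the backward pass lets gradients through for inputs above a shifted threshold `-alpha`. The function names, the parameter `alpha`, and its value are hypothetical and chosen only for illustration.

```python
import numpy as np

def helu_forward(x):
    # Forward pass is ordinary ReLU, so inference still reduces to a sign check.
    return np.maximum(x, 0.0)

def relu_backward(x, grad_out):
    # Standard ReLU gradient: passes only where the input is strictly positive.
    return grad_out * (x > 0.0).astype(x.dtype)

def helu_backward(x, grad_out, alpha=0.05):
    # Assumed hysteresis gradient: the threshold is shifted to -alpha, so
    # slightly negative inputs still receive a gradient and can recover.
    # `alpha` is a hypothetical hyperparameter, not a value from the paper.
    return grad_out * (x > -alpha).astype(x.dtype)

x = np.array([-1.0, -0.03, 0.0, 0.5])
g = np.ones_like(x)

print(helu_forward(x))               # same outputs as ReLU
print(relu_backward(x, g))           # gradient blocked for all x <= 0
print(helu_backward(x, g, alpha=0.05))  # gradient also flows for -alpha < x <= 0
```

Under this assumption, a neuron whose pre-activation sits just below zero outputs zero in the forward pass but still receives a gradient, which is one way a shifted backward threshold could counter the dying ReLU behavior while leaving the inference cost unchanged.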
Taxonomy
Topics: Model Reduction and Neural Networks
Methods: Linear Layer · Residual Connection · Weight Decay · Attention Dropout · Linear Warmup With Linear Decay · WordPiece · Adam · Dropout · Softmax
