Linear Oscillation: A Novel Activation Function for Vision Transformer
Juyoung Yun

TL;DR
The paper introduces Linear Oscillation (LoC), a novel activation function blending linearity with oscillations, which improves neural network performance, especially in Vision Transformers, by fostering robust learning through controlled confusion.
Contribution
The paper proposes the LoC activation function, which combines linear and oscillatory behavior, and reports that it outperforms traditional activation functions in Vision Transformer models.
Findings
LoC outperforms ReLU and Sigmoid in various architectures.
Integration of LoC enhances Vision Transformer performance.
Controlled confusion via LoC promotes more nuanced learning.
Abstract
Activation functions are the linchpins of deep learning, profoundly influencing both the representational capacity and training dynamics of neural networks. They not only shape the nature of representations but also affect convergence rates and generalization potential. Appreciating this critical role, we present the Linear Oscillation (LoC) activation function, defined as $f(x) = x \times \sin(\alpha x + \beta)$. Distinct from conventional activation functions, which primarily introduce non-linearity, LoC seamlessly blends linear trajectories with oscillatory deviations. The nomenclature "Linear Oscillation" is a nod to its unique attribute of infusing linear activations with harmonious oscillations, capturing the essence of the "Importance of Confusion". This concept of "controlled confusion" within network activations is posited to foster more robust learning, particularly…
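To make the "linear trajectory with oscillatory deviation" idea concrete, here is a minimal sketch of the activation, assuming the form f(x) = x · sin(αx + β) with scalar parameters α and β. The function names `loc` and `loc_grad` and the default values α = 1, β = 0 are illustrative choices, not taken from the paper.

```python
import math


def loc(x: float, alpha: float = 1.0, beta: float = 0.0) -> float:
    """Assumed LoC form: a linear term x modulated by sin(alpha * x + beta)."""
    return x * math.sin(alpha * x + beta)


def loc_grad(x: float, alpha: float = 1.0, beta: float = 0.0) -> float:
    """Derivative of the assumed form, obtained with the product rule."""
    phase = alpha * x + beta
    return math.sin(phase) + alpha * x * math.cos(phase)


if __name__ == "__main__":
    # The output oscillates inside the envelope |f(x)| <= |x|.
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(f"x={x:+.1f}  f(x)={loc(x):+.4f}  f'(x)={loc_grad(x):+.4f}")
```

Under this assumed form the output never exceeds |x| in magnitude, so the linear term acts as an envelope while the sine term changes the sign and slope of the response as the input grows.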
Taxonomy
Topics: Neural Networks and Applications · CCD and CMOS Imaging Sensors · Infrared Target Detection Methodologies
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Dense Connections · Layer Normalization · Dropout · Byte Pair Encoding · Adam · Position-Wise Feed-Forward Layer
