Neural networks with trainable matrix activation functions
Zhengqi Liu, Shuhao Cao, Yuwen Li, Ludmil Zikatanov

TL;DR
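This paper introduces trainable matrix-valued activation functions for neural networks. The activation parameters are learned during training together with the weights and biases instead of being fixed in advance, and the authors report that the resulting networks are robust in numerical experiments.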
Contribution
It presents a systematic method for constructing trainable matrix activation functions whose entries are generalized from ReLU, and integrates them into neural networks so that the nonlinearity is trained rather than pre-specified.
Findings
Neural networks with trainable matrix activations are simple and efficient.
The approach demonstrates robustness in numerical experiments.
Training these activations alongside weights improves model flexibility.
Abstract
The training process of neural networks usually optimizes the weights and bias parameters of linear transformations, while nonlinear activation functions are pre-specified and fixed. This work develops a systematic approach to constructing matrix-valued activation functions whose entries are generalized from ReLU. The activation is based on matrix-vector multiplications using only scalar multiplications and comparisons. The proposed activation functions depend on parameters that are trained along with the weights and bias vectors. Neural networks based on this approach are simple and efficient and are shown to be robust in numerical experiments.
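To make the idea in the abstract concrete, here is a minimal sketch of one possible matrix-valued activation, written in plain Python. It assumes the simplest case: a diagonal matrix D(x) whose i-th entry is one of two slopes, chosen by comparing x[i] with zero, so that the activation is the matrix-vector product D(x) x. The function name `matrix_activation` and the parameter names `t_neg` and `t_pos` are illustrative and not taken from the paper; the paper's actual construction may use more pieces or a non-diagonal matrix.

```python
from typing import List


def matrix_activation(x: List[float],
                      t_neg: List[float],
                      t_pos: List[float]) -> List[float]:
    """Apply sigma(x) = D(x) x, where D(x) is a diagonal matrix.

    Assumption for this sketch: the i-th diagonal entry of D(x) is
    t_pos[i] if x[i] > 0 and t_neg[i] otherwise. Because D(x) is
    diagonal, the matrix-vector product reduces to one comparison and
    one scalar multiplication per coordinate.
    """
    if not (len(x) == len(t_neg) == len(t_pos)):
        raise ValueError("x, t_neg and t_pos must have the same length")
    return [(tp if xi > 0 else tn) * xi
            for xi, tn, tp in zip(x, t_neg, t_pos)]


def relu(x: List[float]) -> List[float]:
    """Standard ReLU, used here only as a reference point."""
    return [max(xi, 0.0) for xi in x]


x = [-2.0, 0.5, 3.0]

# With slopes fixed to 0 (negative side) and 1 (positive side),
# the sketch reduces to ordinary ReLU.
relu_like = matrix_activation(x, [0.0] * 3, [1.0] * 3)

# In a trainable setting the slopes would be parameters updated by the
# optimizer together with the weights; these values are arbitrary.
leaky = matrix_activation(x, [0.1, 0.2, 0.3], [1.0, 0.9, 1.1])
```

In this sketch the slopes play the role of the trainable activation parameters: fixing them recovers a familiar activation, while letting a gradient-based optimizer update them alongside the weights and biases gives each coordinate its own learned nonlinearity.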
Taxonomy
Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Machine Learning and ELM
