Neural networks with trainable matrix activation functions

Zhengqi Liu; Shuhao Cao; Yuwen Li; Ludmil Zikatanov

arXiv:2109.09948·cs.LG·October 29, 2024

Open Access

TL;DR

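This paper introduces trainable matrix-valued activation functions for neural networks. The activation parameters are learned during training together with the weights and biases rather than being fixed in advance, which makes the model more flexible; the resulting networks are reported to be robust in numerical experiments.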

Contribution

It presents a systematic method for constructing trainable matrix activation functions based on generalized ReLU, integrated into neural networks for improved adaptability.

Findings

01. Neural networks with trainable matrix activations are simple and efficient.

02. The approach demonstrates robustness in numerical experiments.

03. Training these activations alongside weights improves model flexibility.

Abstract

The training process of neural networks usually optimizes the weights and bias parameters of linear transformations, while nonlinear activation functions are pre-specified and fixed. This work develops a systematic approach to constructing matrix-valued activation functions whose entries are generalized from ReLU. The activation is based on matrix-vector multiplications using only scalar multiplications and comparisons. The proposed activation functions depend on parameters that are trained along with the weights and bias vectors. Neural networks based on this approach are simple and efficient and are shown to be robust in numerical experiments.
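
The abstract does not spell out the exact form of the activation, so the following is only a minimal sketch under stated assumptions: the activation matrix is diagonal, each diagonal entry is a piecewise-constant function of the corresponding input with a single breakpoint at zero, and the two values per entry are the trainable parameters. The names `matrix_activation`, `t_neg`, and `t_pos` are hypothetical and do not come from the paper. With this form, the matrix-vector product reduces to one comparison and one scalar multiplication per component, and ReLU is recovered as a special case.

```python
import numpy as np


def matrix_activation(x, t_neg, t_pos):
    """Apply an assumed diagonal matrix activation, D(x) @ x.

    D(x) is diagonal: entry i equals t_pos[i] where x[i] > 0 and
    t_neg[i] otherwise. Because D(x) is diagonal, the product needs
    only a comparison and a scalar multiplication per component.
    """
    x = np.asarray(x, dtype=float)
    diag = np.where(x > 0, t_pos, t_neg)  # comparison selects the entry
    return diag * x                       # diagonal matrix times vector


x = np.array([-2.0, 0.5, 3.0])

# t_neg = 0 and t_pos = 1 reproduce ordinary ReLU.
relu_like = matrix_activation(x, t_neg=np.zeros(3), t_pos=np.ones(3))

# Other parameter values give other shapes, e.g. a leaky-ReLU-like slope.
leaky = matrix_activation(x, t_neg=np.full(3, 0.1), t_pos=np.ones(3))
```

In a training loop, `t_neg` and `t_pos` would be treated like any other parameter and updated by the optimizer alongside the weights and biases; for a fixed sign pattern of `x`, the output is linear in these parameters, so their gradients are simply the corresponding components of `x`.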

Taxonomy

Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Machine Learning and ELM