A Lightweight and Gradient-Stable Neural Layer
Yueyao Yu, Yin Zhang

TL;DR
This paper introduces the Han-layer, a neural network layer that is lightweight, parameter-efficient, and guarantees gradient stability through orthogonal Jacobians, improving resource efficiency without sacrificing performance.
Contribution
The paper proposes the Han-layer architecture based on Householder weighting and absolute-value activation, reducing parameters and ensuring gradient stability in neural networks.
Findings
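The Han-layer reduces the parameter count and computational complexity of a layer from O(d^2) to O(d).
Replacing fully connected layers with Han-layers maintains or even improves generalization performance.
The Han-layer guarantees an orthogonal Jacobian, which ensures gradient stability (no vanishing or exploding gradients).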
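The findings above can be illustrated with a minimal sketch. It assumes the layer has the form |H(u) x + b|, where H(u) = I - 2 u u^T / (u^T u) is a Householder reflection and |·| is the elementwise absolute value; this summary does not state the exact formula, so that form, the function names, and the numeric values below are illustrative assumptions rather than the authors' implementation.

```python
def householder_apply(u, x):
    """Return (I - 2 u u^T / (u^T u)) x in O(d) time, without forming the matrix."""
    uu = sum(ui * ui for ui in u)
    ux = sum(ui * xi for ui, xi in zip(u, x))
    scale = 2.0 * ux / uu
    return [xi - scale * ui for ui, xi in zip(u, x)]


def han_layer(u, b, x):
    """Assumed layer form: elementwise |H(u) x + b|."""
    z = [hi + bi for hi, bi in zip(householder_apply(u, x), b)]
    return [abs(zi) for zi in z]


def han_jacobian(u, b, x):
    """Jacobian diag(sign(z)) H(u), built explicitly only to check orthogonality.

    abs() is not differentiable at 0; the sign there is taken as +1 by convention.
    """
    d = len(u)
    uu = sum(ui * ui for ui in u)
    z = [hi + bi for hi, bi in zip(householder_apply(u, x), b)]
    sign = [1.0 if zi >= 0 else -1.0 for zi in z]
    return [
        [sign[i] * ((1.0 if i == j else 0.0) - 2.0 * u[i] * u[j] / uu)
         for j in range(d)]
        for i in range(d)
    ]


def max_orthogonality_error(J):
    """Largest absolute entry of J^T J - I."""
    d = len(J)
    err = 0.0
    for i in range(d):
        for j in range(d):
            dot = sum(J[k][i] * J[k][j] for k in range(d))
            err = max(err, abs(dot - (1.0 if i == j else 0.0)))
    return err


# Illustrative values (hypothetical, not from the paper).
d = 4
u = [1.0, -2.0, 0.5, 3.0]
b = [0.1, -0.2, 0.3, 0.0]
x = [0.5, 1.5, -1.0, 2.0]

y = han_layer(u, b, x)
J = han_jacobian(u, b, x)

n_params_han = len(u) + len(b)  # 2d parameters: one vector u, one bias b
n_params_fc = d * d + d         # d^2 weights plus d biases for a d-to-d FC layer
ortho_err = max_orthogonality_error(J)
```

Under these assumptions the layer stores 2d numbers instead of d^2 + d, the forward pass costs O(d) because the reflection is applied as a rank-one update, and `ortho_err` should be at the level of floating-point rounding.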
Abstract
To enhance resource efficiency and model deployability of neural networks, we propose a neural-layer architecture based on Householder weighting and absolute-value activation, called the Householder-absolute neural layer or simply the Han-layer. Compared to a fully connected layer with d neurons and d outputs, a Han-layer reduces the number of parameters and the corresponding computational complexity from O(d^2) to O(d). The Han-layer structure guarantees that the Jacobian of the layer function is always orthogonal, thus ensuring gradient stability (i.e., free of gradient vanishing or exploding issues) for any Han-layer sub-networks. Extensive numerical experiments show that one can strategically use Han-layers to replace fully connected (FC) layers, reducing the number of model parameters while maintaining or even improving the generalization performance. We will also showcase the…
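The orthogonality claim in the abstract can be checked by a short derivation. The sketch below assumes the layer maps x to |H(u) x + b| with a Householder matrix H(u); the symbols u, b, z, and D are introduced here for illustration and may differ from the paper's notation.

```latex
% Assumed layer form (not stated explicitly in this summary):
%   y = |z|,  z = H(u) x + b,  H(u) = I - 2 u u^T / (u^T u),  u \neq 0.
\begin{align*}
H(u)^\top H(u)
  &= I - \frac{4\,u u^\top}{u^\top u}
       + \frac{4\,u\,(u^\top u)\,u^\top}{(u^\top u)^2} = I, \\
J &= \frac{\partial y}{\partial x} = D\,H(u),
  \qquad D = \operatorname{diag}\bigl(\operatorname{sign}(z)\bigr),
  \quad \text{valid where no entry of } z \text{ is zero}, \\
J^\top J &= H(u)^\top D^\top D\,H(u) = H(u)^\top H(u) = I .
\end{align*}
```

Because D has diagonal entries of ±1, D^T D = I, so the Jacobian is orthogonal wherever it is defined. A product of orthogonal matrices is orthogonal, which is consistent with the statement that stacked Han-layers neither shrink nor amplify gradient norms.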
Taxonomy
Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Machine Learning and ELM
