A Lightweight and Gradient-Stable Neural Layer

Yueyao Yu; Yin Zhang

arXiv:2106.04088·cs.LG·March 27, 2024

TL;DR

This paper introduces the Han-layer, a neural network layer that is lightweight, parameter-efficient, and guarantees gradient stability through orthogonal Jacobians, improving resource efficiency without sacrificing performance.

Contribution

The paper proposes the Han-layer architecture based on Householder weighting and absolute-value activation, reducing parameters and ensuring gradient stability in neural networks.

Findings

01

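Relative to a fully connected layer with d inputs and d outputs, a Han-layer reduces the parameter count and the corresponding computational cost from O(d^2) to O(d).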

02

In the reported experiments, replacing fully connected layers with Han-layers maintains or even improves generalization performance.

03

The Jacobian of a Han-layer is always orthogonal, so any sub-network built from Han-layers is free of vanishing or exploding gradients.
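
The orthogonality claim can be traced in a few lines. The sketch below assumes the layer has the form $f(x) = |H(u)x + b|$ with a Householder matrix $H(u)$ and a bias $b$. This form is inferred from the phrase "Householder weighting and absolute-value activation" and is not spelled out on this page, so the symbols $u$, $b$, $z$, and $D$ are illustrative.

```latex
% Assumed layer form (not stated verbatim in the text above):
%   f(x) = |H(u) x + b|,  with  u, b \in \mathbb{R}^d,  u \neq 0.
\begin{align*}
H(u) &= I - \frac{2\,u u^{\top}}{u^{\top} u},
  & H(u)^{\top} H(u) &= I, \\
z &= H(u)\,x + b,
  & D &= \operatorname{diag}\bigl(\operatorname{sign}(z)\bigr),
  \quad D^{2} = I \ \text{ when every } z_i \neq 0, \\
J &= \frac{\partial f}{\partial x} = D\,H(u),
  & J^{\top} J &= H(u)^{\top} D^{\top} D\, H(u) = I .
\end{align*}
% Parameter count under this assumption: u and b give 2d = O(d),
% versus d^2 + d for a fully connected layer with bias.
```

Because a product of orthogonal matrices is orthogonal, the same argument carries over to a stack of such layers, which is the sense in which gradients neither vanish nor explode.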

Abstract

To enhance resource efficiency and model deployability of neural networks, we propose a neural-layer architecture based on Householder weighting and absolute-value activation, called the Householder-absolute neural layer or simply Han-layer. Compared to a fully connected layer with $d$ neurons and $d$ outputs, a Han-layer reduces the number of parameters and the corresponding computational complexity from $O(d^{2})$ to $O(d)$. The Han-layer structure guarantees that the Jacobian of the layer function is always orthogonal, thus ensuring gradient stability (i.e., free of gradient vanishing or exploding issues) for any Han-layer sub-networks. Extensive numerical experiments show that one can strategically use Han-layers to replace fully connected (FC) layers, reducing the number of model parameters while maintaining or even improving the generalization performance. We will also showcase the…
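
A minimal NumPy sketch of the idea in the abstract, written for illustration and not taken from the official repository. It assumes the layer computes `|H(u) x + b|`, where `H(u)` is a Householder reflection. The function names `han_layer` and `han_jacobian`, the bias `b`, and the test dimension are hypothetical choices made here. The forward pass never forms the d-by-d matrix, which is where the O(d) cost comes from; the dense Jacobian is built only to check orthogonality numerically.

```python
import numpy as np


def han_layer(x, u, b):
    """Assumed Han-layer forward pass: |H(u) x + b| in O(d) time.

    H(u) x = x - 2 u (u^T x) / (u^T u), so the Householder matrix
    itself is never materialized.
    """
    hx = x - 2.0 * u * (u @ x) / (u @ u)
    return np.abs(hx + b)


def han_jacobian(x, u, b):
    """Dense Jacobian diag(sign(z)) @ H(u), built only for verification."""
    d = x.shape[0]
    H = np.eye(d) - 2.0 * np.outer(u, u) / (u @ u)
    z = H @ x + b
    # Scaling row i of H by sign(z_i) is the same as diag(sign(z)) @ H.
    return np.sign(z)[:, None] * H


rng = np.random.default_rng(0)
d = 8  # illustrative width
x = rng.normal(size=d)
u = rng.normal(size=d)  # Householder vector: d parameters
b = rng.normal(size=d)  # bias (assumed): d parameters

y = han_layer(x, u, b)
J = han_jacobian(x, u, b)

# Largest deviation of J^T J from the identity; should be at rounding level.
orth_err = np.abs(J.T @ J - np.eye(d)).max()
print("max |J^T J - I| =", orth_err)
print("Han-layer parameters:", 2 * d, "| FC layer parameters:", d * d + d)
```

Under these assumptions the layer carries `2 * d` parameters, compared with `d * d + d` for a fully connected layer with bias, and the Jacobian check holds wherever no pre-activation entry is exactly zero.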

Code & Models

Repositories

yyy32/hannet (PyTorch, official)

Taxonomy

Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Machine Learning and ELM