Hidden Synergy: $L_1$ Weight Normalization and 1-Path-Norm Regularization

Aditya Biswas

arXiv:2404.19112·cs.LG·May 1, 2024

TL;DR

This paper introduces PSiLON Net, an MLP architecture that combines $L_1$ weight normalization with 1-path-norm regularization. The design simplifies the 1-path-norm and biases training toward efficient learning and near-sparse parameters. The paper also extends the approach to residual networks and proposes a pruning method.

Contribution

The paper presents a neural network architecture whose design simplifies 1-path-norm regularization, a pruning method for reaching exact sparsity late in training, and a simplified residual block for which a subset of paths is sufficient to bound the Lipschitz constant.

Findings

1. Effective regularization with 1-path-norm improves generalization.

2. Pruning achieves exact sparsity in trained models.

3. Strong performance in small data regimes with overparameterized networks.

Abstract

We present PSiLON Net, an MLP architecture that uses $L_{1}$ weight normalization for each weight vector and shares the length parameter across the layer. The 1-path-norm provides a bound for the Lipschitz constant of a neural network and reflects its generalizability, and we show how PSiLON Net's design drastically simplifies the 1-path-norm while providing an inductive bias towards efficient learning and near-sparse parameters. We propose a pruning method to achieve exact sparsity in the final stages of training, if desired. To exploit the inductive bias of residual networks, we present a simplified residual block, leveraging concatenated ReLU activations. For networks constructed with such blocks, we prove that considering only a subset of possible paths in the 1-path-norm is sufficient to bound the Lipschitz constant. Using the 1-path-norm and this improved bound as regularizers,…
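To make the simplification mentioned in the abstract concrete, the following is a minimal NumPy sketch, not the paper's implementation. It assumes that each row of a layer's weight matrix is one "weight vector", that every layer (including the last) is normalized, and that biases are omitted. The function names `l1_weight_norm` and `one_path_norm`, the layer sizes, and the length values are all hypothetical choices for illustration. Under these assumptions, every row of a normalized layer has $L_1$ norm equal to the absolute value of that layer's shared length parameter, so the 1-path-norm (the sum over all input-to-output paths of the product of absolute weights) reduces to the number of outputs times the product of the per-layer lengths.

```python
import numpy as np


def l1_weight_norm(v, g):
    """Rescale each row of v to L1 norm |g|, with one scalar g shared by the layer.

    Assumption: rows are the per-neuron weight vectors and no row is all zeros.
    """
    row_norms = np.abs(v).sum(axis=1, keepdims=True)
    return g * v / row_norms


def one_path_norm(weights):
    """Sum over all input-output paths of the product of absolute weights.

    Computed as 1^T |W_L| ... |W_1| 1 for weights ordered from first to last layer.
    """
    acc = np.ones(weights[0].shape[1])
    for w in weights:
        acc = np.abs(w) @ acc
    return acc.sum()


# Hypothetical layer sizes and shared length parameters, chosen for illustration.
rng = np.random.default_rng(0)
sizes = [4, 8, 8, 3]
lengths = [0.5, 2.0, 1.5]

weights = [
    l1_weight_norm(rng.normal(size=(n_out, n_in)), g)
    for n_in, n_out, g in zip(sizes[:-1], sizes[1:], lengths)
]

# General definition versus the closed form that holds under the assumptions above.
path_norm = one_path_norm(weights)
closed_form = sizes[-1] * np.prod(np.abs(lengths))
```

In this sketch `path_norm` and `closed_form` agree up to floating-point error, which illustrates why penalizing the 1-path-norm of such a network only requires the per-layer length parameters rather than a pass over all weights. The exact form used in the paper may differ, for example in how the final layer is treated.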

Taxonomy

Topics: Numerical Methods and Algorithms

Methods: Weight Normalization · Pruning