Learning Sparse Neural Networks with Identity Layers

Mingjian Ni; Guangyao Chen; Xiawu Zheng; Peixi Peng; Li Yuan; Yonghong Tian

arXiv:2307.07389·cs.LG·July 17, 2023

TL;DR

This paper explores the relationship between interlayer feature similarity and network sparsity, proposing a CKA-based regularization method that enhances sparsity and performance in neural networks.

Contribution

It introduces a novel CKA-based regularization technique that reduces feature similarity between layers, promoting sparsity and improving existing sparse training methods.

Findings

01

CKA-SR (CKA-based Sparsity Regularization) improves sparsity in neural networks.

02

Reducing feature similarity enhances network performance.

03

The method remains effective at extremely high sparsity levels.

Abstract

Sparsity in deep neural networks has been studied extensively as a way to maximize performance while shrinking overparameterized networks as much as possible. Existing methods focus on pruning parameters during training by using thresholds and metrics. Feature similarity between layers, by contrast, has received little attention, although this paper rigorously proves that it is highly correlated with network sparsity. Inspired by interlayer feature similarity in overparameterized models, we investigate the intrinsic link between network sparsity and interlayer feature similarity. Specifically, using information bottleneck theory, we prove that reducing interlayer feature similarity, measured by Centered Kernel Alignment (CKA), improves the sparsity of the network. Applying this theory, we propose a plug-and-play CKA-based Sparsity Regularization for sparse network…
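
To make the similarity measure in the abstract concrete, the sketch below computes the commonly used linear form of CKA between feature matrices taken from different layers, and averages it over layer pairs. It is an illustration only, not the paper's implementation: the function names `linear_cka` and `mean_interlayer_cka`, the choice of the linear kernel, and the averaging over all layer pairs are assumptions made here. A regularizer in the spirit of the abstract would presumably add such an averaged similarity, scaled by some weight, to the training loss, but the exact formulation is not given on this page.

```python
import numpy as np

def linear_cka(x, y):
    """Linear CKA between feature matrices of shape (n_samples, n_features).

    Illustrative helper (name and linear-kernel choice are assumptions).
    """
    # Center each feature column across the batch.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    # Linear CKA: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return cross / (norm_x * norm_y)

def mean_interlayer_cka(features):
    """Average linear CKA over all distinct pairs of layers (hypothetical helper)."""
    n = len(features)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sum(linear_cka(features[i], features[j]) for i, j in pairs) / len(pairs)

# Synthetic stand-ins for per-layer activations on one batch of 64 samples.
rng = np.random.default_rng(0)
layer_a = rng.normal(size=(64, 16))
layer_b = layer_a @ rng.normal(size=(16, 16))  # linear function of layer_a
layer_c = rng.normal(size=(64, 16))            # drawn independently of layer_a

print(linear_cka(layer_a, layer_a))  # identical features: 1.0 up to rounding
print(linear_cka(layer_a, layer_b))
print(linear_cka(layer_a, layer_c))
print(mean_interlayer_cka([layer_a, layer_b, layer_c]))
```

Linear CKA lies in the range 0 to 1 and is unchanged by isotropic scaling or a constant shift of either input, so lower values indicate that two layers encode less similar representations of the same batch.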

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and ELM · Face and Expression Recognition

Methods: Pruning · Focus