Exploiting Elasticity in Tensor Ranks for Compressing Neural Networks

Jie Ran; Rui Lin; Hayden K.H. So; Graziano Chesi; Ngai Wong

arXiv:2105.04218·cs.LG·May 11, 2021

TL;DR

This paper introduces a tensor rank minimization method that exploits a new elasticity dimension, the input-output channels of CNN kernels, to compress neural networks with a graceful tradeoff between model size and accuracy.

Contribution

It proposes nuclear-norm rank minimization factorization (NRMF), which searches dynamically and globally for reduced tensor ranks during training and reveals correlations between tensor ranks across layers.

Findings

01 NRMF outperforms the earlier non-elastic variational Bayesian matrix factorization (VBMF) scheme in neural network compression.

02 Exploiting elasticity along the input-output channels yields a graceful tradeoff between model size and accuracy.

03 Correlations between tensor ranks across multiple layers are revealed.

Abstract

Elasticities in depth, width, kernel size and resolution have been explored in compressing deep neural networks (DNNs). Recognizing that the kernels in a convolutional neural network (CNN) are 4-way tensors, we further exploit a new elasticity dimension along the input-output channels. Specifically, a novel nuclear-norm rank minimization factorization (NRMF) approach is proposed to dynamically and globally search for the reduced tensor ranks during training. Correlation between tensor ranks across multiple layers is revealed, and a graceful tradeoff between model size and accuracy is obtained. Experiments then show the superiority of NRMF over the previous non-elastic variational Bayesian matrix factorization (VBMF) scheme.
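
The abstract names the nuclear norm as the rank surrogate but this page does not give the NRMF algorithm itself. The following is only a minimal NumPy sketch of the underlying idea, under stated assumptions: a 4-way kernel of shape (out_channels, in_channels, height, width) is unfolded along its output and input channel modes, the nuclear norm (sum of singular values) is measured, and singular-value soft-thresholding (the proximal operator of the nuclear norm) is used to read off a reduced rank. The helper names `unfold`, `nuclear_norm`, and `soft_threshold_rank`, the threshold `tau`, and the synthetic kernel are all hypothetical and not taken from the paper.

```python
import numpy as np

def unfold(kernel, mode):
    """Mode unfolding: move axis `mode` to the front and flatten the rest."""
    return np.moveaxis(kernel, mode, 0).reshape(kernel.shape[mode], -1)

def nuclear_norm(matrix):
    """Sum of singular values, a convex surrogate for matrix rank."""
    return np.linalg.svd(matrix, compute_uv=False).sum()

def soft_threshold_rank(matrix, tau):
    """Singular-value soft-thresholding (prox of tau * nuclear norm).

    Returns the shrunk matrix and the number of singular values that survive.
    """
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    rank = int(np.count_nonzero(s_shrunk))
    return (u * s_shrunk) @ vt, rank

rng = np.random.default_rng(0)

# Hypothetical kernel: 16 output channels, 8 input channels, 3x3 spatial.
# It is built with planted channel ranks (4 for output, 3 for input)
# plus small noise, purely for illustration.
core = rng.standard_normal((4, 3, 3, 3))
a_out = rng.standard_normal((16, 4))
a_in = rng.standard_normal((8, 3))
kernel = np.einsum("oa,ib,abhw->oihw", a_out, a_in, core)
kernel += 1e-3 * rng.standard_normal(kernel.shape)

out_unfold = unfold(kernel, 0)  # shape (16, 72)
in_unfold = unfold(kernel, 1)   # shape (8, 144)

_, rank_out = soft_threshold_rank(out_unfold, tau=0.5)
_, rank_in = soft_threshold_rank(in_unfold, tau=0.5)

print("nuclear norm (output mode):", nuclear_norm(out_unfold))
print("nuclear norm (input mode): ", nuclear_norm(in_unfold))
print("thresholded ranks (out, in):", rank_out, rank_in)
```

Because the noise singular values sit far below `tau`, the thresholded ranks in this toy setup should match the planted ranks of 4 and 3. In a training loop, one would presumably add such a nuclear-norm term to the loss so that ranks shrink during optimization, but how NRMF does this in detail is not stated here.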

Taxonomy

Topics: Tensor Decomposition and Applications · Advanced Neural Network Applications · Medical Image Segmentation Techniques