Exploiting Elasticity in Tensor Ranks for Compressing Neural Networks
Jie Ran, Rui Lin, Hayden K.H. So, Graziano Chesi, Ngai Wong

TL;DR
This paper introduces a tensor rank minimization method that exploits a new elasticity dimension along the input-output channels of convolutional kernels, compressing neural networks with a graceful tradeoff between model size and accuracy.
Contribution
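It proposes nuclear-norm rank minimization factorization (NRMF), an approach that dynamically and globally searches for reduced tensor ranks during training, reveals correlations between tensor ranks across layers, and compresses better than the earlier VBMF scheme.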
Findings
NRMF outperforms the previous non-elastic variational Bayesian matrix factorization (VBMF) scheme in neural network compression.
Exploiting elasticity along the input-output channels yields a graceful tradeoff between model size and accuracy.
Tensor rank correlations across layers are identified and utilized.
Abstract
Elasticities in depth, width, kernel size and resolution have been explored in compressing deep neural networks (DNNs). Recognizing that the kernels in a convolutional neural network (CNN) are 4-way tensors, we further exploit a new elasticity dimension along the input-output channels. Specifically, a novel nuclear-norm rank minimization factorization (NRMF) approach is proposed to dynamically and globally search for the reduced tensor ranks during training. Correlation between tensor ranks across multiple layers is revealed, and a graceful tradeoff between model size and accuracy is obtained. Experiments then show the superiority of NRMF over the previous non-elastic variational Bayesian matrix factorization (VBMF) scheme.
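The idea of using the nuclear norm as a rank surrogate on a 4-way kernel can be illustrated with a minimal NumPy sketch. This is not the authors' NRMF algorithm, whose details the abstract does not give. It assumes a kernel laid out as (output channels, input channels, height, width), unfolds it along the two channel modes, and applies singular value thresholding (the standard proximal step for the nuclear norm) to show how small singular values are driven to zero. The helper names, the synthetic kernel, and the threshold `tau` are all hypothetical choices made for illustration.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: move axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def nuclear_norm(matrix):
    """Sum of singular values, a convex surrogate for matrix rank."""
    return np.linalg.svd(matrix, compute_uv=False).sum()


def singular_value_threshold(matrix, tau):
    """Soft-threshold the singular values by tau (prox of tau * nuclear norm).

    Returns the shrunk matrix and the number of surviving singular values.
    """
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    return (u * s_shrunk) @ vt, int(np.count_nonzero(s_shrunk))


# Hypothetical kernel: 16 output channels, 8 input channels, 3x3 spatial size.
rng = np.random.default_rng(0)
c_out, c_in, k = 16, 8, 3

# Synthetic data (an assumption): the output-channel unfolding is rank 4 plus small noise.
low_rank = rng.standard_normal((c_out, 4)) @ rng.standard_normal((4, c_in * k * k))
noise = 0.01 * rng.standard_normal(low_rank.shape)
kernel = (low_rank + noise).reshape(c_out, c_in, k, k)

w_out = unfold(kernel, 0)  # output-channel unfolding, shape (16, 72)
w_in = unfold(kernel, 1)   # input-channel unfolding, shape (8, 144)

shrunk, rank_out = singular_value_threshold(w_out, tau=1.0)
print("nuclear norm before:", nuclear_norm(w_out))
print("nuclear norm after: ", nuclear_norm(shrunk))
print("surviving rank along output channels:", rank_out)
```

In a training loop, a penalty of this kind would typically be added to the task loss for each layer, so that the ranks along the input and output channel modes shrink while the network is being trained. How NRMF combines the penalty across layers and searches the ranks globally is described in the paper itself.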
Taxonomy
Topics: Tensor decomposition and applications · Advanced Neural Network Applications · Medical Image Segmentation Techniques
