Tensor Decomposition for Model Reduction in Neural Networks: A Review
Xingyi Liu, Keshab K. Parhi

TL;DR
This review discusses tensor decomposition techniques for reducing the size and computational cost of neural networks, highlighting their effectiveness in model compression and deployment on edge devices.
Contribution
The paper reviews six tensor decomposition methods and their application to several neural network architectures (CNNs, RNNs, and Transformers), emphasizing their benefits for model efficiency.
Findings
Tensor decompositions can significantly compress CNNs, RNNs, and Transformers.
Compressed models sometimes outperform original models in accuracy.
Tensor methods reduce model size, runtime, and energy consumption.
Abstract
Modern neural networks have revolutionized the fields of computer vision (CV) and natural language processing (NLP). They are widely used for complex CV and NLP tasks such as image classification, image generation, and machine translation. Most state-of-the-art neural networks are over-parameterized and incur a high computational cost. One straightforward solution is to replace the layers of a network with low-rank tensor approximations obtained through tensor decomposition. This paper reviews six tensor decomposition methods and illustrates their ability to compress the parameters of convolutional neural networks (CNNs), recurrent neural networks (RNNs), and Transformers. Some compressed models achieve higher accuracy than their original versions. Evaluations indicate that tensor decompositions can achieve significant reductions in model size,…
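To make the idea of replacing a layer with a low-rank approximation concrete, here is a minimal sketch using a truncated SVD on a single dense weight matrix. This is only the matrix (order-2) special case, chosen for brevity; it is not claimed to be one of the six methods the paper reviews. The layer sizes, the rank, the synthetic weight, and the helper name low_rank_factorize are all assumptions made for illustration.

```python
import numpy as np


def low_rank_factorize(weight, rank):
    """Split `weight` into factors (a, b) so that a @ b is its best
    rank-`rank` approximation in the Frobenius norm (truncated SVD).

    Hypothetical helper for illustration, not an API from the paper.
    """
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # shape (out_features, rank)
    b = vt[:rank, :]            # shape (rank, in_features)
    return a, b


rng = np.random.default_rng(0)
out_features, in_features, rank = 256, 512, 16  # assumed sizes

# Assumed synthetic weight: exactly rank-16 plus a little noise, so that a
# rank-16 approximation is accurate. Real trained weights may be less kind.
weight = rng.standard_normal((out_features, rank)) @ rng.standard_normal(
    (rank, in_features)
)
weight += 0.01 * rng.standard_normal(weight.shape)

a, b = low_rank_factorize(weight, rank)

# One dense layer y = W x becomes two smaller layers y = a (b x).
x = rng.standard_normal(in_features)
y_dense = weight @ x
y_factored = a @ (b @ x)

dense_params = weight.size            # 256 * 512 = 131072
factored_params = a.size + b.size     # 256 * 16 + 16 * 512 = 12288
compression_ratio = dense_params / factored_params
rel_error = np.linalg.norm(y_dense - y_factored) / np.linalg.norm(y_dense)

print(f"parameters: {dense_params} -> {factored_params}")
print(f"compression ratio: {compression_ratio:.2f}x")
print(f"relative output error: {rel_error:.4f}")
```

The factored form stores r(m + n) parameters instead of mn, so the saving grows as the rank r shrinks relative to the layer dimensions m and n. Higher-order tensor decompositions apply the same trade-off to reshaped weight tensors, for example convolution kernels.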
Taxonomy
Topics: Tensor decomposition and applications
