Ultimate tensorization: compressing convolutional and FC layers alike
Timur Garipov, Dmitry Podoprikhin, Alexander Novikov, Dmitry Vetrov

TL;DR
This paper introduces a tensorization method that compresses both convolutional and fully-connected layers of a neural network, reaching an 80x network compression rate with a 1.1% accuracy drop on CIFAR-10.
Contribution
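It extends a tensor factorization framework previously developed for fully-connected layers to convolutional layers: the convolutional kernel is reshaped into a higher-order tensor before being factorized, which compresses better than factorizing the 4-dimensional kernel directly.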
Findings
Achieved 80x network compression rate
Minimal 1.1% accuracy drop on CIFAR-10
Effective compression of both convolutional and fully-connected layers
Abstract
Convolutional neural networks excel in image recognition tasks, but this comes at the cost of high computational and memory complexity. To tackle this problem, [1] developed a tensor factorization framework to compress fully-connected layers. In this paper, we focus on compressing convolutional layers. We show that while the direct application of the tensor framework [1] to the 4-dimensional kernel of convolution does compress the layer, we can do better. We reshape the convolutional kernel into a tensor of higher order and factorize it. We combine the proposed approach with the previous work to compress both convolutional and fully-connected layers of a network and achieve 80x network compression rate with 1.1% accuracy drop on the CIFAR-10 dataset.
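The reshape-then-factorize idea in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation. It assumes a tensor-train (TT) style factorization computed by sequential truncated SVDs, a hypothetical 3x3 kernel with 64 input and 64 output channels, and a naive reshape that splits each channel dimension into 8 x 8; the paper's actual index arrangement and rank choices may differ. The helper names `tt_svd`, `tt_to_full`, and `num_params` are made up for this example.

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Split `tensor` into a chain of 3-way cores via truncated SVDs.

    Each core has shape (r_prev, n_k, r_next); ranks are capped at `max_rank`.
    """
    shape = tensor.shape
    cores = []
    rank = 1
    mat = tensor
    for n in shape[:-1]:
        # Unfold: rows index (previous rank, current mode), columns the rest.
        mat = mat.reshape(rank * n, -1)
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        new_rank = min(max_rank, len(s))
        cores.append(u[:, :new_rank].reshape(rank, n, new_rank))
        # Carry the remaining factor on to the next mode.
        mat = s[:new_rank, None] * vt[:new_rank]
        rank = new_rank
    cores.append(mat.reshape(rank, shape[-1], 1))
    return cores

def tt_to_full(cores):
    """Contract the cores back into a dense tensor."""
    full = cores[0]
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full.reshape([c.shape[1] for c in cores])

def num_params(cores):
    return sum(c.size for c in cores)

rng = np.random.default_rng(0)
# Hypothetical kernel: 3x3 spatial, 64 input channels, 64 output channels.
kernel = rng.standard_normal((3, 3, 64, 64))

# Option 1: factorize the 4-dimensional kernel directly.
cores_4d = tt_svd(kernel, max_rank=8)

# Option 2: reshape into a 6-dimensional tensor first (64 = 8 * 8), then factorize.
# This plain reshape is an assumption made for illustration only.
cores_6d = tt_svd(kernel.reshape(3, 3, 8, 8, 8, 8), max_rank=8)

print("dense parameters:     ", kernel.size)
print("4-d cores parameters: ", num_params(cores_4d))
print("6-d cores parameters: ", num_params(cores_6d))
print("compression rate (6-d):", kernel.size / num_params(cores_6d))
```

At the same rank cap, the higher-order reshaping stores fewer parameters because every core is indexed by a small mode (8) instead of a full channel dimension (64). The sketch only counts parameters; a random kernel is not low-rank, so it says nothing about the accuracy a trained network would retain.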
Taxonomy
Topics: Tensor decomposition and applications · Advanced Neural Network Applications · Sparse and Compressive Sensing Techniques
Methods: Convolution
