Tight Compression: Compressing CNN Through Fine-Grained Pruning and Weight Permutation for Efficient Implementation

Xizi Chen; Jingyang Zhu; Jingbo Jiang; Chi-Ying Tsui

arXiv:2104.01303·cs.LG·November 22, 2024·1 cites

TL;DR

This paper introduces a novel weight permutation scheme combined with fine-grained pruning to significantly enhance CNN compression, leading to improved hardware efficiency and reduced energy consumption.

Contribution

It proposes a new permutation-based compression method that exploits fine-grained sparsity, achieving higher compression rates and better hardware utilization than existing approaches.

Findings

01. Matrix compression rate improved from 5.88x to 14.13x

02. Throughput increased by 2.75 times

03. Energy efficiency improved by 1.86 times

Abstract

The unstructured sparsity after pruning poses a challenge to the efficient implementation of deep learning models in existing regular architectures like systolic arrays. On the other hand, coarse-grained structured pruning is suitable for implementation in regular architectures but tends to have higher accuracy loss than unstructured pruning when the pruned models are of the same size. In this work, we propose a model compression method based on a novel weight permutation scheme to fully exploit the fine-grained weight sparsity in the hardware design. Through permutation, the optimal arrangement of the weight matrix is obtained, and the sparse weight matrix is further compressed to a small and dense format to make full use of the hardware resources. Two pruning granularities are explored. In addition to the unstructured weight pruning, we also propose a more fine-grained subword-level…
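The idea of compressing a sparse weight matrix into a small, dense format can be illustrated with a toy sketch. This is not the authors' algorithm: the abstract is truncated and does not specify the permutation search, so the greedy grouping below, the helper names `pack_rows` and `unpack_rows`, and the example matrix are all assumptions made for illustration. The sketch merges rows whose non-zero columns do not overlap into a single dense row and keeps an index of where each weight came from.

```python
def pack_rows(matrix):
    """Greedily merge rows whose non-zero columns are disjoint (toy heuristic, assumed)."""
    groups = []  # each group: occupied columns and the original row indices merged into it
    for r, row in enumerate(matrix):
        support = {c for c, w in enumerate(row) if w != 0}
        for g in groups:
            if g["cols"].isdisjoint(support):
                g["cols"] |= support
                g["rows"].append(r)
                break
        else:
            groups.append({"cols": set(support), "rows": [r]})

    n_cols = len(matrix[0])
    packed, index = [], []
    for g in groups:
        vals = [0] * n_cols   # dense row of merged weights
        src = [-1] * n_cols   # original row of each weight, -1 if the slot is empty
        for r in g["rows"]:
            for c, w in enumerate(matrix[r]):
                if w != 0:
                    vals[c] = w
                    src[c] = r
        packed.append(vals)
        index.append(src)
    return packed, index


def unpack_rows(packed, index, n_rows):
    """Rebuild the original sparse matrix from the packed rows and the source index."""
    n_cols = len(packed[0])
    matrix = [[0] * n_cols for _ in range(n_rows)]
    for vals, src in zip(packed, index):
        for c, (w, r) in enumerate(zip(vals, src)):
            if r >= 0:
                matrix[r][c] = w
    return matrix


# Hypothetical 4x4 pruned weight matrix
W = [
    [1, 0, 0, 2],
    [0, 3, 0, 0],
    [0, 0, 4, 0],
    [5, 0, 0, 0],
]
packed, index = pack_rows(W)
print(packed)  # [[1, 3, 4, 2], [5, 0, 0, 0]]
print(len(W) / len(packed))  # 2.0 (row compression of this toy example)
```

In this toy case four sparse rows pack into two, and the source index is the overhead needed to route each weight back to its original row. A real design would search over permutations to minimize the packed size rather than relying on input order.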

Taxonomy

Topics: Polydiacetylene-based materials and applications · Plant Molecular Biology Research

Methods: Pruning