Network Pruning for Low-Rank Binary Indexing

Dongsoo Lee; Se Jung Kwon; Byeongwook Kim; Parichay Kapoor; Gu-Yeon Wei

arXiv:1905.05686 · cs.LG · May 15, 2019 · 5 cites

TL;DR

This paper introduces a network pruning method whose pruning mask factors into low-rank binary index matrices, reducing the memory needed for index data and improving compression of sparse DNN models while maintaining performance.

Contribution

It proposes a pruning technique that finds a fine-grained pruning mask decomposable into two binary matrices, together with a tile-based factorization that lowers memory requirements.

Findings

01. Achieves higher compression ratios than previous sparse formats.

02. Maintains pruning effectiveness with fewer indexes.

03. Reduces memory footprint for sparse neural network models.

Abstract

Pruning is an efficient model compression technique to remove redundancy in the connectivity of deep neural networks (DNNs). Computations using sparse matrices obtained by pruning parameters, however, exhibit vastly different parallelism depending on the index representation scheme. As a result, fine-grained pruning has not gained much attention due to its irregular index form leading to large memory footprint and low parallelism for convolutions and matrix multiplications. In this paper, we propose a new network pruning technique that generates a low-rank binary index matrix to compress index data while decompressing index data is performed by simple binary matrix multiplication. This proposed compression method finds a particular fine-grained pruning mask that can be decomposed into two binary matrices. We also propose a tile-based factorization technique that not only lowers memory…
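
The decompression step the abstract describes can be illustrated with a minimal sketch. It assumes that "binary matrix multiplication" means a Boolean product (an entry is kept if any rank component selects it), and all names here (`decompress_mask`, `index_bits`, `left`, `right`) are hypothetical, not taken from the paper. The factors are drawn at random only to show the shapes and the storage arithmetic; the paper's method for finding factors that suit a trained weight matrix is not reproduced.

```python
import numpy as np


def decompress_mask(left, right):
    """Rebuild an (m x n) pruning mask from binary factors (m x k) and (k x n).

    Assumption: the product is Boolean, i.e. entry (i, j) is kept when at
    least one of the k rank components selects both row i and column j.
    """
    counts = left.astype(np.int64) @ right.astype(np.int64)
    return counts > 0


def index_bits(m, n, k):
    """Bits needed for a dense 1-bit-per-weight mask vs. the two factors."""
    return m * n, k * (m + n)


# Tiny hand-checkable example with rank k = 2.
small_left = np.array([[1, 0], [0, 1], [1, 1]], dtype=bool)
small_right = np.array([[1, 0, 1], [0, 1, 0]], dtype=bool)
small_mask = decompress_mask(small_left, small_right)

# Larger illustration: random factors (an assumption for demonstration only).
rng = np.random.default_rng(0)
m, n, k = 64, 48, 4
left = rng.random((m, k)) < 0.3
right = rng.random((k, n)) < 0.3
mask = decompress_mask(left, right)

# Apply the mask to a weight matrix: weights outside the mask are pruned.
weights = rng.standard_normal((m, n))
pruned = weights * mask

dense_bits, factored_bits = index_bits(m, n, k)
print(f"dense index: {dense_bits} bits, factored index: {factored_bits} bits")
```

In this sketch the index cost drops from m·n bits to k·(m + n) bits, which is where a low rank k would pay off; how small k can be while preserving accuracy is a question for the paper, not something this example establishes.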

Taxonomy

Topics: Neural Networks and Applications · Advanced Graph Neural Networks · Tensor decomposition and applications

Methods: Pruning