Toward Compact Parameter Representations for Architecture-Agnostic Neural Network Compression

Yuezhou Sun; Wenlong Zhao; Lijun Zhang; Xiao Liu; Hui Guan; Matei Zaharia

arXiv:2111.10320·cs.CV·November 22, 2021

TL;DR

This paper proposes an architecture-agnostic method for neural network compression that shares parameter representations across layers and encodes them with additive quantization, achieving high compression ratios with minimal accuracy loss.

Contribution

It introduces a conceptually simple compression scheme that shares parameter representations across layers and consistently outperforms iterative unstructured pruning.

Findings

01

Achieves up to 15.3x compression with minimal accuracy loss.

02

Parameter representations can often be shared across network layers.

03

Outperforms iterative unstructured pruning in experiments.

Abstract

This paper investigates deep neural network (DNN) compression from the perspective of compactly representing and storing trained parameters. We explore the previously overlooked opportunity of cross-layer architecture-agnostic representation sharing for DNN parameters. To do this, we decouple feedforward parameters from DNN architectures and leverage additive quantization, an extreme lossy compression method invented for image descriptors, to compactly represent the parameters. The representations are then finetuned on task objectives to improve task accuracy. We conduct extensive experiments on MobileNet-v2, VGG-11, ResNet-50, Feature Pyramid Networks, and pruned DNNs trained for classification, detection, and segmentation tasks. The conceptually simple scheme consistently outperforms iterative unstructured pruning. Applied to ResNet-50 with 76.1% top-1 accuracy on the ILSVRC12…
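The idea in the abstract can be illustrated with a minimal sketch. It assumes that trained weights from all layers are flattened into a single stream, cut into fixed-size vectors, and that each vector is stored as a few codebook indices whose codewords are summed to reconstruct it (the core of additive quantization). Every name here (`flatten_layers`, `to_vectors`, `encode_greedy`, `decode`, `compression_ratio`) is hypothetical, the greedy encoder is a simplification swapped in for the stronger search procedures that additive quantization normally uses, and the storage estimate is a rough assumption rather than the paper's accounting. Codebook learning and the finetuning step are omitted.

```python
import math

def flatten_layers(layers):
    """Concatenate per-layer weight lists into one architecture-agnostic stream."""
    return [w for layer in layers for w in layer]

def to_vectors(stream, dim):
    """Split the stream into dim-sized vectors, zero-padding the tail."""
    padded = stream + [0.0] * (-len(stream) % dim)
    return [padded[i:i + dim] for i in range(0, len(padded), dim)]

def decode(codes, codebooks):
    """Reconstruct a vector as the sum of one codeword from each codebook."""
    out = [0.0] * len(codebooks[0][0])
    for book, idx in zip(codebooks, codes):
        for j, value in enumerate(book[idx]):
            out[j] += value
    return out

def encode_greedy(vec, codebooks):
    """Greedy residual encoding (an assumption for brevity, not the paper's encoder)."""
    residual = list(vec)
    codes = []
    for book in codebooks:
        # Pick the codeword closest to what is still unexplained.
        best = min(
            range(len(book)),
            key=lambda k: sum((r - c) ** 2 for r, c in zip(residual, book[k])),
        )
        codes.append(best)
        residual = [r - c for r, c in zip(residual, book[best])]
    return codes

def compression_ratio(n_params, dim, n_books, book_size, float_bits=32):
    """Rough storage estimate: index bits plus the shared codebooks themselves."""
    n_vectors = math.ceil(n_params / dim)
    index_bits = n_vectors * n_books * math.ceil(math.log2(book_size))
    codebook_bits = n_books * book_size * dim * float_bits
    return (n_params * float_bits) / (index_bits + codebook_bits)

# Toy usage with hand-written codebooks (illustrative values only).
layers = [[0.5, -0.5], [1.0, 0.5, 0.5]]          # two "layers" of weights
codebooks = [
    [[1.0, 0.0], [0.0, 1.0]],                    # codebook 1
    [[0.5, 0.5], [-0.5, -0.5]],                  # codebook 2
]
vectors = to_vectors(flatten_layers(layers), dim=2)
codes = [encode_greedy(v, codebooks) for v in vectors]
reconstructed = [decode(c, codebooks) for c in codes]
```

Because the same codebooks serve every vector regardless of which layer it came from, their storage cost is amortized over the whole network, which is why the index bits dominate for large models in this estimate.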

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques