Network Memory Footprint Compression Through Jointly Learnable Codebooks and Mappings

Edouard Yvinec; Arnaud Dapogny; Kevin Bailly

arXiv:2309.17361 · cs.CV · October 2, 2023

TL;DR

This paper introduces JLCM, a neural network compression method that jointly learns codebooks and mappings, reducing the memory footprint enough for models to be loaded on mobile devices.

Contribution

It proposes a joint learning approach for codebooks and mappings, addressing limitations of existing quantization methods and enabling efficient DNN compression.

Findings

01 Llama 7B compressed to 2 GB

02 Achieves efficient approximation of DNNs

03 Enables deployment on old smartphones
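To put the first finding in perspective, the short calculation below estimates the average storage budget per weight that a 2 GB footprint implies. It is only back-of-the-envelope arithmetic under stated assumptions: the parameter count is rounded to 7 billion, 1 GB is taken as 10^9 bytes, and codebook overhead is ignored. The names `NUM_PARAMS`, `TARGET_BYTES`, and `bits_per_weight` are illustrative and do not come from the paper.

```python
def bits_per_weight(target_bytes, num_params):
    """Average number of bits available per parameter for a given byte budget."""
    return target_bytes * 8 / num_params

# Assumption: "7B" rounded to exactly 7e9 parameters.
NUM_PARAMS = 7_000_000_000
# Assumption: 1 GB = 10**9 bytes, codebook storage not counted.
TARGET_BYTES = 2 * 10**9

budget = bits_per_weight(TARGET_BYTES, NUM_PARAMS)

# For comparison: the same parameters stored densely at 16 bits each.
fp16_bytes = NUM_PARAMS * 2

print(f"budget per weight: {budget:.2f} bits")
print(f"16-bit dense storage: {fp16_bytes / 10**9:.0f} GB")
```

Under these assumptions the budget is a little over 2 bits per weight, against 14 GB for 16-bit dense storage, which is why codebook-based formats are attractive at this scale.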

Abstract

The massive interest in deep neural networks (DNNs) for both computer vision and natural language processing has been sparked by the growth in computational power. However, this has led to an increase in memory footprint, to the point where it can be challenging simply to load a model on commodity devices such as mobile phones. To address this limitation, quantization is a favored solution, as it maps high-precision tensors to a low-precision, memory-efficient format. In terms of memory footprint reduction, its most effective variants are based on codebooks. These methods, however, suffer from two limitations. First, they either define a single codebook for each tensor, or use a memory-expensive mapping to multiple codebooks. Second, gradient descent optimization of the mapping favors jumps toward extreme values, hence not defining a proximal search. In this work, we propose to address…
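For readers unfamiliar with codebook-based quantization, the sketch below illustrates the baseline setting the abstract describes: a single codebook per tensor plus an index mapping from each weight to a codebook entry. It is not the JLCM method. The codebook here is chosen from quantiles of the weights rather than learned by gradient descent, and every name (`build_codebook`, `assign`, `dequantize`, `footprint_bits`) as well as the 16-bit codebook value width is an assumption made for illustration.

```python
import math
import random

def build_codebook(weights, k):
    """Pick k codebook values as evenly spaced quantiles of the sorted weights.

    Assumption: a simple quantile heuristic, not a learned codebook.
    """
    ordered = sorted(weights)
    n = len(ordered)
    return [ordered[min(n - 1, (2 * i + 1) * n // (2 * k))] for i in range(k)]

def assign(weights, codebook):
    """Mapping: index of the nearest codebook value for each weight."""
    return [
        min(range(len(codebook)), key=lambda j: abs(w - codebook[j]))
        for w in weights
    ]

def dequantize(indices, codebook):
    """Reconstruct an approximate tensor from the mapping and the codebook."""
    return [codebook[i] for i in indices]

def footprint_bits(num_weights, k, value_bits=16):
    """Storage cost: one index per weight plus the codebook itself."""
    index_bits = math.ceil(math.log2(k))
    return num_weights * index_bits + k * value_bits

random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(1024)]  # toy "tensor"

codebook = build_codebook(weights, k=16)
indices = assign(weights, codebook)
restored = dequantize(indices, codebook)

compressed_bits = footprint_bits(len(weights), k=16)
dense_bits = len(weights) * 16
print(compressed_bits, dense_bits)
```

With 16 codebook entries each weight needs only a 4-bit index, so the mapping dominates the storage cost once the tensor is large. Using several codebooks per tensor adds the cost of recording which codebook each weight uses, which is the memory-expensive mapping the abstract refers to.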

Videos

Network Memory Footprint Compression Through Jointly Learnable Codebooks and Mappings · SlidesLive

Taxonomy

Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Multimodal Machine Learning Applications