BitTrain: Sparse Bitmap Compression for Memory-Efficient Training on the Edge

Abdelrahman Hosny; Marina Neseem; Sherief Reda

arXiv:2110.15362 · cs.LG · November 1, 2021 · 1 citation

TL;DR

BitTrain introduces a bitmap compression technique that exploits activation sparsity to reduce the memory footprint of training on edge devices, by up to 56% when combined with pruning.

Contribution

The paper proposes BitTrain, a new method that compresses activation memory during training using bitmap compression, improving memory efficiency without sacrificing accuracy.

Findings

01. Up to 34% reduction in memory footprint at 50% sparsity.

02. Over 70% sparsity achieved with further pruning, reducing memory by up to 56%.

03. Seamless integration with modern deep learning frameworks.

Abstract

Training on the Edge enables neural networks to learn continuously from new data after deployment on memory-constrained edge devices. Previous work is mostly concerned with reducing the number of model parameters, which is only beneficial for inference. However, the memory footprint of activations is the main bottleneck for training on the edge. Existing incremental training methods fine-tune only the last few layers, sacrificing the accuracy gains of re-training the whole model. In this work, we investigate the memory footprint of training deep learning models and use our observations to propose BitTrain. In BitTrain, we exploit activation sparsity and propose a novel bitmap compression technique that reduces the memory footprint during training. We save the activations in our proposed bitmap compression format during the forward pass of training and restore them during the backward pass…
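
The save-then-restore idea in the abstract can be illustrated with a small sketch. This is not the code from scale-lab/bittrain, and the exact storage format used by the paper is not given on this page. It assumes a generic bitmap scheme (one presence bit per element plus a dense array of the non-zero values), uses NumPy rather than PyTorch, and the function names `bitmap_compress`, `bitmap_decompress`, and `compressed_nbytes` are hypothetical.

```python
import numpy as np

# Hypothetical helpers, not the BitTrain API: a generic bitmap scheme that
# stores one presence bit per element plus the non-zero values only.

def bitmap_compress(activations):
    """Pack a dense array into (bitmap, non-zero values, shape, dtype)."""
    flat = np.ascontiguousarray(activations).ravel()
    mask = flat != 0                    # True where a value must be kept
    bitmap = np.packbits(mask)          # 1 bit per element, padded to whole bytes
    values = flat[mask]                 # non-zero entries in their original order
    return bitmap, values, activations.shape, activations.dtype

def bitmap_decompress(bitmap, values, shape, dtype):
    """Rebuild the dense array from the bitmap and the non-zero values."""
    n = int(np.prod(shape))
    mask = np.unpackbits(bitmap, count=n).astype(bool)  # drop the padding bits
    flat = np.zeros(n, dtype=dtype)
    flat[mask] = values
    return flat.reshape(shape)

def compressed_nbytes(bitmap, values):
    """Bytes held by the compressed form (bitmap plus non-zero values)."""
    return bitmap.nbytes + values.nbytes

# "Forward pass": a sparse activation (e.g. the output of a ReLU) is stored
# in compressed form instead of as a dense tensor.
act = np.array([[0.0, 1.5, 0.0, 0.0],
                [2.0, 0.0, 0.0, 3.0]], dtype=np.float32)
packed = bitmap_compress(act)

# "Backward pass": the dense activation is restored before it is needed.
restored = bitmap_decompress(*packed)
assert np.array_equal(act, restored)
```

Under this assumed layout, a float32 tensor with n elements and a fraction s of zeros takes roughly n/8 + 4n(1 - s) bytes instead of 4n, so the saving grows with sparsity. This is a per-tensor estimate only; the percentages listed under Findings refer to the paper's own measurements and are not derived from this sketch.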

Code & Models

Repositories

scale-lab/bittrain (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning

Methods: Pruning