Dynamic ConvNets on Tiny Devices via Nested Sparsity
Matteo Grimaldi, Luca Mocerino, Antonio Cipolletta, Andrea Calimera

TL;DR
This paper presents Nested Sparse ConvNets, a novel dynamic neural network architecture for resource-constrained edge devices, offering a flexible accuracy-latency trade-off with efficient training, compression, and inference techniques.
Contribution
The paper introduces a new training and compression pipeline for Nested Sparse ConvNets, enabling dynamic accuracy-latency trade-offs on tiny devices with a reduced storage footprint and improved performance.
Findings
Nested Sparse ConvNets outperform naive sparse models in both accuracy and storage footprint.
They achieve comparable accuracy with significant storage savings.
They are Pareto-optimal in the accuracy vs. latency trade-off compared to other dynamic methods.
Abstract
This work introduces a new training and compression pipeline to build Nested Sparse ConvNets, a class of dynamic Convolutional Neural Networks (ConvNets) suited for inference tasks deployed on resource-constrained devices at the edge of the Internet-of-Things. A Nested Sparse ConvNet consists of a single ConvNet architecture containing N sparse sub-networks with nested weight subsets, like a Matryoshka doll, and can trade accuracy for latency at run time, using the model sparsity as a dynamic knob. To attain high accuracy at training time, we propose a gradient masking technique that optimally routes the learning signals across the nested weight subsets. To minimize the storage footprint and efficiently process the obtained models at inference time, we introduce a new sparse matrix compression format with dedicated compute kernels that fruitfully exploit the characteristic of the…
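The nesting idea in the abstract can be made concrete with a small sketch. This is not the authors' pipeline: the magnitude-based ranking, the function names (`nested_masks`, `masked_sgd_step`, `sparse_dot`), and the simple rule "only weights inside the active mask receive gradient" are all assumptions chosen for illustration, since the summary does not specify the pruning criterion or the exact gradient routing. The sketch only shows how masks at increasing sparsity can share weights like a Matryoshka doll, and how a run-time sparsity level selects one of them.

```python
def nested_masks(weights, sparsities):
    """Build one 0/1 mask per sparsity level (hypothetical helper).

    Masks are returned from densest to sparsest. Every mask keeps a prefix
    of the same magnitude ranking, so each sparser mask is a subset of
    every denser one.
    """
    n = len(weights)
    # Assumption: rank weights by decreasing magnitude (ties keep index order).
    order = sorted(range(n), key=lambda i: -abs(weights[i]))
    masks = []
    for s in sorted(sparsities):
        keep = n - int(s * n)          # number of weights left non-zero
        kept = set(order[:keep])
        masks.append([1 if i in kept else 0 for i in range(n)])
    return masks


def masked_sgd_step(weights, grads, mask, lr):
    """One SGD step in which weights outside the mask get no update.

    Assumption: this is a simplified stand-in for gradient masking.
    """
    return [w - lr * g * m for w, g, m in zip(weights, grads, mask)]


def sparse_dot(weights, mask, x):
    """Dot product that uses only the weights selected by the mask."""
    return sum(w * xi for w, m, xi in zip(weights, mask, x) if m)


# Toy example: one 8-weight filter and three nested sparsity levels.
weights = [0.9, -0.1, 0.4, -0.7, 0.05, 0.3, -0.2, 0.6]
masks = nested_masks(weights, [0.0, 0.5, 0.75])

x = [1.0] * len(weights)
for level, mask in enumerate(masks):
    # A higher level means a sparser sub-network and fewer multiply-adds.
    print(level, sum(mask), sparse_dot(weights, mask, x))
```

At inference time, choosing `masks[level]` plays the role of the dynamic knob: the same stored weights serve every sub-network, and a sparser level simply skips more of them.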
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and ELM
Methods: Pruning
