Dynamic ConvNets on Tiny Devices via Nested Sparsity

Matteo Grimaldi; Luca Mocerino; Antonio Cipolletta; Andrea Calimera

arXiv:2203.03324·cs.LG·March 8, 2022

Open Access

TL;DR

This paper presents Nested Sparse ConvNets, a novel dynamic neural network architecture for resource-constrained edge devices, offering a flexible accuracy-latency trade-off with efficient training, compression, and inference techniques.

Contribution

The paper introduces a new training and compression pipeline for Nested Sparse ConvNets, enabling dynamic accuracy-latency trade-offs on tiny devices with a smaller storage footprint and better performance.

Findings

01. Outperform naive sparse models in accuracy and storage.

02. Achieve comparable accuracy with significant storage savings.

03. Pareto optimal in accuracy vs. latency compared to other dynamic methods.

Abstract

This work introduces a new training and compression pipeline to build Nested Sparse ConvNets, a class of dynamic Convolutional Neural Networks (ConvNets) suited for inference tasks deployed on resource-constrained devices at the edge of the Internet-of-Things. A Nested Sparse ConvNet consists of a single ConvNet architecture containing N sparse sub-networks with nested weight subsets, like a Matryoshka doll, and can trade accuracy for latency at run time, using the model sparsity as a dynamic knob. To attain high accuracy at training time, we propose a gradient masking technique that optimally routes the learning signals across the nested weight subsets. To minimize the storage footprint and efficiently process the obtained models at inference time, we introduce a new sparse matrix compression format with dedicated compute kernels that fruitfully exploit the characteristic of the…
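The nesting idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' pipeline: the magnitude-based ranking, the function names `nested_masks` and `masked_gradient`, and the sparsity levels are all assumptions chosen for illustration, and the paper's actual gradient-routing rule and compression format are not reproduced here. The sketch only shows how masks taken as prefixes of a single weight ranking nest like Matryoshka dolls, and how a mask can restrict a gradient update to one subset.

```python
import numpy as np


def nested_masks(weights, sparsities):
    """Return one boolean mask per sparsity level, sparsest first.

    Assumption: weights are ranked by magnitude, a common pruning
    criterion that is not confirmed as the paper's choice.
    """
    flat = np.abs(weights).ravel()
    order = np.argsort(-flat, kind="stable")  # indices, largest magnitude first
    masks = []
    for s in sorted(sparsities, reverse=True):
        keep = int(round((1.0 - s) * flat.size))  # number of weights kept
        mask = np.zeros(flat.size, dtype=bool)
        mask[order[:keep]] = True  # a prefix of one ranking, so masks nest
        masks.append(mask.reshape(weights.shape))
    return masks


def masked_gradient(grad, mask):
    """Zero the gradient outside `mask` so only that subset is updated."""
    return np.where(mask, grad, 0.0)


rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4))
masks = nested_masks(w, sparsities=[0.5, 0.75, 0.9])

# Run-time knob: choosing a mask selects a sub-network.
w_sparsest = w * masks[0]

# Training-time sketch: restrict an update to the sparsest subset.
g = masked_gradient(np.ones_like(w), masks[0])
```

Because every mask is a prefix of the same ranking, each sparser sub-network is contained in every denser one, so a single weight tensor can serve all sparsity levels.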

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and ELM

Methods: Pruning