DNNShifter: An Efficient DNN Pruning System for Edge Computing
Bailey J. Eccles, Philip Rodgers, Peter Kilpatrick, Ivor Spence, Blesson Varghese

TL;DR
DNNShifter is an efficient system for rapidly pruning deep neural networks into lightweight variants suitable for edge devices, enabling quick model switching with minimal overhead while maintaining high accuracy.
Contribution
It introduces a novel structured pruning methodology that produces smaller, faster models with near-original accuracy, and enables swift model switching for resource-constrained edge environments.
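To make the idea of structured pruning concrete, here is a minimal sketch in plain Python. It is a generic illustration, not DNNShifter's actual procedure: the two-layer network without biases, the L1-norm ranking criterion, and the names `prune_neurons` and `forward` are all assumptions made for this example. It shows that removing a whole hidden unit (a row in one weight matrix and the matching column in the next) yields genuinely smaller dense matrices, and that the output is unchanged when the removed unit's weights were already zero.

```python
def l1_norm(row):
    """Sum of absolute weights feeding one hidden unit."""
    return sum(abs(w) for w in row)


def prune_neurons(w1, w2, keep):
    """Structured pruning of the hidden layer of a two-layer network.

    w1: hidden x input weights (list of rows).
    w2: output x hidden weights (list of rows).
    keep: number of hidden units to retain (hypothetical parameter).
    Units are ranked by the L1 norm of their incoming weights, an assumed
    criterion chosen for illustration only.
    """
    ranked = sorted(range(len(w1)), key=lambda i: l1_norm(w1[i]), reverse=True)
    kept = sorted(ranked[:keep])  # preserve the original unit order
    new_w1 = [w1[i] for i in kept]  # drop rows of removed units
    new_w2 = [[row[i] for i in kept] for row in w2]  # drop matching columns
    return new_w1, new_w2, kept


def forward(w1, w2, x):
    """Forward pass with ReLU on the hidden layer and no biases."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


# Toy weights: the second hidden unit is entirely zero, so it is prunable.
w1 = [[0.5, -1.0], [0.0, 0.0], [2.0, 1.0]]
w2 = [[1.0, 3.0, -1.0]]
x = [1.0, 2.0]

small_w1, small_w2, kept = prune_neurons(w1, w2, keep=2)
original_out = forward(w1, w2, x)
pruned_out = forward(small_w1, small_w2, x)
```

In this toy case the pruned network has two hidden units instead of three and produces the same output as the original, because the removed unit contributed nothing. Unlike masking individual weights to zero, the smaller matrices need no sparse storage format or sparse kernels to realize the size and speed benefit.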
Findings
Pruned models are generated up to 93x faster than with traditional methods.
Pruned models are up to 5.14x smaller and 1.67x faster in inference.
Model switching overhead is reduced by up to 11.9x.
Abstract
Deep neural networks (DNNs) underpin many machine learning applications. Production-quality DNN models achieve high inference accuracy by training millions of DNN parameters, which has a significant resource footprint. This presents a challenge for resources operating at the extreme edge of the network, such as mobile and embedded devices that have limited computational and memory resources. To address this, models are pruned to create lightweight variants that are more suitable for these devices. Existing pruning methods cannot produce models of similar quality to their unpruned counterparts without significant time costs and overheads, or are limited to offline use cases. Our work rapidly derives suitable model variants while maintaining the accuracy of the original model. The model variants can be swapped quickly when system and network conditions change to match workload demand.…
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Memory and Neural Computing · Ferroelectric and Negative Capacitance Devices
Methods: Pruning
