DTMM: Deploying TinyML Models on Extremely Weak IoT Devices with Pruning
Lixiang Han, Zhen Xiao, Zhenjiang Li

TL;DR
DTMM is a library for deploying pruned TinyML models on extremely weak IoT devices such as microcontrollers, targeting both deep model compression and efficient execution after pruning.
Contribution
The paper introduces DTMM, a system that combines pruning and optimization techniques tailored to deploying TinyML models on low-end IoT devices. It addresses the gap that existing solutions achieve either deep compression or efficient execution after pruning, but not both.
Findings
Significant reduction in model size with minimal accuracy loss.
Improved runtime performance on microcontroller units.
Compatibility with commercial ML frameworks.
Abstract
DTMM is a library designed for efficient deployment and execution of machine learning models on weak IoT devices such as microcontroller units (MCUs). The motivation for designing DTMM comes from the emerging field of tiny machine learning (TinyML), which explores extending the reach of machine learning to many low-end IoT devices to achieve ubiquitous intelligence. Due to the weak capability of embedded devices, it is necessary to compress models by pruning enough weights before deploying. Although pruning has been studied extensively on many computing platforms, two key issues with pruning methods are exacerbated on MCUs: models need to be deeply compressed without significantly compromising accuracy, and they should perform efficiently after pruning. Current solutions only achieve one of these objectives, but not both. In this paper, we find that pruned models have great potential…
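To make the compression step concrete, the following is a minimal sketch of generic magnitude-based weight pruning followed by a simple sparse encoding. It is not DTMM's method, which the truncated abstract above does not describe; the function names `magnitude_prune` and `to_sparse`, the example weights, and the 50% sparsity target are all illustrative assumptions.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights.

    Generic unstructured magnitude pruning, shown for illustration only;
    this is NOT the pruning scheme used by DTMM.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices sorted by ascending absolute value; the first n_prune are dropped.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:n_prune])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def to_sparse(weights):
    """Keep only nonzero weights as (index, value) pairs (hypothetical format)."""
    return [(i, w) for i, w in enumerate(weights) if w != 0.0]


# Hypothetical layer weights, pruned to 50% sparsity.
weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02, 0.1, -0.9]
pruned = magnitude_prune(weights, 0.5)
sparse = to_sparse(pruned)

print(pruned)
print(sparse)
```

In this sketch each surviving weight carries an explicit index, so storage only shrinks when the sparsity is high enough to offset that overhead, and the irregular access pattern adds work at inference time. This illustrates in miniature the tension the abstract describes between compressing deeply and running efficiently after pruning.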
Taxonomy
Topics: Advanced Neural Network Applications · IoT and Edge/Fog Computing · Machine Learning and Data Classification
Methods: Lib · Pruning
