UDC: Unified DNAS for Compressible TinyML Models
Igor Fedorov, Ramon Matas, Hokchhay Tann, Chuteng Zhou, Matthew Mattina, Paul Whatmough

TL;DR
This paper introduces UDC, a neural architecture search method that generates highly compressible neural networks optimized for TinyML deployment on low-memory IoT devices, achieving state-of-the-art results.
Contribution
UDC explores a large search space to design neural networks that are both highly compressible and accurate for NPU deployment, advancing TinyML model efficiency.
Findings
UDC networks are up to 3.35x smaller at the same accuracy.
UDC achieves 6.25% higher accuracy at the same model size.
State-of-the-art results on ImageNet with compressed models.
Abstract
Deploying TinyML models on low-cost IoT hardware is very challenging due to limited device memory capacity. Neural processing unit (NPU) hardware addresses the memory challenge by using model compression to exploit weight quantization and sparsity to fit more parameters in the same footprint. However, designing compressible neural networks (NNs) is challenging, as it expands the design space across which we must make balanced trade-offs. This paper demonstrates Unified DNAS for Compressible (UDC) NNs, which explores a large search space to generate state-of-the-art compressible NNs for NPU. ImageNet results show UDC networks are up to 3.35x smaller (iso-accuracy) or 6.25% more accurate (iso-model size) than previous work.
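To make the footprint argument concrete, here is a minimal sketch of how quantization and sparsity together shrink stored weights. The storage model is an assumption for illustration, not the paper's: nonzero weights are stored at a fixed bit width, and a one-bit-per-weight mask records which positions are nonzero. The function name `compressed_size_bytes` and its parameters are hypothetical.

```python
def compressed_size_bytes(num_params, bits, sparsity, mask_bits_per_weight=1):
    """Estimate weight storage under an assumed bitmask-sparse format.

    num_params: total number of weights (zero and nonzero)
    bits: bit width of each stored nonzero weight
    sparsity: fraction of weights that are exactly zero, in [0, 1]
    mask_bits_per_weight: overhead bits per weight for the nonzero mask
                          (an assumption; real NPU formats differ)
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    nonzero = round(num_params * (1.0 - sparsity))
    total_bits = nonzero * bits + num_params * mask_bits_per_weight
    return total_bits / 8


# Dense 8-bit baseline: no zeros, so no mask is needed.
dense = compressed_size_bytes(1_000_000, bits=8, sparsity=0.0,
                              mask_bits_per_weight=0)

# Same parameter count at 4 bits with half the weights pruned.
compressed = compressed_size_bytes(1_000_000, bits=4, sparsity=0.5)

print(dense, compressed, dense / compressed)
```

Under these assumptions, the one-million-parameter model drops from 1,000,000 bytes to 375,000 bytes, so a fixed memory budget can hold a correspondingly larger network. The design difficulty the abstract points to is that bit width, sparsity, and architecture all interact with accuracy, so they have to be chosen jointly rather than one at a time.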
Taxonomy
Topics: Advanced Neural Network Applications · Machine Learning and Algorithms · Machine Learning and ELM
Methods: Differentiable Neural Architecture Search · Gumbel Softmax
