InstMeter: An Instruction-Level Method to Predict Energy and Latency of DL Model Inference on MCUs
Hao Liu, Qing Wang, Marco Zuniga

TL;DR
InstMeter is a linear, instruction-level predictor that estimates the energy and latency of deep learning inference on microcontrollers. It reduces prediction error by 3x for energy and 6.5x for latency relative to existing proxies, while requiring 100x less data.
Contribution
We introduce InstMeter, a predictor that uses MCU clock cycles to estimate the energy and latency of DL model inference, with lower prediction error and smaller data requirements than prior methods.
Findings
InstMeter reduces prediction errors by 3x for energy and 6.5x for latency.
It requires 100x less data than existing proxies.
It can guide NAS toward energy-efficient DL models.
Abstract
Deep learning (DL) models can now run on microcontrollers (MCUs). Through neural architecture search (NAS), we can search for DL models that meet the constraints of MCUs. Among these constraints, the energy and latency costs of model inference are critical metrics. To predict them, existing research relies on coarse proxies such as multiply-accumulate operations (MACs) and the model's input parameters, which often results in inaccurate predictions or requires extensive data collection. In this paper, we propose InstMeter, a predictor that leverages MCUs' clock cycles to accurately estimate the energy and latency of DL models. Clock cycles are fundamental metrics that reflect MCU operations and directly determine energy and latency costs. Furthermore, a unique property of our predictor is its strong linearity, which allows it to be both simple and accurate. We thoroughly evaluate InstMeter under different scenarios,…
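The linearity property described in the abstract can be illustrated with a small sketch. The Python below is not the authors' implementation: it only fits an ordinary least-squares line from clock-cycle counts to energy, and derives latency from the cycle count under the assumption of a fixed clock frequency. The clock frequency, cycle counts, and energy values are synthetic and chosen purely for illustration, and the names `fit_line`, `predict`, `latency_seconds`, and `CLOCK_HZ` are hypothetical.

```python
def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(cycles, slope, intercept):
    """Evaluate the fitted line at a given clock-cycle count."""
    return slope * cycles + intercept


def latency_seconds(cycles, clock_hz):
    """Latency implied by a cycle count, assuming a fixed clock frequency."""
    return cycles / clock_hz


# Assumption: a hypothetical MCU clocked at 80 MHz.
CLOCK_HZ = 80_000_000

# Synthetic data (not measurements from the paper): cycle counts of four
# hypothetical models and their made-up inference energy in millijoules.
train_cycles = [1_000_000, 2_000_000, 4_000_000, 8_000_000]
train_energy_mj = [0.06, 0.11, 0.21, 0.41]

slope, intercept = fit_line(train_cycles, train_energy_mj)

# Predict energy and latency for an unseen model with 3 million cycles.
new_cycles = 3_000_000
energy_estimate_mj = predict(new_cycles, slope, intercept)
latency_estimate_s = latency_seconds(new_cycles, CLOCK_HZ)

print(f"energy ~ {energy_estimate_mj:.3f} mJ, latency ~ {latency_estimate_s:.4f} s")
```

A two-parameter fit like this needs only a handful of measured models, which is one intuition for why a strongly linear predictor could get by with little training data. How InstMeter obtains cycle counts at the instruction level is described in the paper itself and is not reproduced here.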
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Green IT and Sustainability · Advanced Neural Network Applications
