Benchmarking Energy and Latency in TinyML: A Novel Method for Resource-Constrained AI

Pietro Bartoli; Christian Veronesi; Andrea Giudici; David Siorpaes; Diana Trojaniello; Franco Zappa

arXiv:2505.15622 · cs.LG · December 1, 2025

TL;DR

This paper presents a comprehensive benchmarking method for TinyML devices that measures energy and latency across different execution phases, enabling better comparison and optimization of resource-constrained AI systems.

Contribution

It introduces a novel benchmarking approach that separately evaluates energy and latency during pre-inference, inference, and post-inference phases, with automated, statistically robust testing on MCU hardware.
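
As a rough illustration of what per-phase evaluation involves, the sketch below integrates a sampled power trace over three phase windows to get energy and latency for each. It is not the authors' implementation: the function name `phase_metrics`, the `PhaseMetrics` type, and the `phase_bounds` dictionary are hypothetical, and the phase start and end times are assumed to be known already (for example from a marker signal captured alongside the trace).

```python
from dataclasses import dataclass


@dataclass
class PhaseMetrics:
    latency_s: float  # phase duration in seconds
    energy_j: float   # energy consumed during the phase in joules


def phase_metrics(times_s, power_w, phase_bounds):
    """Integrate a sampled power trace over each phase window.

    times_s      -- strictly increasing sample timestamps (seconds)
    power_w      -- power samples in watts, same length as times_s
    phase_bounds -- hypothetical dict: phase name -> (start_s, end_s)
    """
    results = {}
    for name, (start, end) in phase_bounds.items():
        energy = 0.0
        for i in range(len(times_s) - 1):
            t0, t1 = times_s[i], times_s[i + 1]
            p0, p1 = power_w[i], power_w[i + 1]
            # Clip the sample interval to the phase window.
            lo, hi = max(t0, start), min(t1, end)
            if hi <= lo:
                continue
            # Linearly interpolate power at the clipped endpoints.
            p_lo = p0 + (p1 - p0) * (lo - t0) / (t1 - t0)
            p_hi = p0 + (p1 - p0) * (hi - t0) / (t1 - t0)
            # Trapezoidal rule on the clipped segment.
            energy += 0.5 * (p_lo + p_hi) * (hi - lo)
        results[name] = PhaseMetrics(latency_s=end - start, energy_j=energy)
    return results
```

Calling `phase_metrics(times, power, {"pre": (t0, t1), "inference": (t1, t2), "post": (t2, t3)})` returns one `PhaseMetrics` per phase. Clipping each sample interval to the window keeps the result correct when a phase boundary falls between two samples.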

Findings

01

Reducing core voltage and clock frequency improves pre- and post-processing efficiency.

02

The methodology allows cross-platform comparison of TinyML inference devices.

03

Automated testing with 1000 runs per model supports statistically significant results.
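
To make the point about repeated runs concrete, here is a minimal sketch of how per-run measurements might be summarized. The function name `summarize_runs` is hypothetical, and the 95% interval uses a normal approximation (z = 1.96), which is an assumption of this sketch and reasonable only for large run counts such as 1000. The page does not state which statistics the authors report.

```python
import math
import statistics


def summarize_runs(samples, z=1.96):
    """Summarize repeated measurements (e.g. per-run energy in joules).

    Returns the mean, sample standard deviation, and a normal-approximation
    confidence interval for the mean (z=1.96 corresponds to about 95%).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two runs to estimate spread")
    mean = statistics.mean(samples)
    std = statistics.stdev(samples)      # sample standard deviation (n - 1)
    half_width = z * std / math.sqrt(n)  # shrinks as 1 / sqrt(n)
    return {
        "n": n,
        "mean": mean,
        "std": std,
        "ci_low": mean - half_width,
        "ci_high": mean + half_width,
    }
```

Because the interval half-width scales with 1/sqrt(n), going from 10 runs to 1000 runs narrows it by a factor of 10 for the same run-to-run spread.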

Abstract

The rise of IoT has increased the need for on-edge machine learning, with TinyML emerging as a promising solution for resource-constrained devices such as microcontrollers (MCUs). However, evaluating their performance remains challenging due to diverse architectures and application scenarios, and current solutions have significant limitations. This work introduces an alternative benchmarking methodology that integrates energy and latency measurements while distinguishing three execution phases: pre-inference, inference, and post-inference. Additionally, the setup ensures that the device operates without being powered by an external measurement unit, while automated testing can be leveraged to enhance statistical significance. To evaluate our setup, we tested the STM32N6 MCU, which includes an NPU for executing neural networks. Two configurations were considered: high-performance and low-power. The…
