Optimizing Split Learning Latency in TinyML-Based IoT Systems
Zied Jenhani, Mounir Bensalem, Jasenka Dizdarević, Admela Jukan

TL;DR
This paper benchmarks and optimizes split learning latency on TinyML IoT devices, proposing a Beam Search algorithm for split point selection to minimize inference delay.
Contribution
It provides the first experimental latency benchmark of TinyML split learning on ESP32-S3 and introduces a Beam Search-based method for split point optimization.
Findings
Among the four protocols compared (UDP, TCP, ESP-NOW, BLE), ESP-NOW achieves the lowest round-trip time (RTT), at 3.6 seconds.
The Beam Search algorithm delivers near-optimal latency with a processing time of 0.1 seconds for 5 devices.
The choice of split point affects both communication and computation overhead.
Abstract
Split learning (SL) addresses the limitation of running deep learning inference directly on low-power edge/IoT nodes by executing part of the inference process on the sensor and offloading the remainder to a companion device. Despite its promise, the inference latency of SL on constrained hardware under realistic low-power wireless protocols remains unexplored. This paper presents the first experimental latency benchmark of TinyML-based SL on ESP32-S3 boards, comparing four wireless communication protocols (UDP, TCP, ESP-NOW, BLE). We also analyze how the choice of split point affects communication and computation overhead across two models (MobileNet-V2 and ResNet50), as a way to minimize end-to-end inference latency. We propose a Beam Search-based algorithm for split point optimization that minimizes end-to-end latency, and compare…
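To make the idea of beam search over split points concrete, here is a minimal Python sketch. It is not the authors' algorithm, whose details are not given on this page. It assumes a simple cost model in which devices form a chain, each device runs a contiguous block of layers, and end-to-end latency is the sum of per-device compute time and the time to transmit the activation at each cut. The names `segment_latency`, `beam_search_splits`, `layer_cost`, `out_size`, `speeds`, and `bandwidth`, and all numbers in the usage lines, are hypothetical.

```python
def segment_latency(layer_cost, out_size, start, end, speed, bandwidth, is_last):
    """Latency of running layers[start:end] on one device (assumed cost model)."""
    compute = sum(layer_cost[start:end]) / speed
    # The last device transmits nothing; the others send the tensor at the cut.
    # out_size[i] is the size of the tensor entering layer i (out_size[0] = input).
    transmit = 0.0 if is_last else out_size[end] / bandwidth
    return compute + transmit


def beam_search_splits(layer_cost, out_size, speeds, bandwidth, beam_width=3):
    """Pick one cut per device boundary, keeping the beam_width cheapest partial plans."""
    n_layers = len(layer_cost)
    n_devices = len(speeds)
    # Each beam entry is (accumulated latency, cut positions so far).
    beam = [(0.0, [0])]
    for d in range(n_devices):
        is_last = d == n_devices - 1
        candidates = []
        for latency, cuts in beam:
            start = cuts[-1]
            # The last device must finish the model; earlier ones may stop anywhere.
            ends = [n_layers] if is_last else range(start, n_layers + 1)
            for end in ends:
                step = segment_latency(layer_cost, out_size, start, end,
                                       speeds[d], bandwidth, is_last)
                candidates.append((latency + step, cuts + [end]))
        # Prune: keep only the cheapest partial plans.
        candidates.sort(key=lambda c: c[0])
        beam = candidates[:beam_width]
    best_latency, cuts = beam[0]
    # Drop the leading 0 and trailing n_layers to return interior split points.
    return best_latency, cuts[1:-1]


# Hypothetical 4-layer model split across a slow sensor and a faster companion.
layer_cost = [4.0, 2.0, 2.0, 4.0]        # assumed per-layer compute cost
out_size = [20.0, 6.0, 1.0, 3.0, 1.0]    # assumed tensor sizes at each boundary
latency, splits = beam_search_splits(layer_cost, out_size,
                                     speeds=[1.0, 4.0], bandwidth=2.0)
print(latency, splits)
```

Because the beam discards all but the cheapest `beam_width` partial plans after each device, the result is not guaranteed to be optimal. A wider beam trades more search time for a better chance of reaching the best split.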
