Real-World Modeling of Computation Offloading for Neural Networks with Early Exits and Splits
Jan Danek, Zdenek Becvar, Adam Janes

TL;DR
This paper investigates offloading the inference of CNNs with early exits and splits from mobile devices to edge servers. Real-world experiments show that offloading reduces processing delay and energy consumption without sacrificing classification accuracy.
Contribution
It introduces a practical approach for CNN offloading with early exits and splits, supported by real-world data and models for delay and energy consumption.
Findings
Offloading significantly reduces total processing delay compared to full local processing.
Offloading also lowers energy consumption compared to full local processing.
Classification accuracy is not impaired by offloading.
Abstract
We focus on computation offloading of applications based on convolutional neural networks (CNNs) from moving devices, such as mobile robots or autonomous vehicles, to Multi-access Edge Computing (MEC) servers via a mobile network. To reduce the overall CNN inference time, we design and implement a CNN with early exits and splits, allowing flexible partial or full offloading of the CNN inference. Through real-world experiments, we analyze the impact of CNN inference offloading on the total CNN processing delay, energy consumption, and classification accuracy in a practical road sign recognition task. The results confirm that offloading a CNN with early exits and splits can significantly reduce both the total processing delay and the energy consumption compared to full local processing, without impairing classification accuracy. Based on the results of real-world experiments, we derive…
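To make the idea of early exits and splits concrete, the following is a minimal sketch of the control flow only. It is not the authors' implementation. The `Block` class, the `infer` function, the confidence-threshold exit rule, and the toy blocks are all hypothetical names and assumptions introduced for illustration; the paper's actual architecture, exit criterion, and split placement are not described in this summary.

```python
from typing import Callable, List, Optional, Tuple

Features = List[float]
Prediction = Tuple[int, float]  # (class label, confidence in [0, 1])


class Block:
    """Hypothetical CNN segment with an optional early-exit classifier."""

    def __init__(
        self,
        forward: Callable[[Features], Features],
        exit_head: Optional[Callable[[Features], Prediction]] = None,
    ) -> None:
        self.forward = forward
        self.exit_head = exit_head


def infer(
    blocks: List[Block], x: Features, split: int, threshold: float = 0.9
) -> Tuple[int, float, str]:
    """Run blocks[:split] on the device and blocks[split:] on the server.

    split == 0 models full offloading; split == len(blocks) models full
    local processing. Inference stops at the first exit whose confidence
    reaches `threshold` (an assumed exit rule), or at the final exit.
    Returns (label, confidence, where), where `where` names the side
    that produced the result.
    """
    if not 0 <= split <= len(blocks):
        raise ValueError("split must be between 0 and len(blocks)")
    for i, block in enumerate(blocks):
        where = "device" if i < split else "server"
        x = block.forward(x)
        if block.exit_head is None:
            continue
        label, confidence = block.exit_head(x)
        is_last = i == len(blocks) - 1
        if is_last or confidence >= threshold:
            return label, confidence, where
    raise ValueError("the final block must have an exit head")


def toy_head(confidence: float) -> Callable[[Features], Prediction]:
    """Toy exit head: picks the largest feature, reports a fixed confidence."""
    def head(x: Features) -> Prediction:
        label = max(range(len(x)), key=lambda j: x[j])
        return label, confidence
    return head


def identity(x: Features) -> Features:
    """Stand-in for real convolutional layers."""
    return list(x)


# Three toy blocks whose exits grow more confident with depth.
blocks = [
    Block(identity, toy_head(0.60)),
    Block(identity, toy_head(0.95)),
    Block(identity, toy_head(0.99)),
]

result = infer(blocks, [0.1, 0.7, 0.2], split=1, threshold=0.9)
```

In this toy setup, the first exit runs on the device but is not confident enough, so the intermediate features would be sent to the server, where the second exit ends the inference. Moving `split` changes only where each block runs, not which exit is taken.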
Taxonomy
Topics: IoT and Edge/Fog Computing · Age of Information Optimization · Advanced Neural Network Applications
