Deep Learning on Mobile Devices Through Neural Processing Units and Edge Computing
Tianxiang Tan, Guohong Cao

TL;DR
This paper introduces a Confidence Based Offloading (CBO) framework for deep learning video analytics on mobile devices. It balances accuracy and latency by using the confidence score of the on-device result to decide whether to return the local NPU classification or offload the video frame to a server.
Contribution
It proposes confidence score calibration and an adaptive offloading algorithm to improve DNN accuracy and efficiency on resource-limited mobile devices.
Findings
Significant accuracy improvements over existing methods
Effective offloading decisions based on confidence calibration
Enhanced performance under varying network conditions
Abstract
Deep Neural Networks (DNNs) are increasingly adopted for video analytics on mobile devices. To reduce the delay of running DNNs, many mobile devices are equipped with Neural Processing Units (NPUs). However, due to the resource limitations of NPUs, these DNNs have to be compressed to increase the processing speed at the cost of accuracy. To address the low accuracy problem, we propose a Confidence Based Offloading (CBO) framework for deep learning video analytics. The major challenge is to determine when to return the NPU classification result based on the confidence level of running the DNN, and when to offload the video frames to the server for further processing to increase the accuracy. We first identify the problem of using existing confidence scores to make offloading decisions, and propose confidence score calibration techniques to improve the performance. Then, we formulate the CBO…
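The offloading decision described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the summary does not say which calibration technique the paper uses, so this example assumes temperature scaling, a common calibration method, and a fixed confidence threshold. The function names (`calibrated_softmax`, `decide`) and the values of `temperature` and `threshold` are hypothetical.

```python
import math


def calibrated_softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature (assumed calibration step)."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def decide(logits, threshold=0.8, temperature=1.0):
    """Return ("local" or "offload", predicted label, confidence).

    The local result is kept only when the calibrated confidence reaches
    the threshold; otherwise the frame would be sent to the server.
    """
    probs = calibrated_softmax(logits, temperature)
    confidence = max(probs)
    label = probs.index(confidence)
    action = "local" if confidence >= threshold else "offload"
    return action, label, confidence


if __name__ == "__main__":
    npu_logits = [4.0, 0.5, 0.1]  # hypothetical output of a compressed on-device model
    print(decide(npu_logits, temperature=1.0))
    print(decide(npu_logits, temperature=3.0))
```

With a temperature above 1 the same logits yield a lower confidence, so a frame that an uncalibrated score would keep on the device can fall below the threshold and be offloaded. This is one way calibration can change offloading decisions; the paper's own formulation may differ.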
Taxonomy
Topics: IoT and Edge/Fog Computing · Age of Information Optimization · Advanced Neural Network Applications
