Resource-Constrained Edge AI with Early Exit Prediction
Rongkang Dong, Yuyi Mao, Jun Zhang

TL;DR
This paper introduces an early exit prediction mechanism with a low-complexity Exit Predictor to reduce computation overhead in resource-constrained edge AI, improving inference efficiency and accuracy under varying bandwidths.
Contribution
It proposes a novel Exit Predictor module and a latency-aware extension for early-exit networks, optimizing resource use and performance in edge AI systems.
Findings
Significant reduction in on-device computation overhead.
Improved inference accuracy under bandwidth constraints.
Effective tradeoff between accuracy and efficiency achieved.
Abstract
By leveraging the diversity of data samples, the early-exit network has recently emerged as a prominent neural network architecture for accelerating deep learning inference. However, the intermediate classifiers of the early exits introduce additional computation overhead, which is unfavorable for resource-constrained edge artificial intelligence (AI). In this paper, we propose an early exit prediction mechanism to reduce the on-device computation overhead in a device-edge co-inference system supported by early-exit networks. Specifically, we design a low-complexity module, namely the Exit Predictor, to guide some distinctly "hard" samples to bypass the computation of the early exits. In addition, considering the varying communication bandwidth, we extend the early exit prediction mechanism for latency-aware edge inference, which adapts the prediction thresholds of the Exit Predictor and the…
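The control flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the route labels, the use of the maximum class probability as the confidence measure, and both threshold values are assumptions made for illustration only. The Exit Predictor, the early-exit classifier, and the edge server are passed in as plain callables so that the example runs without any deep learning framework.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical route labels, used only in this illustration.
ON_DEVICE = "device"
BYPASSED = "edge (early exit bypassed)"
FALLBACK = "edge (early exit not confident)"


def co_inference(
    sample,
    exit_predictor: Callable[[object], float],
    early_exit: Callable[[object], Sequence[float]],
    offload_to_edge: Callable[[object], int],
    predictor_threshold: float = 0.5,   # assumed value
    confidence_threshold: float = 0.8,  # assumed value
) -> Tuple[int, str]:
    """Return (predicted label, route taken) for a single sample."""
    # Step 1: the cheap predictor scores how likely the sample is to exit early.
    exit_score = exit_predictor(sample)
    if exit_score < predictor_threshold:
        # Judged "hard": skip the early-exit classifier and offload directly.
        return offload_to_edge(sample), BYPASSED

    # Step 2: run the early-exit classifier on the device.
    probs = list(early_exit(sample))
    confidence = max(probs)  # assumption: max class probability as confidence
    if confidence >= confidence_threshold:
        return probs.index(confidence), ON_DEVICE

    # Step 3: the early exit was evaluated but is not confident enough.
    return offload_to_edge(sample), FALLBACK


# Toy stand-ins for the three components (purely illustrative).
def toy_predictor(sample):
    return sample["exit_score"]


def toy_early_exit(sample):
    return sample["probs"]


def toy_edge(sample):
    return sample["edge_label"]


easy = {"exit_score": 0.9, "probs": [0.05, 0.95], "edge_label": 1}
hard = {"exit_score": 0.1, "probs": [0.50, 0.50], "edge_label": 0}
unsure = {"exit_score": 0.7, "probs": [0.60, 0.40], "edge_label": 1}

for name, s in [("easy", easy), ("hard", hard), ("unsure", unsure)]:
    print(name, co_inference(s, toy_predictor, toy_early_exit, toy_edge))
```

In this sketch the saving comes from the first branch: a sample judged hard never pays for the early-exit classifier. Both thresholds are exposed as parameters because the abstract states that the latency-aware extension adapts the prediction thresholds to the communication bandwidth; how that adaptation is performed is not shown here.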
Taxonomy
Topics: Age of Information Optimization · Advanced Memory and Neural Computing · IoT and Edge/Fog Computing
