Exploring the Boundaries of On-Device Inference: When Tiny Falls Short, Go Hierarchical
Adarsh Prasad Behera, Paulius Daubaris, Iñaki Bravo, José Gallego, Roberto Morabito, Joerg Widmer, Jaya Prakash Varma Champati

TL;DR
This paper systematically evaluates hierarchical inference (HI) for on-device ML, showing it reduces latency and energy consumption compared to pure on-device inference, and introduces a hybrid system, EE-HI, for further improvements.
Contribution
It provides a comprehensive measurement-based comparison of HI and on-device inference across multiple devices and datasets, and proposes EE-HI to optimize performance.
Findings
HI achieves up to 73% lower latency than on-device inference.
HI reduces device energy consumption by up to 77%.
EE-HI further decreases latency by 59.7% and energy by 60.4%.
Abstract
On-device inference holds great potential for increased energy efficiency, responsiveness, and privacy in edge ML systems. However, due to less capable ML models that can be embedded in resource-limited devices, use cases are limited to simple inference tasks such as visual keyword spotting, gesture recognition, and predictive analytics. In this context, the Hierarchical Inference (HI) system has emerged as a promising solution that augments the capabilities of the local ML by offloading selected samples to an edge server or cloud for remote ML inference. Existing works demonstrate through simulation that HI improves accuracy. However, they do not account for the latency and energy consumption on the device, nor do they consider three key heterogeneous dimensions that characterize ML systems: hardware, network connectivity, and models. In contrast, this paper systematically compares the…
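The offloading idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say how samples are selected for offloading, so the rule here is an assumption: a sample is offloaded when the local model's top class probability falls below a fixed threshold. The names `hierarchical_infer`, `HIResult`, `local_model`, `remote_model`, and the `threshold` value are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence


@dataclass
class HIResult:
    label: int       # predicted class index
    offloaded: bool  # True if the remote model produced the label


def hierarchical_infer(
    sample: Any,
    local_model: Callable[[Any], Sequence[float]],  # returns class probabilities
    remote_model: Callable[[Any], int],             # returns a class index
    threshold: float = 0.8,                         # assumed confidence cutoff
) -> HIResult:
    """Run the local model first; offload only low-confidence samples."""
    probs = local_model(sample)
    # Index of the most probable class according to the local model.
    label = max(range(len(probs)), key=probs.__getitem__)
    if probs[label] >= threshold:
        # Local prediction is confident enough: accept it on-device.
        return HIResult(label=label, offloaded=False)
    # Otherwise send the sample to the edge server or cloud model.
    return HIResult(label=remote_model(sample), offloaded=True)
```

In this sketch the threshold trades accuracy against cost: a higher value offloads more samples, which would typically raise accuracy at the expense of network latency and transmission energy.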
Taxonomy
Topics: Data Visualization and Analytics
