HAPI: Hardware-Aware Progressive Inference
Stefanos Laskaridis, Stylianos I. Venieris, Hyeji Kim, Nicholas D. Lane

TL;DR
HAPI is a hardware-aware framework that co-optimises early-exit CNNs, improving inference speed and efficiency for a given set of use-case requirements and hardware constraints.
Contribution
The paper presents a co-optimisation methodology and an efficient design space exploration algorithm for generating high-performance early-exit CNNs that match the deployment requirements.
Findings
Outperforms existing early-exit schemes and search mechanisms.
Achieves up to 5.11x speedup on embedded devices.
Enhances performance of hand-crafted early-exit CNNs.
Abstract
Convolutional neural networks (CNNs) have recently become the state-of-the-art in a diversity of AI tasks. Despite their popularity, CNN inference still comes at a high computational cost. A growing body of work aims to alleviate this by exploiting the difference in the classification difficulty among samples and early-exiting at different stages of the network. Nevertheless, existing studies on early exiting have primarily focused on the training scheme, without considering the use-case requirements or the deployment platform. This work presents HAPI, a novel methodology for generating high-performance early-exit networks by co-optimising the placement of intermediate exits together with the early-exit strategy at inference time. Furthermore, we propose an efficient design space exploration algorithm which enables the faster traversal of a large number of alternative architectures and…
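The early-exit idea the abstract refers to can be illustrated with a minimal sketch. It is not HAPI's implementation: it only shows generic confidence-thresholded early exiting, assuming each exit reports softmax confidence and that a single fixed threshold is used. The names `early_exit_predict`, `stages`, and `threshold` are hypothetical and do not come from the paper.

```python
import math
from typing import Any, Callable, List, Sequence, Tuple

# Hypothetical types: a backbone block maps features to features,
# and an exit classifier maps features to class logits.
Block = Callable[[Any], Any]
Classifier = Callable[[Any], Sequence[float]]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a sequence of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(
    x: Any,
    stages: Sequence[Tuple[Block, Classifier]],
    threshold: float = 0.8,
) -> Tuple[int, int, float]:
    """Run the stages in order and stop at the first confident exit.

    Returns (predicted_label, exit_index, confidence). The last stage
    acts as the final classifier and always produces an answer.
    """
    features = x
    for idx, (block, classifier) in enumerate(stages):
        features = block(features)
        probs = softmax(classifier(features))
        confidence = max(probs)
        is_last = idx == len(stages) - 1
        if confidence >= threshold or is_last:
            return probs.index(confidence), idx, confidence
    raise ValueError("stages must contain at least one (block, classifier) pair")


# Toy usage: the first exit is unsure, the second is confident.
toy_stages = [
    (lambda f: f, lambda f: [0.0, 0.1]),
    (lambda f: f, lambda f: [0.0, 5.0]),
]
label, exit_idx, conf = early_exit_predict(None, toy_stages, threshold=0.8)
```

In this sketch, easy samples leave at an early exit and skip the remaining blocks, which is where the latency saving comes from. Where the exits are placed and how the threshold is chosen are exactly the decisions the abstract says HAPI co-optimises; here both are fixed by hand.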
Taxonomy
Methods: Early exiting using confidence measures
