Binary Early-Exit Network for Adaptive Inference on Low-Resource Devices
Aaqib Saeed

TL;DR
This paper introduces a binary neural network with an early-exit strategy that accelerates inference on low-resource devices, maintaining accuracy while significantly reducing latency and enabling better uncertainty estimation.
Contribution
It combines binary neural networks with an early-exiting mechanism, allowing adaptive inference without retraining for different efficiency levels, and provides insights into sample difficulty and class uncertainty.
Findings
Achieves latency under 6 ms on audio classification tasks.
Provides favorable quality-efficiency trade-offs with controllable thresholds.
Enables estimation of sample difficulty and class uncertainty.
Abstract
Deep neural networks have significantly improved performance on a range of tasks, but at the cost of an increasing demand for computational resources, which leaves deployment on low-resource devices (with limited memory and battery power) infeasible. Binary neural networks (BNNs) tackle the issue to an extent with extreme compression and speed-up gains compared to real-valued models. We propose a simple but effective method to accelerate inference by unifying BNNs with an early-exiting strategy. Our approach allows simple instances to exit early based on a decision threshold and utilizes output layers added to different intermediate layers to avoid executing the entire binary model. We extensively evaluate our method on three audio classification tasks and across four BNN architectures. Our method demonstrates favorable quality-efficiency trade-offs while being controllable with an entropy-based…
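The early-exit idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract is truncated before the exit criterion is fully stated, so the sketch assumes the criterion is the normalized entropy of each exit's softmax output compared against a threshold. The names `early_exit_predict`, `blocks`, `exit_heads`, and `threshold` are hypothetical, and the toy blocks are plain Python functions standing in for binarized layers.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def softmax(logits: Sequence[float]) -> Vector:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy divided by log(num_classes); lies in [0, 1].

    Assumes at least two classes.
    """
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))


def early_exit_predict(
    x: Vector,
    blocks: Sequence[Callable[[Vector], Vector]],
    exit_heads: Sequence[Callable[[Vector], Vector]],
    threshold: float,
) -> Tuple[int, int]:
    """Return (predicted_class, exit_index).

    Assumed rule: stop at the first exit whose normalized entropy is at or
    below `threshold`; the final exit always answers.
    """
    if not blocks or len(blocks) != len(exit_heads):
        raise ValueError("need one exit head per block, and at least one block")
    h = x
    last = len(blocks) - 1
    for i, (block, head) in enumerate(zip(blocks, exit_heads)):
        h = block(h)                      # run the next part of the backbone
        probs = softmax(head(h))          # output layer attached at this depth
        if i == last or normalized_entropy(probs) <= threshold:
            predicted = max(range(len(probs)), key=probs.__getitem__)
            return predicted, i
    raise AssertionError("unreachable: the final exit always returns")


# Toy stand-ins: each block doubles the features, each head is the identity,
# so confidence grows with depth.
toy_blocks = [lambda h: [2.0 * v for v in h]] * 3
toy_heads = [lambda h: list(h)] * 3
prediction, exit_index = early_exit_predict([1.0, 0.0], toy_blocks, toy_heads, 0.3)
```

Under this assumed rule, raising `threshold` lets more inputs leave at shallow exits (faster, potentially less accurate), while lowering it pushes inputs toward the final exit. The threshold is applied only at inference time, so it can be changed without retraining.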
Taxonomy
Topics: Music and Audio Processing · Neural Networks and Applications · Anomaly Detection Techniques and Applications
