AdaSpring: Context-adaptive and Runtime-evolutionary Deep Model Compression for Mobile Applications
Sicong Liu, Bin Guo, Ke Ma, Zhiwen Yu, Junzhao Du

TL;DR
AdaSpring is a novel framework that enables real-time, context-aware deep neural network compression on mobile devices, significantly improving efficiency without offline retraining.
Contribution
It introduces a self-evolutionary, ensemble training approach for online DNN compression that adapts to deployment context dynamically.
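As a rough illustration of what adapting to the deployment context at runtime can mean, the sketch below picks among several already-compressed model variants according to the current latency and energy budgets. It is not the AdaSpring algorithm: the names `Variant`, `Context`, `pick_variant`, and `VARIANTS`, the selection rule, and all numbers are hypothetical and only show the shape of the decision.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Variant:
    """A pre-compressed model variant (hypothetical, for illustration)."""
    name: str
    accuracy: float    # validation accuracy in [0, 1]
    latency_ms: float  # per-inference latency on the target device
    energy_mj: float   # per-inference energy on the target device


@dataclass(frozen=True)
class Context:
    """Runtime constraints derived from the current deployment context."""
    latency_budget_ms: float
    energy_budget_mj: float


def pick_variant(variants: Sequence[Variant], ctx: Context) -> Variant:
    """Return the most accurate variant that fits both budgets.

    If no variant fits, fall back to the lowest-latency variant.
    """
    feasible = [
        v for v in variants
        if v.latency_ms <= ctx.latency_budget_ms
        and v.energy_mj <= ctx.energy_budget_mj
    ]
    if feasible:
        return max(feasible, key=lambda v: v.accuracy)
    return min(variants, key=lambda v: v.latency_ms)


# Made-up numbers, not measurements from the paper.
VARIANTS = [
    Variant("full", accuracy=0.92, latency_ms=40.0, energy_mj=12.0),
    Variant("pruned", accuracy=0.90, latency_ms=22.0, energy_mj=7.0),
    Variant("factorized", accuracy=0.87, latency_ms=13.0, energy_mj=4.0),
]

if __name__ == "__main__":
    # A tighter budget (e.g., low battery) selects a smaller variant.
    print(pick_variant(VARIANTS, Context(25.0, 8.0)).name)
```

Calling `pick_variant` again whenever the budgets change (battery level, competing workloads) gives the re-selection loop; the framework described here additionally avoids offline retraining, which this sketch does not model.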
Findings
Achieves up to 3.1x latency reduction
Improves energy efficiency by 4.2x
Operates with less than 6.2 ms evolution latency
Abstract
Many mobile and wearable applications powered by deep learning (e.g., DNNs) now continuously and unobtrusively sense their ambient surroundings to enhance all aspects of human life. To enable robust and private mobile sensing, DNNs tend to be deployed locally on resource-constrained mobile devices via model compression. Current practice, whether hand-crafted DNN compression techniques that optimize DNN-related performance (e.g., parameter size) or on-demand DNN compression methods that optimize hardware-dependent metrics (e.g., latency), cannot run locally online because it requires offline retraining to ensure accuracy. Moreover, none of these approaches performs runtime-adaptive compression that accounts for the dynamic deployment context of mobile applications. To address these challenges, we present AdaSpring, a context-adaptive and…
