Hardware-Aware DNN Compression for Homogeneous Edge Devices
Kunlong Zhang, Guiying Li, Ning Lu, Peng Yang, Ke Tang

TL;DR
This paper introduces HDAP, a hardware-aware DNN compression framework tailored for homogeneous edge devices. It accounts for the way per-device performance diverges over time and optimizes the compressed model's average performance across all devices.
Contribution
HDAP is a novel method, based on device clustering and surrogate evaluation, that improves DNN compression for homogeneous edge devices whose performance varies from device to device.
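To make the role of device clustering concrete, the following is a minimal sketch and not HDAP's actual procedure, which the truncated abstract does not describe. It assumes each device has a cheap scalar probe latency, groups devices with a plain 1-D k-means (an assumed choice of clustering algorithm), and estimates the fleet-wide average latency of a candidate model by measuring only one representative device per cluster. It does not implement a learned surrogate; the representative measurement stands in for one. All names (`kmeans_1d`, `estimate_average_latency`, `probe`, `measure`) and all numbers are hypothetical.

```python
from statistics import mean


def kmeans_1d(values, k, iters=20):
    """Plain Lloyd's algorithm on scalars; returns k lists of indices into values."""
    srt = sorted(values)
    # Deterministic initial centers taken at evenly spaced positions (assumes k <= len(values)).
    centers = [srt[(2 * i + 1) * len(srt) // (2 * k)] for i in range(k)]
    groups = []
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for idx, v in enumerate(values):
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[nearest].append(idx)
        new_centers = [
            mean([values[i] for i in g]) if g else centers[c]
            for c, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return groups


def estimate_average_latency(probe_latency, measure, k):
    """Estimate the fleet-average latency from one measurement per cluster.

    probe_latency: cheap per-device probe latency (hypothetical profiling step).
    measure: callable(device_index) -> latency of the candidate model (expensive).
    """
    total = 0.0
    for group in kmeans_1d(probe_latency, k):
        if not group:
            continue
        center = mean([probe_latency[i] for i in group])
        # The representative is the cluster member closest to the cluster mean.
        rep = min(group, key=lambda i: abs(probe_latency[i] - center))
        total += len(group) * measure(rep)
    return total / len(probe_latency)


# Illustrative data: six devices of the same SKU that have drifted into two speed groups.
probe = [10.0, 10.5, 9.8, 20.1, 19.7, 20.4]
calls = []


def measure(i):
    calls.append(i)          # record which devices were actually measured
    return 0.5 * probe[i]    # assumed: candidate-model latency scales with the probe


estimate = estimate_average_latency(probe, measure, k=2)
true_mean = mean([0.5 * p for p in probe])
print(f"measured {len(calls)} of {len(probe)} devices")
print(f"estimated mean latency {estimate:.3f}, exhaustive mean {true_mean:.3f}")
```

The trade-off illustrated here is that the number of expensive on-device measurements scales with the number of clusters rather than the number of devices, at the cost of an approximation error that depends on how tight the clusters are.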
Findings
HDAP reduces average latency significantly, e.g., 2.86× on ResNet50.
It maintains competitive accuracy across device clusters.
It is shown to be effective across multiple device types and tasks.
Abstract
Deploying deep neural networks (DNNs) across homogeneous edge devices (devices with the same SKU, as labeled by the manufacturer) often assumes identical performance among them. However, once a device model is widely deployed, the performance of individual devices diverges after a period of operation. This is caused by differences in user configurations, environmental conditions, manufacturing variances, battery degradation, etc. Existing DNN compression methods have not taken this scenario into consideration and cannot guarantee good compression results on all homogeneous edge devices. To address this, we propose Homogeneous-Device Aware Pruning (HDAP), a hardware-aware DNN compression framework explicitly designed for homogeneous edge devices, aiming to achieve optimal average performance of the compressed model across all devices. To deal with the difficulty of time-consuming…
Taxonomy
Topics: Advanced Neural Network Applications · IoT and Edge/Fog Computing · Big Data and Digital Economy
