TL;DR
This paper introduces a supervised compression method for edge computing that uses a teacher-student model with a stochastic bottleneck, improving rate-distortion performance and reducing end-to-end latency in resource-constrained environments.
Contribution
It proposes a novel supervised feature compression technique combining knowledge distillation and neural image compression, outperforming existing methods in efficiency and flexibility.
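To make the combination above concrete, here is a minimal sketch of a rate-distortion style objective in which the distortion term comes from matching a teacher's features. It is an illustration under assumptions, not the paper's implementation: the fixed logistic prior, the uniform-noise proxy for quantization, the mean-squared-error distortion, the toy decoder, and every function name are hypothetical choices made for this example.

```python
import math
import random

def logistic_cdf(x, scale=1.0):
    """CDF of a zero-mean logistic distribution (assumed stand-in prior)."""
    return 1.0 / (1.0 + math.exp(-x / scale))

def noisy_bottleneck(z, rng):
    """Training-time proxy for rounding: add uniform noise of width 1."""
    return [v + rng.uniform(-0.5, 0.5) for v in z]

def rate_bits(z_tilde, scale=1.0):
    """Estimated code length in bits of the noisy latent under the prior."""
    total = 0.0
    for v in z_tilde:
        # Probability mass of the unit-width bin centred on v.
        p = logistic_cdf(v + 0.5, scale) - logistic_cdf(v - 0.5, scale)
        total += -math.log2(max(p, 1e-12))
    return total

def mse(a, b):
    """Mean squared error between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def supervised_rd_loss(student_feat, teacher_feat, z_tilde, beta):
    """Distortion (match the teacher) plus beta-weighted rate."""
    return mse(student_feat, teacher_feat) + beta * rate_bits(z_tilde)

rng = random.Random(0)
z = [0.2, -1.7, 3.1]                   # toy encoder output on the device
z_tilde = noisy_bottleneck(z, rng)     # noisy latent seen during training
student = [2.0 * v for v in z_tilde]   # toy stand-in for server-side layers
teacher = [0.5, -3.0, 6.0]             # toy teacher features
loss = supervised_rd_loss(student, teacher, z_tilde, beta=0.01)
```

In this sketch, raising `beta` penalizes the estimated bit cost more heavily, so an optimizer would favor smaller latents at the expense of fidelity to the teacher; lowering it does the opposite. A real system would learn the prior and the encoder and decoder jointly.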
Findings
Achieves better rate-distortion trade-offs than baselines.
Maintains lower end-to-end latency.
Feature representations can be adapted for multiple tasks.
Abstract
There has been much interest in deploying deep learning algorithms on low-powered devices, including smartphones, drones, and medical sensors. However, full-scale deep neural networks are often too resource-intensive in terms of energy and storage. As a result, the bulk of the machine learning operation is often carried out on an edge server, to which the data is compressed and transmitted. However, compressing data (such as images) leads to transmitting information irrelevant to the supervised task. Another popular approach is to split the deep network between the device and the server while compressing intermediate features. To date, however, such split computing strategies have barely outperformed the aforementioned naive data compression baselines due to their inefficient approaches to feature compression. This paper adopts ideas from knowledge distillation and neural…
Videos
Supervised Compression for Resource-Constrained Edge Computing Systems (YouTube)
Taxonomy
Methods: Knowledge Distillation
