TL;DR
BottleNet++ introduces an end-to-end neural network-based feature compression method for device-edge co-inference, significantly reducing bandwidth and computation while maintaining high accuracy.
Contribution
It proposes a joint source-channel coding architecture with CNNs for feature compression, explicitly considering channel noise and enabling high compression ratios with strong generalization.
Findings
Achieves up to 64x bandwidth reduction in noisy channels
Attains 256x bit compression ratio with less than 2% accuracy loss
Enables earlier DNN splitting, reducing on-device computation by 3x
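To make the bit compression ratio figure concrete, the arithmetic can be sketched as below. The tensor shapes and bit widths are hypothetical, chosen only so the numbers work out to 256x; they are not taken from the paper, and the helper name bit_compression_ratio is an illustrative assumption.

```python
def bit_compression_ratio(orig_shape, orig_bits, comp_shape, comp_bits):
    """Bits needed for the original feature divided by bits actually transmitted."""
    def numel(shape):
        # Number of values in a tensor of the given shape.
        n = 1
        for d in shape:
            n *= d
        return n

    return (numel(orig_shape) * orig_bits) / (numel(comp_shape) * comp_bits)


# Hypothetical example: a 512x4x4 float32 feature map (262144 bits)
# compressed to 32x2x2 values sent at 8 bits each (1024 bits).
ratio = bit_compression_ratio((512, 4, 4), 32, (32, 2, 2), 8)
print(ratio)
```

Under these assumed shapes, reducing both the number of transmitted values (64x fewer) and the bits per value (4x fewer) multiplies out to the overall ratio.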
Abstract
The emergence of various intelligent mobile applications demands the deployment of powerful deep learning models on resource-constrained mobile devices. The device-edge co-inference framework provides a promising solution by splitting a neural network between a mobile device and an edge computing server. To balance on-device computation against communication overhead, the splitting point needs to be carefully picked, and the intermediate feature needs to be compressed before transmission. Existing studies decoupled the design of model splitting, feature compression, and communication, which may lead to excessive resource consumption on the mobile device. In this paper, we introduce an end-to-end architecture, named BottleNet++, that consists of an encoder, a non-trainable channel layer, and a decoder for more efficient feature compression and transmission. The encoder and…
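The encoder, channel layer, and decoder pipeline described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the paper's encoder and decoder are trained CNNs, whereas this sketch substitutes fixed averaging and repetition, and it models the channel as additive white Gaussian noise, which is one plausible choice of noisy channel rather than a detail confirmed by the text above. The function names are hypothetical.

```python
import math
import random


def encode(feature, factor):
    """Stand-in encoder (assumption): average each group of `factor` values.

    The real encoder is a trained CNN; this only mimics dimension reduction.
    """
    return [sum(feature[i:i + factor]) / factor
            for i in range(0, len(feature), factor)]


def awgn_channel(symbols, snr_db, rng):
    """Non-trainable channel layer (assumed AWGN): add Gaussian noise.

    The noise standard deviation is derived from the average signal power
    and the requested signal-to-noise ratio in dB. It has no parameters to learn.
    """
    power = sum(s * s for s in symbols) / len(symbols)
    noise_std = math.sqrt(power / (10 ** (snr_db / 10)))
    return [s + rng.gauss(0.0, noise_std) for s in symbols]


def decode(symbols, factor):
    """Stand-in decoder (assumption): repeat each received value `factor` times."""
    return [s for s in symbols for _ in range(factor)]


rng = random.Random(0)                        # fixed seed for repeatability
feature = [float(i % 8) for i in range(64)]   # toy intermediate feature
sent = encode(feature, factor=4)              # 64 values -> 16 values
received = awgn_channel(sent, snr_db=20.0, rng=rng)
restored = decode(received, factor=4)         # 16 values -> 64 values
print(len(feature), len(sent), len(restored))
```

Because the channel layer sits between the encoder and decoder during training, an end-to-end design of this kind lets the learned compression account for channel noise instead of treating transmission as a separate step.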
