Communication-Computation Efficient Device-Edge Co-Inference via AutoML

Xinjie Zhang, Jiawei Shao, Yuyi Mao, and Jun Zhang

arXiv:2108.13009·cs.LG·September 1, 2021

Open Access

TL;DR

This paper introduces an AutoML framework based on deep reinforcement learning to optimize device-edge co-inference, balancing computation and communication costs for faster, more efficient neural network inference.

Contribution

It proposes a novel AutoML approach to automatically select model split points and compression parameters, improving inference speed and efficiency in device-edge co-inference.

Findings

01 Achieves a better communication-computation trade-off

02 Significant inference speedup over baseline schemes

03 Effective hyper-parameter optimization for co-inference

Abstract

Device-edge co-inference, which partitions a deep neural network between a resource-constrained mobile device and an edge server, has recently emerged as a promising paradigm to support intelligent mobile applications. To accelerate the inference process, on-device model sparsification and intermediate feature compression are regarded as two prominent techniques. However, the on-device model sparsity level directly affects the computation workload, the intermediate feature compression ratio directly affects the communication overhead, and both affect the inference accuracy, so finding the optimal values of these hyper-parameters poses a major challenge due to the large search space. In this paper, we endeavor to develop an efficient algorithm to determine these hyper-parameters. By selecting a suitable model split point and a pair of encoder/decoder for the intermediate feature…
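To make the trade-off described in the abstract concrete, here is a minimal sketch of an end-to-end latency model. It is an illustration under stated assumptions, not the paper's algorithm. It assumes that on-device compute time shrinks in proportion to the sparsity level and that transmission time shrinks in proportion to the compression ratio. The names `SplitPoint`, `latency`, and `fastest_setting`, along with all rate parameters, are hypothetical. Inference accuracy, which the abstract says both hyper-parameters affect, is deliberately left out, and plain exhaustive enumeration stands in for the search procedure.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class SplitPoint:
    """Hypothetical description of one candidate split of the network."""
    name: str
    device_flops: float   # dense FLOPs of the on-device part
    feature_bits: float   # uncompressed intermediate feature size, in bits
    server_flops: float   # FLOPs of the part that runs on the edge server


def latency(split, sparsity, ratio,
            device_flops_per_s, uplink_bits_per_s, server_flops_per_s):
    """End-to-end latency in seconds under a simplified linear model.

    Assumption: a sparsity level s removes a fraction s of on-device FLOPs,
    and a compression ratio r divides the transmitted bits by r.
    """
    device_time = split.device_flops * (1.0 - sparsity) / device_flops_per_s
    comm_time = split.feature_bits / ratio / uplink_bits_per_s
    server_time = split.server_flops / server_flops_per_s
    return device_time + comm_time + server_time


def fastest_setting(splits, sparsities, ratios, **rates):
    """Enumerate every (split, sparsity, ratio) triple and return the fastest.

    Accuracy is ignored here, so this only exposes the latency side of the
    trade-off; a real search would also have to respect an accuracy target.
    """
    return min(product(splits, sparsities, ratios),
               key=lambda cand: latency(*cand, **rates))
```

Even in this toy model the search space is the product of the split points, sparsity levels, and compression ratios, which is why exhaustive enumeration stops being practical once each candidate also requires an accuracy evaluation.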

Taxonomy

Topics: IoT and Edge/Fog Computing · Machine Learning and ELM · Advanced Neural Network Applications