Dynamic DNN Decomposition for Lossless Synergistic Inference

Beibei Zhang; Tian Xiang; Hongxuan Zhang; Te Li; Shiqiang Zhu; Jianjun Gu

arXiv:2101.05952·cs.DC·January 18, 2021

Open Access

TL;DR

D3 is a dynamic DNN decomposition system that enables lossless, resource-adaptive, and parallelized inference across device, edge, and cloud, significantly improving efficiency and reducing communication overhead.

Contribution

The paper introduces a novel heuristic partitioning algorithm and parallel feature map processing strategy for synergistic DNN inference without accuracy loss.
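Why splitting a feature map across nodes can leave the result unchanged is easiest to see on a toy case. The sketch below is not the paper's strategy: it assumes a 1-D "valid" convolution in pure Python, and the names `conv1d_valid` and `conv1d_tiled` are hypothetical. Each tile carries `len(kernel) - 1` extra input elements (a halo), so the tiles can be computed independently and their concatenated outputs match the unsplit result exactly.

```python
def conv1d_valid(signal, kernel):
    """Plain 1-D 'valid' convolution (no padding, no kernel flip)."""
    k = len(kernel)
    return [sum(signal[i + t] * kernel[t] for t in range(k))
            for i in range(len(signal) - k + 1)]


def conv1d_tiled(signal, kernel, num_tiles):
    """Same result as conv1d_valid, computed tile by tile.

    Hypothetical illustration: each tile could be sent to a different node.
    Assumes len(signal) >= len(kernel) and num_tiles >= 1.
    """
    k = len(kernel)
    out_len = len(signal) - k + 1
    step = -(-out_len // num_tiles)  # ceiling division: outputs per tile
    pieces = []
    for start in range(0, out_len, step):
        stop = min(start + step, out_len)
        # Outputs [start, stop) need inputs [start, stop + k - 1):
        # the last k - 1 elements are the halo shared with the next tile.
        tile = signal[start:stop + k - 1]
        pieces.append(conv1d_valid(tile, kernel))
    # Concatenate per-tile outputs in order.
    return [y for piece in pieces for y in piece]


signal = [1, 2, 3, 4, 5, 6, 7, 8]
kernel = [1, 0, -1]
print(conv1d_tiled(signal, kernel, num_tiles=3) == conv1d_valid(signal, kernel))
```

The cost of this kind of split is the duplicated halo data, which grows with kernel size and with the number of tiles.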

Findings

01. D3 outperforms state-of-the-art methods by up to 3.4x in inference time.

02. Reduces communication overhead by up to 3.68x.

03. Supports dynamic adaptation to resource and network changes.

Abstract

Deep neural networks (DNNs) sustain high performance in today's data processing applications. DNN inference is resource-intensive and is therefore difficult to fit into a mobile device. An alternative is to offload DNN inference to a cloud server. However, such an approach requires heavy raw data transmission between the mobile device and the cloud server, which is not suitable for mission-critical and privacy-sensitive applications such as autopilot. To solve this problem, recent advances unleash DNN services using the edge computing paradigm. The existing approaches split a DNN into two parts and deploy the two partitions to computation nodes at two edge computing tiers. Nonetheless, these methods overlook collaborative device-edge-cloud computation resources. Besides, previous algorithms require re-partitioning the whole DNN to adapt to computation resource changes and network dynamics.…
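To make the trade-off between on-node computation and inter-tier transmission concrete, here is a minimal sketch of placing a chain-structured DNN on three tiers. It is an exhaustive search over two cut points, not the paper's heuristic partitioning algorithm, and every name and number in it (`best_three_tier_split`, the per-layer costs, speeds, and bandwidths) is a hypothetical assumption. It also ignores the cost of returning the result to the device.

```python
from itertools import combinations_with_replacement


def best_three_tier_split(layers, input_bytes, speed, bandwidth):
    """Brute-force the two cut points of a chain DNN over device/edge/cloud.

    layers:      list of (flops, output_bytes), one entry per layer
    input_bytes: size of the raw input held by the device
    speed:       (device, edge, cloud) throughput in FLOP/s
    bandwidth:   (device->edge, edge->cloud) throughput in bytes/s
    Returns (latency, i, j): layers[:i] run on the device,
    layers[i:j] on the edge, layers[j:] on the cloud.
    """
    n = len(layers)

    def cut_bytes(k):
        # Tensor that crosses a cut placed just before layer k.
        return input_bytes if k == 0 else layers[k - 1][1]

    def flops(lo, hi):
        return sum(f for f, _ in layers[lo:hi])

    candidates = []
    for i, j in combinations_with_replacement(range(n + 1), 2):  # i <= j
        compute = (flops(0, i) / speed[0]
                   + flops(i, j) / speed[1]
                   + flops(j, n) / speed[2])
        transfer = 0.0
        if i < n:  # some layer runs off-device: data crosses device->edge
            transfer += cut_bytes(i) / bandwidth[0]
        if j < n:  # some layer runs on the cloud: data crosses edge->cloud
            transfer += cut_bytes(j) / bandwidth[1]
        candidates.append((compute + transfer, i, j))
    return min(candidates)


# Hypothetical costs: early layers shrink the data, later layers are heavy.
layers = [(10, 100), (50, 10), (200, 1)]
latency, i, j = best_three_tier_split(
    layers, input_bytes=1000, speed=(1, 10, 100), bandwidth=(10, 10))
print(latency, i, j)
```

With these made-up numbers the search puts the first layer on the device, the second on the edge, and the last on the cloud, because the first layer cuts the data to be transmitted tenfold. Exhaustive search is quadratic in the number of layers and has to be rerun in full whenever speeds or bandwidths change, which is the kind of whole-network re-partitioning the abstract criticizes.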

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Memory and Neural Computing · IoT and Edge/Fog Computing