Automated Deep Neural Network Inference Partitioning for Distributed Embedded Systems

Fabian Kreß; El Mahdi El Annabi; Tim Hotfilter; Julian Hoefer; Tanja Harbaum; Juergen Becker

arXiv:2406.19913·cs.DC·October 14, 2024

Open Access

TL;DR

This paper introduces a hardware-aware, graph-based framework for partitioning DNN inference across distributed embedded systems, improving performance and energy efficiency under strict constraints.

Contribution

The paper presents an automated, hardware-aware layer scheduling method that searches a DNN for beneficial partitioning points and evaluates each one against system constraints and metrics.

Findings

01. Achieves up to 47.5% throughput increase for EfficientNet-B0

02. Demonstrates improved energy efficiency across six DNNs

03. Provides a systematic approach for hardware-aware DNN partitioning

Abstract

Distributed systems are used in applications such as robotics and autonomous driving to achieve higher flexibility and robustness. In these systems, data-flow-centric applications such as Deep Neural Network (DNN) inference benefit, in terms of performance and energy efficiency, from partitioning the workload over multiple compute nodes. However, mapping large models onto distributed embedded systems is a complex task because low-latency and high-throughput requirements are combined with strict energy and memory constraints. In this paper, we present a novel approach for hardware-aware layer scheduling of DNN inference in distributed embedded systems. To this end, our proposed framework uses a graph-based algorithm to automatically find beneficial partitioning points in a given DNN. Each of these is evaluated based on several essential system metrics such as accuracy and memory utilization,…
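To make the idea of graph-based partitioning points more concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a two-node pipeline, a layer list already in topological order, and made-up per-layer costs. The names `Layer`, `candidate_cuts`, `evaluate`, `best_partition`, and all numbers in `TOY` are hypothetical. A cut is treated as a candidate when exactly one tensor crosses it, and each candidate is scored by memory feasibility, end-to-end latency, and pipelined throughput. Accuracy, which the abstract also names as a metric, is not modelled here.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    inputs: list[str]                 # names of producer layers
    latency_ms: tuple[float, float]   # assumed latency on node A and on node B
    weights_kb: float                 # assumed parameter memory
    out_kb: float                     # assumed size of the output tensor


def candidate_cuts(layers):
    """Indices i where cutting after layers[i] sends only that layer's output across."""
    index = {l.name: i for i, l in enumerate(layers)}
    cuts = []
    for i in range(len(layers) - 1):
        # Producers located at or before i whose outputs are consumed after i.
        crossing = {index[s] for l in layers[i + 1:] for s in l.inputs if index[s] <= i}
        if crossing == {i}:
            cuts.append(i)
    return cuts


def evaluate(layers, cut, link_kb_per_ms, mem_limit_kb):
    """Score one cut; return None if either node exceeds its memory limit."""
    head, tail = layers[:cut + 1], layers[cut + 1:]
    mem_a = sum(l.weights_kb for l in head)
    mem_b = sum(l.weights_kb for l in tail)
    if mem_a > mem_limit_kb[0] or mem_b > mem_limit_kb[1]:
        return None
    t_a = sum(l.latency_ms[0] for l in head)
    t_b = sum(l.latency_ms[1] for l in tail)
    t_link = layers[cut].out_kb / link_kb_per_ms
    return {
        "cut_after": layers[cut].name,
        "latency_ms": t_a + t_link + t_b,
        # In a pipeline, the slowest stage bounds the throughput.
        "throughput_fps": 1000.0 / max(t_a, t_link, t_b),
    }


def best_partition(layers, link_kb_per_ms, mem_limit_kb):
    """Return the feasible cut with the highest throughput, or None."""
    scored = [evaluate(layers, c, link_kb_per_ms, mem_limit_kb)
              for c in candidate_cuts(layers)]
    feasible = [s for s in scored if s is not None]
    return max(feasible, key=lambda s: s["throughput_fps"], default=None)


# Toy network with one residual connection (conv2 feeds both conv3 and add).
TOY = [
    Layer("conv1", [], (2.0, 1.0), 10.0, 128.0),
    Layer("conv2", ["conv1"], (4.0, 2.0), 40.0, 32.0),
    Layer("conv3", ["conv2"], (4.0, 2.0), 40.0, 32.0),
    Layer("add", ["conv2", "conv3"], (1.0, 0.5), 0.0, 32.0),
    Layer("fc", ["add"], (3.0, 1.5), 100.0, 1.0),
]

if __name__ == "__main__":
    print(candidate_cuts(TOY))
    print(best_partition(TOY, link_kb_per_ms=16.0, mem_limit_kb=(100.0, 200.0)))
```

In this toy setup, the position after `conv3` is not a candidate because the skip connection would force two tensors across the link. A real framework would presumably replace the hard-coded costs with measured or simulated per-layer metrics for each target platform.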

Taxonomy

Topics: Industrial Vision Systems and Defect Detection