Optimization Framework for Splitting DNN Inference Jobs over Computing Networks

Sehun Jung; Hyang-Won Lee

arXiv:2111.07006 · cs.NI · July 27, 2022

TL;DR

This paper introduces an optimization framework using a layered graph model to efficiently distribute DNN inference tasks across network resources, significantly reducing latency in AI services for 6G systems.

Contribution

It presents a novel layered graph model that reformulates DNN inference job splitting as a routing problem, enabling faster and more adaptive solutions.

Findings

01 Faster solution times compared to existing methods.

02 Adaptive node and path selection reduces inference latency.

03 Effective for resource-constrained end devices.

Abstract

Ubiquitous artificial intelligence (AI) is considered one of the key services in 6G systems. AI services typically rely on deep neural networks (DNNs), which require heavy computation. Hence, to support ubiquitous AI, it is crucial to provide a solution for offloading or distributing the computational burden of DNNs, especially at end devices with limited resources. We develop an optimization framework for assigning the computation tasks of DNN inference jobs to computing resources in the network so as to reduce inference latency. To this end, we propose a layered graph model with which simple conventional routing jointly solves the problem of selecting nodes for computation and paths for data transfer between nodes. We show that, using our model, the existing approaches to splitting DNN inference jobs can be equivalently reformulated as a routing problem that possesses better…
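
The layered graph idea in the abstract can be illustrated with a minimal sketch. This is not the authors' formulation; it assumes a simple construction in which the network is copied once per DNN layer boundary, a state `(node, k)` means "the data sits at `node` after `k` DNN layers have been computed", moving within a copy costs a transfer time, and moving to the next copy costs a computation time. Under those assumptions, an ordinary shortest-path search picks both the computing nodes and the transfer paths at once. The function name `split_inference`, the input dictionaries, and all numbers are hypothetical.

```python
import heapq

def split_inference(links, compute_time, data_size, src, dst):
    """Shortest path over an assumed layered graph.

    links:        {(u, v): bandwidth} for each directed link
    compute_time: {node: [t_1, ..., t_K]} time to run each DNN layer on that
                  node (nodes missing from the dict cannot compute)
    data_size:    [s_0, ..., s_K], size of the data after k layers (s_0 = input)
    Returns (total latency, list of (node, layers_done) states).
    """
    num_layers = len(data_size) - 1
    adjacency = {}
    for (u, v), bandwidth in links.items():
        adjacency.setdefault(u, []).append((v, bandwidth))

    start, goal = (src, 0), (dst, num_layers)
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, state = heapq.heappop(heap)
        if d > dist.get(state, float("inf")):
            continue  # stale heap entry
        if state == goal:
            break
        node, k = state
        # Transfer edges: stay in copy k, move the layer-k output to a neighbor.
        edges = [((v, k), data_size[k] / bw) for v, bw in adjacency.get(node, [])]
        # Computation edge: stay at the node, advance to copy k + 1.
        if k < num_layers and node in compute_time:
            edges.append(((node, k + 1), compute_time[node][k]))
        for nxt, weight in edges:
            nd = d + weight
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = state
                heapq.heappush(heap, (nd, nxt))

    if goal not in dist:
        return float("inf"), []
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[goal], path

# Hypothetical two-node example: a slow end device and a faster edge server.
links = {("dev", "edge"): 10.0, ("edge", "dev"): 10.0}
compute_time = {"dev": [4.0, 4.0], "edge": [1.0, 1.0]}
data_size = [100.0, 2.0, 1.0]  # large raw input, small intermediate outputs
latency, path = split_inference(links, compute_time, data_size, "dev", "dev")
```

In this toy instance the raw input is expensive to ship, so the cheapest route computes the first layer on the device, transfers the small intermediate output to the edge server for the second layer, and returns the result. The split point falls out of the shortest path rather than being enumerated separately.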

Taxonomy

Topics: IoT and Edge/Fog Computing · Age of Information Optimization · Stochastic Gradient Optimization Techniques