DEFER: Distributed Edge Inference for Deep Neural Networks

Arjun Parthasarathy; Bhaskar Krishnamachari

arXiv:2201.06769·cs.LG·January 19, 2022

TL;DR

DEFER is a distributed framework that partitions deep neural networks across multiple edge devices to improve inference throughput and reduce energy consumption. On ResNet50 with 8 compute nodes, it reports 53% higher throughput and 63% lower per-node energy consumption than single-device inference.

Contribution

This paper introduces DEFER, a novel distributed edge inference framework that partitions DNNs across multiple nodes, enhancing throughput and energy efficiency compared to single-device inference.

Findings

01  53% higher inference throughput with 8 nodes

02  63% lower energy consumption per node

03  Reduced network payload with compression algorithms

Abstract

Modern machine learning tools such as deep neural networks (DNNs) are playing a revolutionary role in many fields such as natural language processing, computer vision, and the internet of things. Once they are trained, deep learning models can be deployed on edge computers to perform classification and prediction on real-time data for these applications. Particularly for large models, the limited computational and memory resources on a single edge device can become the throughput bottleneck for an inference pipeline. To increase throughput and decrease per-device compute load, we present DEFER (Distributed Edge inFERence), a framework for distributed edge inference, which partitions deep neural networks into layers that can be spread across multiple compute nodes. The architecture consists of a single "dispatcher" node to distribute DNN partitions and inference data to respective…
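
The partitioning idea described in the abstract can be illustrated with a minimal sketch. This is not the DEFER implementation: layers are modeled as plain Python functions on a number, the helper names (`partition_layers`, `run_partition`, `run_pipeline`) are hypothetical, and the "nodes" are simulated sequentially in one process rather than connected over a network. It only shows how a layer sequence might be split into contiguous partitions and how an input would flow through them in order.

```python
from typing import Callable, List, Sequence

# Assumption: a "layer" is any callable mapping an input to an output.
Layer = Callable[[float], float]


def partition_layers(layers: Sequence[Layer], num_nodes: int) -> List[List[Layer]]:
    """Split layers into num_nodes contiguous partitions of near-equal size."""
    if not 1 <= num_nodes <= len(layers):
        raise ValueError("num_nodes must be between 1 and len(layers)")
    base, extra = divmod(len(layers), num_nodes)
    partitions: List[List[Layer]] = []
    start = 0
    for i in range(num_nodes):
        # The first `extra` partitions take one additional layer.
        size = base + (1 if i < extra else 0)
        partitions.append(list(layers[start:start + size]))
        start += size
    return partitions


def run_partition(partition: Sequence[Layer], x: float) -> float:
    """Apply one node's layers in order (what a single compute node would do)."""
    for layer in partition:
        x = layer(x)
    return x


def run_pipeline(partitions: Sequence[Sequence[Layer]], x: float) -> float:
    """Pass the input through every partition in order (simulated node chain)."""
    for partition in partitions:
        x = run_partition(partition, x)
    return x


# Toy "model" with five layers, split across three simulated nodes.
layers: List[Layer] = [
    lambda x: x + 1,
    lambda x: x * 2,
    lambda x: x - 3,
    lambda x: x * x,
    lambda x: x + 10,
]
partitions = partition_layers(layers, 3)
sizes = [len(p) for p in partitions]
result = run_pipeline(partitions, 2.0)
print(sizes, result)
```

In this toy setup the partitioned pipeline produces the same output as running all layers on one device; the intended benefit of spreading partitions across nodes is that each node holds and computes only its own share of the model.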

Code & Models

Repositories

anrgusc/defer (tf, Official)
