SEIFER: Scalable Edge Inference for Deep Neural Networks

Arjun Parthasarathy; Bhaskar Krishnamachari

arXiv:2210.12218·cs.NI·November 21, 2022

TL;DR

SEIFER is a scalable framework that enables efficient, fault-tolerant deployment of deep neural networks across distributed edge devices using Kubernetes, significantly improving inference throughput.

Contribution

It introduces a novel edge inference framework that partitions and distributes DNNs over resource-constrained edge networks with fault tolerance and automatic updates.

Findings

01 Inference throughput improved by 200% with sufficient nodes

02 Framework supports fault tolerance and automatic model updates

03 Open-source implementation available for research community

Abstract

Edge inference is becoming ever more prevalent, with applications ranging from retail to wearable technology. Clusters of networked, resource-constrained edge devices are becoming common, yet there is no production-ready orchestration system for deploying deep learning models over such edge networks that adopts the robustness and scalability of the cloud. We present SEIFER, a framework that uses a standalone Kubernetes cluster to partition a given DNN and place the partitions in a distributed manner across an edge network, with the goal of maximizing inference throughput. The system tolerates node faults and automatically updates deployments when the model's version changes. We provide a preliminary evaluation of a partitioning and placement algorithm that works within this framework, and show that we can improve the inference pipeline throughput by 200% by utilizing sufficient…
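To make the throughput goal in the abstract concrete, here is a minimal sketch of why splitting a DNN across more nodes can raise pipeline throughput: in a pipeline, steady-state throughput is bounded by the slowest stage. This is not the partitioning and placement algorithm evaluated in the paper. The function names, the per-layer timings, and the simplifications (contiguous layer groups, identical nodes, no communication or memory limits) are all assumptions made for illustration.

```python
from typing import List


def partition_layers(layer_ms: List[float], num_nodes: int) -> List[List[float]]:
    """Split per-layer compute times (ms) into contiguous groups, one per node,
    so that the slowest group is as fast as possible.

    Hypothetical helper: ignores communication cost and node memory limits.
    """
    if not layer_ms or num_nodes < 1:
        raise ValueError("need at least one layer and one node")
    n = len(layer_ms)
    k = min(num_nodes, n)  # never more groups than layers

    # prefix[i] = total time of the first i layers
    prefix = [0.0]
    for t in layer_ms:
        prefix.append(prefix[-1] + t)

    inf = float("inf")
    # best[j][i] = smallest achievable bottleneck for the first i layers in j groups
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for j in range(1, k + 1):
        for i in range(j, n + 1):
            for s in range(j - 1, i):
                cost = max(best[j - 1][s], prefix[i] - prefix[s])
                if cost < best[j][i]:
                    best[j][i] = cost
                    cut[j][i] = s

    # Walk the recorded cut points backwards to recover the groups.
    groups: List[List[float]] = []
    i = n
    for j in range(k, 0, -1):
        s = cut[j][i]
        groups.append(layer_ms[s:i])
        i = s
    groups.reverse()
    return groups


def pipeline_throughput(groups: List[List[float]]) -> float:
    """Inferences per second for a pipeline whose stage times are in ms."""
    bottleneck_ms = max(sum(g) for g in groups)
    return 1000.0 / bottleneck_ms


# Made-up per-layer compute times in milliseconds (assumption, not measured data).
example_layers = [3.0, 1.0, 2.0, 2.0]
one_node = partition_layers(example_layers, 1)
two_nodes = partition_layers(example_layers, 2)
```

With these made-up timings, a single node processes all four layers per inference, while two nodes can each take half of the total work and run concurrently on successive inputs. A real placement would also need to account for link bandwidth between devices and per-node memory, which this sketch leaves out.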

Code & Models

Repositories

anrgusc/seifer (Official)

Taxonomy

Topics: Advanced Neural Network Applications · IoT and Edge/Fog Computing · Advanced Memory and Neural Computing