MultiTASC: A Multi-Tenancy-Aware Scheduler for Cascaded DNN Inference at the Consumer Edge

Sokratis Nikolaidis; Stylianos I. Venieris; Iakovos S. Venieris

arXiv:2306.12830 · cs.LG · June 23, 2023 · 1 citation

Open Access

TL;DR

MultiTASC is a scheduler designed for cascaded DNN inference in multi-device consumer environments, optimizing throughput, accuracy, and latency amidst device heterogeneity.

Contribution

It introduces a multi-tenancy-aware scheduling approach that adaptively manages inference forwarding decisions to enhance system performance in diverse device settings.
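To make "adaptively manages inference forwarding decisions" concrete, here is a minimal sketch of one way a server-side controller could adjust a device's forwarding threshold. It is not the paper's algorithm: the function name `update_threshold`, the `target` and `step` parameters, and the fixed-step rule are all assumptions made for illustration. The sketch assumes a device forwards a sample to the heavy model when the light model's confidence falls below the threshold, so lowering the threshold reduces server load.

```python
def update_threshold(threshold: float,
                     slo_satisfaction: float,
                     target: float = 0.95,
                     step: float = 0.05) -> float:
    """Hypothetical fixed-step rule (an assumption, not MultiTASC's rule).

    threshold        -- current confidence threshold of one device, in [0, 1]
    slo_satisfaction -- fraction of recent requests that met the latency SLO
    target           -- desired SLO satisfaction rate (assumed value)
    step             -- how far to move the threshold per update (assumed value)
    """
    if slo_satisfaction < target:
        # Server is overloaded: forward fewer samples by lowering the threshold.
        threshold -= step
    else:
        # Headroom available: forward more samples to recover accuracy.
        threshold += step
    # Keep the threshold a valid confidence value.
    return min(1.0, max(0.0, threshold))
```

Calling `update_threshold(0.5, 0.80)` lowers the threshold because the observed SLO satisfaction is below the assumed target, while `update_threshold(0.5, 0.99)` raises it.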

Findings

01 Improves latency SLO satisfaction rate by 20-25 percentage points.

02 Serves over 40 devices simultaneously, demonstrating scalability.

03 Outperforms state-of-the-art cascade methods in heterogeneous setups.

Abstract

Cascade systems comprise a two-model sequence, with a lightweight model processing all samples and a heavier, higher-accuracy model conditionally refining harder samples to improve accuracy. By placing the light model on the device side and the heavy model on a server, model cascades constitute a widely used distributed inference approach. With the rapid expansion of intelligent indoor environments, such as smart homes, the new setting of Multi-Device Cascade is emerging where multiple and diverse devices are to simultaneously use a shared heavy model on the same server, typically located within or close to the consumer environment. This work presents MultiTASC, a multi-tenancy-aware scheduler that adaptively controls the forwarding decision functions of the devices in order to maximize the system throughput, while sustaining high accuracy and low latency. By explicitly considering…
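The two-model cascade described in the abstract can be sketched as follows. This is an illustrative example under stated assumptions, not the authors' implementation: `cascade_predict`, the callables `light_model` and `heavy_model`, and the use of the maximum class probability as the confidence score are all hypothetical choices. In a real deployment the call to `heavy_model` would be a network request from the device to the shared server.

```python
from typing import Callable, Sequence, Tuple

# A model is assumed to map a sample to a sequence of class probabilities.
Model = Callable[[object], Sequence[float]]


def argmax(probs: Sequence[float]) -> int:
    """Index of the largest probability."""
    return max(range(len(probs)), key=lambda i: probs[i])


def cascade_predict(sample: object,
                    light_model: Model,
                    heavy_model: Model,
                    threshold: float) -> Tuple[int, bool]:
    """Return (predicted_class, was_forwarded).

    The light model processes every sample. The heavy model is consulted
    only when the light model's top probability is below `threshold`
    (an assumed forwarding decision function).
    """
    light_probs = light_model(sample)
    if max(light_probs) >= threshold:
        # Confident enough: keep the on-device prediction.
        return argmax(light_probs), False
    # Hard sample: forward to the heavy model and use its prediction.
    heavy_probs = heavy_model(sample)
    return argmax(heavy_probs), True
```

With stub models, a sample on which the light model outputs `[0.9, 0.1]` stays on the device at a threshold of 0.7, whereas one with `[0.55, 0.45]` is forwarded. The threshold is the knob a scheduler can tune per device to trade accuracy against server load.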

Taxonomy

Topics: Age of Information Optimization · Context-Aware Activity Recognition Systems · Opportunistic and Delay-Tolerant Networks