Cascade: A Platform for Delay-Sensitive Edge Intelligence

Weijia Song; Thiago Garrett; Yuting Yang; Mingzhao Liu; Edward Tremel; Lorenzo Rosa; Andrea Merlina; Roman Vitenberg; and Ken Birman

arXiv:2311.17329·cs.OS·November 30, 2023·1 cites

TL;DR

Cascade is an AI/ML hosting platform for delay-sensitive edge applications. It cuts per-event latency by orders of magnitude with no loss of throughput by moving data with minimal copying and by collocating data with computation.

Contribution

Cascade introduces a legacy-friendly storage layer that moves data with minimal copying, plus a "fast path" that collocates data and computation, so that hosted AI/ML applications respond with low latency at the edge.

Findings

01 Reduces latency by orders of magnitude
02 Maintains high throughput
03 Enhances responsiveness in edge AI applications

Abstract

Interactive intelligent computing applications are increasingly prevalent, creating a need for AI/ML platforms optimized to reduce per-event latency while maintaining high throughput and efficient resource management. Yet many intelligent applications run on AI/ML platforms that optimize for high throughput even at the cost of high tail-latency. Cascade is a new AI/ML hosting platform intended to untangle this puzzle. Innovations include a legacy-friendly storage layer that moves data with minimal copying and a "fast path" that collocates data and computation to maximize responsiveness. Our evaluation shows that Cascade reduces latency by orders of magnitude with no loss of throughput.
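
To make the "fast path" idea in the abstract concrete, here is a minimal toy sketch of collocating data and computation: a single node stores an object and immediately runs the registered logic on a zero-copy view of the stored bytes, rather than shipping the data to a separate compute tier. This is not Cascade's API. The names ToyNode, register, and put, and the use of key prefixes to select handlers, are all assumptions made for illustration only.

```python
from typing import Callable, Dict, List

# Hypothetical names throughout: ToyNode, register, and put are illustrative
# only and are not taken from Cascade.
Handler = Callable[[str, memoryview], None]


class ToyNode:
    """One node that both stores objects and runs the logic that consumes them."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}
        self._handlers: Dict[str, List[Handler]] = {}

    def register(self, prefix: str, handler: Handler) -> None:
        """Attach a handler to every key that starts with `prefix` (an assumption)."""
        self._handlers.setdefault(prefix, []).append(handler)

    def put(self, key: str, value: bytes) -> None:
        """Store the object, then run matching handlers on the same node."""
        self._store[key] = value
        view = memoryview(value)  # a view over the stored bytes, not a copy
        for prefix, handlers in self._handlers.items():
            if key.startswith(prefix):
                for handler in handlers:
                    handler(key, view)


# Usage: a handler that records the size of each object under /camera/.
sizes: Dict[str, int] = {}
node = ToyNode()
node.register("/camera/", lambda key, view: sizes.__setitem__(key, len(view)))
node.put("/camera/frame0", b"\x00" * 1024)
node.put("/logs/entry0", b"ignored")  # stored, but no handler matches
```

The handler receives a memoryview rather than a fresh bytes object, so the toy avoids copying the payload between the storage step and the compute step. A real platform would have to handle replication, sharding, and accelerator memory, none of which this sketch attempts.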

Code & Models

Repositories

derecho-project/cascade
tf

Taxonomy

Topics: Cloud Computing and Resource Management · IoT and Edge/Fog Computing · Scientific Computing and Data Management