EdgeSight: Enabling Modeless and Cost-Efficient Inference at the Edge
ChonLam Lao, Jiaqi Gao, Ganesh Ananthanarayanan, Aditya Akella, Minlan Yu

TL;DR
EdgeSight is a system for cost-efficient, modeless deep neural network (DNN) inference at the edge. It targets three edge constraints (limited device memory, network volatility, and restricted power) and reports better tail latency and power efficiency than existing solutions.
Contribution
We introduce EdgeSight, a novel edge-data center architecture that supports cost-efficient, modeless inference with confidence scaling and lossy inference, optimized for edge device constraints.
Findings
EdgeSight reduces P99 latency by up to 1.6x compared to existing systems.
Our FPGA prototype achieves similar performance with up to 3.34x power savings.
EdgeSight effectively supports diverse DNNs in volatile network environments.
Abstract
Traditional ML inference is evolving toward modeless inference, which abstracts the complexity of model selection from users, allowing the system to automatically choose the most appropriate model for each request based on accuracy and resource requirements. While prior studies have focused on modeless inference within data centers, this paper tackles the pressing need for cost-efficient modeless inference at the edge, particularly within its unique constraints of limited device memory, volatile network conditions, and restricted power consumption. To overcome these challenges, we propose EdgeSight, a system that provides cost-efficient modeless serving for diverse DNNs at the edge. EdgeSight employs an edge-data center (edge-DC) architecture, utilizing confidence scaling to reduce the number of model options while meeting diverse accuracy requirements. Additionally, it supports…
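To make the idea of modeless selection concrete, here is a minimal sketch of how a serving system might pick a model on the user's behalf. It is not EdgeSight's algorithm: the `ModelProfile` fields, the `select_model` function, and the catalog values are all hypothetical, and the policy (fastest model that meets an accuracy floor and fits a memory budget) is just one plausible reading of "based on accuracy and resource requirements".

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelProfile:
    """Hypothetical profile of one model variant (all values illustrative)."""
    name: str
    accuracy: float    # offline-measured accuracy in [0, 1]
    latency_ms: float  # typical inference latency on the edge device
    memory_mb: float   # memory footprint when loaded


def select_model(
    models: Sequence[ModelProfile],
    min_accuracy: float,
    memory_budget_mb: float,
) -> Optional[ModelProfile]:
    """Return the fastest model that meets the accuracy floor and fits in memory.

    Returns None when no model satisfies both constraints; a real system
    might then fall back to a remote data center (an assumption here).
    """
    feasible = [
        m for m in models
        if m.accuracy >= min_accuracy and m.memory_mb <= memory_budget_mb
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda m: m.latency_ms)


# Made-up catalog, for illustration only.
CATALOG = [
    ModelProfile("small", accuracy=0.70, latency_ms=5.0, memory_mb=20.0),
    ModelProfile("medium", accuracy=0.80, latency_ms=12.0, memory_mb=60.0),
    ModelProfile("large", accuracy=0.88, latency_ms=30.0, memory_mb=200.0),
]

if __name__ == "__main__":
    choice = select_model(CATALOG, min_accuracy=0.75, memory_budget_mb=100.0)
    print(choice.name if choice else "no feasible model on this device")
```

The caller states only an accuracy requirement; the device's memory budget and the catalog stay hidden behind the selection function, which is the abstraction the abstract describes.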
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Scientific Computing and Data Management · Gaussian Processes and Bayesian Inference
