Characterizing Network Requirements for GPU API Remoting in AI Applications

Tianxia Wang, Zhuofu Chen, Xingda Wei, Jinyu Gu, Rong Chen, and Haibo Chen

arXiv:2401.13354·cs.OS·January 25, 2024·1 cites

Open Access

TL;DR

This paper analyzes the network latency and bandwidth requirements for GPU remoting in AI applications, demonstrating that with proper design, unmodified AI applications can run efficiently over commodity networks without performance loss.

Contribution

It provides a GPU-centric theoretical model to determine minimal network requirements for GPU remoting, enabling efficient AI application deployment.

Findings

1. Unmodified AI applications can run on a remoting setup with minimal network demands.

2. Careful remoting design can even improve AI application performance.

3. Commodity networking hardware suffices for GPU remoting; the latency and bandwidth it must provide are low.

Abstract

GPU remoting is a promising technique for supporting AI applications. Networking plays a key role in enabling remoting. However, the network latency and bandwidth required for efficient remoting are unknown. In this paper, we take a GPU-centric approach to derive the minimum latency and bandwidth requirements for GPU remoting, while ensuring no (or little) performance degradation for AI applications. Our study, which includes a theoretical model, demonstrates that, with careful remoting design, unmodified AI applications can run on a remoting setup using commodity networking hardware with low network demands, incurring no overhead or even achieving better performance.
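To make the abstract's claim about "careful remoting design" concrete, here is a back-of-envelope sketch. It is not the paper's theoretical model, which this page does not reproduce. It is a toy calculation under stated assumptions: every name (`Workload`, `sync_time`, `async_time`) and every number is hypothetical, the synchronous design is assumed to pay one round trip per API call, and the asynchronous design is assumed to overlap network transfer with GPU execution and to block only on calls whose results the CPU needs.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical per-iteration profile of an AI application."""
    gpu_time_s: float       # GPU execution time per iteration (seconds)
    num_calls: int          # GPU API calls issued per iteration
    bytes_per_call: float   # average payload forwarded per call (bytes)
    sync_calls: int         # calls whose result the CPU must wait for


def sync_time(w: Workload, rtt_s: float, bw_bytes_per_s: float) -> float:
    """Assumed naive design: every call waits for a full round trip."""
    per_call = rtt_s + w.bytes_per_call / bw_bytes_per_s
    return w.gpu_time_s + w.num_calls * per_call


def async_time(w: Workload, rtt_s: float, bw_bytes_per_s: float) -> float:
    """Assumed asynchronous design: transfers overlap with GPU work,
    and only the blocking calls expose a round trip."""
    transfer = w.num_calls * w.bytes_per_call / bw_bytes_per_s
    blocking = w.sync_calls * rtt_s
    return max(w.gpu_time_s, transfer) + blocking


if __name__ == "__main__":
    # Made-up numbers: 10 ms of GPU work, 1000 small calls, one blocking call,
    # 10 microsecond round trip, 10 Gbit/s link (1.25e9 bytes per second).
    w = Workload(gpu_time_s=0.010, num_calls=1000, bytes_per_call=100, sync_calls=1)
    rtt, bw = 10e-6, 1.25e9
    print(f"local      : {w.gpu_time_s * 1e3:.3f} ms")
    print(f"sync remote: {sync_time(w, rtt, bw) * 1e3:.3f} ms")
    print(f"async remote: {async_time(w, rtt, bw) * 1e3:.3f} ms")
```

In this toy setting the per-call round trips dominate the synchronous design, while the asynchronous design stays close to local execution time. That is only the intuition behind the claim that remoting design matters more than raw network speed; the paper's own model should be consulted for the actual requirements.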

Taxonomy

Topics: Ferroelectric and Negative Capacitance Devices · Caching and Content Delivery · Stochastic Gradient Optimization Techniques