Characterizing Network Requirements for GPU API Remoting in AI Applications
Tianxia Wang, Zhuofu Chen, Xingda Wei, Jinyu Gu, Rong Chen, and Haibo Chen

TL;DR
This paper analyzes the network latency and bandwidth requirements for GPU remoting in AI applications, demonstrating that with proper design, unmodified AI applications can run efficiently over commodity networks without performance loss.
Contribution
It provides a GPU-centric theoretical model to determine minimal network requirements for GPU remoting, enabling efficient AI application deployment.
Findings
Unmodified AI applications can run on remote GPUs with low network demands.
With careful remoting design, AI applications can match or even exceed their non-remoted performance.
Commodity networking hardware suffices for GPU remoting, because the latency and bandwidth it requires are modest.
Abstract
GPU remoting is a promising technique for supporting AI applications. Networking plays a key role in enabling remoting. However, the network latency and bandwidth required for efficient remoting are unknown. In this paper, we take a GPU-centric approach to derive the minimum latency and bandwidth requirements for GPU remoting, while ensuring no (or little) performance degradation for AI applications. Our study, which includes a theoretical model, demonstrates that, with careful remoting design, unmodified AI applications can run in a remoting setup on commodity networking hardware with low network demands, incurring no overhead or even achieving better performance.
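To make the idea of "minimum latency and bandwidth requirements" concrete, the following is a rough back-of-envelope sketch. It is not the paper's theoretical model, which this summary does not reproduce. It assumes a deliberately simple cost model in which every blocking API call pays one network round trip and all payload bytes pay a transfer cost, with no overlap between communication and GPU execution. The function names, parameters, and the 5% slowdown budget are all hypothetical.

```python
import math


def remoting_overhead_s(sync_calls, payload_bytes, rtt_s, bandwidth_bytes_per_s):
    """Extra time added by the network under the assumed (toy) cost model.

    Assumption: each blocking API call costs one round trip, and payload
    transfer is never overlapped with GPU execution.
    """
    return sync_calls * rtt_s + payload_bytes / bandwidth_bytes_per_s


def max_rtt_s(local_time_s, sync_calls, payload_bytes,
              bandwidth_bytes_per_s, slowdown=0.05):
    """Largest round-trip time that keeps the slowdown within the budget.

    `slowdown` is a hypothetical tolerance (5% of local run time by default).
    """
    if sync_calls == 0:
        return math.inf  # no blocking calls, so latency does not matter here
    budget_s = local_time_s * slowdown
    transfer_s = payload_bytes / bandwidth_bytes_per_s
    # Whatever remains of the budget after transfers is split across round trips.
    return max(0.0, (budget_s - transfer_s) / sync_calls)


if __name__ == "__main__":
    # Hypothetical workload: 1 s of local run time, 10 blocking calls, 1 MB moved.
    print(max_rtt_s(1.0, 10, 1_000_000, 1e9))
```

Under this toy model, reducing the number of blocking calls (for example by issuing API calls asynchronously) directly relaxes the latency requirement, which is one way to read the abstract's emphasis on careful remoting design.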
Taxonomy
Topics: Ferroelectric and Negative Capacitance Devices · Caching and Content Delivery · Stochastic Gradient Optimization Techniques
