The Landscape of GPU-Centric Communication

Didem Unat; Ilyas Turimbetov; Mohammed Kefah Taha Issa; Doğan Sağbili; Flavio Vella; Daniele De Sensi; Ismayil Ismayilov

arXiv:2409.09874·cs.DC·April 24, 2026

TL;DR

This paper surveys GPU-centric communication methods, vendor tools, and libraries, highlighting their roles in improving multi-GPU scalability and performance in HPC and ML applications.

Contribution

The paper maps the landscape of GPU-centric communication: it clarifies terminology, categorizes existing approaches, and discusses future research directions.

Findings

01. Vendor mechanisms reduce CPU involvement in multi-GPU communication.

02. Major libraries offer diverse benefits and face specific challenges.

03. Performance insights guide optimal exploitation of multi-GPU systems.

Abstract

In recent years, GPUs have become the preferred accelerators for HPC and ML applications due to their parallelism and fast memory bandwidth. While GPUs boost computation, inter-GPU communication can create scalability bottlenecks, especially as the number of GPUs per node and cluster grows. Traditionally, the CPU managed multi-GPU communication, but advancements in GPU-centric communication now challenge this CPU dominance by reducing its involvement, granting GPUs more autonomy in communication tasks, and addressing mismatches in multi-GPU communication and computation. This paper provides a landscape of GPU-centric communication, focusing on vendor mechanisms and user-level library supports. It aims to clarify the complexities and diverse options in this field, define the terminology, and categorize existing approaches within and across nodes. The paper discusses vendor-provided…
