fabric-lib: RDMA Point-to-Point Communication for LLM Systems

Nandor Licker (1); Kevin Hu (1); Vladimir Zaytsev (1); Lequn Chen (1) ((1) Perplexity AI)

arXiv:2510.27656·cs.DC·April 14, 2026

TL;DR

fabric-lib provides a hardware-agnostic, high-throughput point-to-point communication library for large language model systems, enabling flexible, portable, and efficient data transfer across diverse NICs.

Contribution

The paper introduces fabric-lib, an abstraction layer that bridges the functionality of common NICs behind a uniform point-to-point interface, so LLM systems stay portable across hardware providers without giving up throughput.

Findings

01

Achieves 400 Gbps peak throughput on NVIDIA ConnectX-7 and AWS EFA.

02

Enables disaggregated inference with dynamic scaling in production systems.

03

Reduces latency for trillion-parameter RL weight updates to 1.3 seconds.

Abstract

Emerging Large Language Model (LLM) system patterns, such as disaggregated inference, Mixture-of-Experts (MoE) routing, and asynchronous reinforcement fine-tuning, require flexible point-to-point communication beyond simple collectives. Existing implementations are locked to specific Network Interface Controllers (NICs), hindering integration into inference engines and portability across hardware providers. We present fabric-lib, which bridges the functionality of common NICs to expose a uniform interface. fabric-lib exposes one-sided WriteImm operations with an ImmCounter primitive for completion notification, makes no assumptions about the ordering of the network transport, and transparently manages multiple NICs per GPU. We demonstrate peak throughput of 400 Gbps on both NVIDIA ConnectX-7 and AWS Elastic Fabric Adapter (EFA). We showcase fabric-lib through three production systems: (1) KvCache transfer…
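
To make the counter-based notification idea concrete, here is a minimal Python sketch of how a receiver could detect completion by counting immediate values rather than relying on arrival order. It is an illustration under stated assumptions, not fabric-lib's API: the class `ImmCounterSketch`, its methods `on_write_imm` and `wait`, and the helper `simulate_transfer` are hypothetical names, and the "network" is simulated by shuffling chunk delivery in memory.

```python
import random
import threading
from collections import defaultdict
from typing import Optional


class ImmCounterSketch:
    """Hypothetical stand-in for a completion counter keyed by immediate value.

    Assumption: each one-sided write carries an immediate value, and the
    receiver only needs to know how many writes with that value have landed.
    """

    def __init__(self) -> None:
        self._counts = defaultdict(int)
        self._cond = threading.Condition()

    def on_write_imm(self, imm: int) -> None:
        # Called once per arrived write; arrival order is irrelevant.
        with self._cond:
            self._counts[imm] += 1
            self._cond.notify_all()

    def wait(self, imm: int, expected: int, timeout: Optional[float] = None) -> bool:
        # Block until `expected` writes tagged `imm` have arrived, or time out.
        with self._cond:
            return self._cond.wait_for(
                lambda: self._counts[imm] >= expected, timeout
            )


def simulate_transfer(num_chunks: int, imm: int, seed: int = 0) -> bool:
    """Deliver chunks in a shuffled order and report whether completion fires."""
    counter = ImmCounterSketch()
    order = list(range(num_chunks))
    random.Random(seed).shuffle(order)  # emulate an unordered transport
    for _chunk in order:
        counter.on_write_imm(imm)
    return counter.wait(imm, expected=num_chunks, timeout=1.0)
```

Because the receiver compares a count against an expected total, the same check works whether chunks arrive in order, out of order, or spread across several NICs, which is the property the abstract attributes to the ImmCounter primitive.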

Code & Models

Repositories

perplexityai/pplx-garden (GitHub)