Compute Can't Handle the Truth: Why Communication Tax Prioritizes Memory and Interconnects in Modern AI Infrastructure

Myoungsoo Jung

arXiv:2507.07223·cs.DC·July 15, 2025

Open Access

TL;DR

This paper analyzes the scalability challenges of modern AI hardware and proposes a modular, disaggregated data center architecture utilizing CXL and optimized interconnects to improve scalability and efficiency.

Contribution

It introduces a novel modular data center architecture with CXL and hybrid interconnects, addressing scalability bottlenecks in AI hardware infrastructure.

Findings

01. Enhanced scalability and throughput demonstrated in evaluations.
02. Hybrid CXL-over-XLink design reduces data transfer overhead.
03. Hierarchical memory models improve resource flexibility.

Abstract

Modern AI workloads such as large language models (LLMs) and retrieval-augmented generation (RAG) impose severe demands on memory, communication bandwidth, and resource flexibility. Traditional GPU-centric architectures struggle to scale due to growing inter-GPU communication overheads. This report introduces key AI concepts and explains how Transformers revolutionized data representation in LLMs. We analyze large-scale AI hardware and data center designs, identifying scalability bottlenecks in hierarchical systems. To address these, we propose a modular data center architecture based on Compute Express Link (CXL) that enables disaggregated scaling of memory, compute, and accelerators. We further explore accelerator-optimized interconnects, collectively termed XLink (e.g., UALink, NVLink, NVLink Fusion), and introduce a hybrid CXL-over-XLink design to reduce long-distance data transfers…

Taxonomy

Topics: Computability, Logic, AI Algorithms