Ultra Ethernet's Design Principles and Architectural Innovations
Torsten Hoefler, Karen Schramm, Eric Spada, Keith Underwood, Cedell Alexander, Bob Alverson, Paul Bottorff, Adrian Caulfield, Mark Handley, Cathy Huang, Costin Raiciu, Abdul Kabbani, Eugene Opsasnick, Rong Pan, Adee Ran, Rip Sohan

TL;DR
Ultra Ethernet (UE) 1.0 introduces a high-performance Ethernet standard with innovative transport protocols designed for AI and HPC systems, emphasizing hardware acceleration and leveraging Ethernet's ecosystem for extreme-scale communication.
Contribution
The paper presents the design principles and architectural innovations of Ultra Ethernet, including the novel Ultra Ethernet Transport (UET) protocol for reliable, fast, and efficient high-performance networking.
Findings
UET is designed as a potentially fully hardware-accelerated protocol for reliable, fast, and efficient communication in extreme-scale systems.
UE leverages the Ethernet ecosystem for high-performance AI and HPC networking.
Unlike InfiniBand, which was standardized over two decades ago, UE leverages the 1,000x gains in computational efficiency per moved bit.
Abstract
The recently released Ultra Ethernet (UE) 1.0 specification defines a transformative High-Performance Ethernet standard for future Artificial Intelligence (AI) and High-Performance Computing (HPC) systems. This paper, written by the specification's authors, provides a high-level overview of UE's design, offering crucial motivations and scientific context to understand its innovations. While UE introduces advancements across the entire Ethernet stack, its standout contribution is the novel Ultra Ethernet Transport (UET), a potentially fully hardware-accelerated protocol engineered for reliable, fast, and efficient communication in extreme-scale systems. Unlike InfiniBand, the last major standardization effort in high-performance networking over two decades ago, UE leverages the expansive Ethernet ecosystem and the 1,000x gains in computational efficiency per moved bit to deliver a new…
Taxonomy
Topics: Network Time Synchronization Technologies · Advanced Optical Network Technologies · Software-Defined Networks and 5G
