Scaling Out Chip Interconnect Networks with Implicit Sequence Numbers
Giyong Jung, Saeid Gorgin, John Kim, Jungrae Kim

TL;DR
This paper presents Implicit Sequence Numbers (ISN) and Reliability Extended Link (RXL), two mechanisms that improve the reliability and scalability of multi-chip interconnects such as CXL by providing flit drop detection and in-order delivery without header overhead.
Contribution
The paper introduces two mechanisms: ISN, which provides precise flit drop detection, and RXL, an extension of CXL that enables scalable, reliable multi-node interconnects with end-to-end sequence integrity.
Findings
ISN enables in-order delivery without header overhead.
RXL extends CXL for scalable, reliable interconnects.
Transport-layer CRC enhances end-to-end data integrity.
Abstract
As AI models outpace the capabilities of single processors, interconnects across chips have become a critical enabler for scalable computing. These processors exchange massive amounts of data at cache-line granularity, prompting the adoption of new interconnect protocols like CXL, NVLink, and UALink, designed for high bandwidth and small payloads. However, the increasing transfer rates of these protocols heighten susceptibility to errors. While mechanisms like Cyclic Redundancy Check (CRC) and Forward Error Correction (FEC) are standard for reliable data transmission, scaling chip interconnects to multi-node configurations introduces new challenges, particularly in managing silently dropped flits in switching devices. This paper introduces Implicit Sequence Number (ISN), a novel mechanism that ensures precise flit drop detection and in-order delivery without adding header overhead.…
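The abstract does not say how a sequence number can be checked without occupying header bits. One way to picture the idea is sketched below, under a loudly stated assumption: sender and receiver each keep a local counter, and the counter value is mixed into the CRC input rather than transmitted. This is an illustration, not the paper's confirmed design. The names `make_flit`, `Receiver`, and `SEQ_BITS`, the 2-byte counter encoding, and the use of CRC-32 from Python's `zlib` are all hypothetical choices made for this example.

```python
import zlib

SEQ_BITS = 10  # hypothetical counter width, chosen only for this sketch


def seq_bytes(seq: int) -> bytes:
    """Encode the wrapped sequence number as 2 bytes (an assumed encoding)."""
    return (seq % (1 << SEQ_BITS)).to_bytes(2, "big")


def make_flit(payload: bytes, seq: int) -> bytes:
    """Build a flit as payload + CRC. The sequence number feeds the CRC
    but is never placed on the wire, so it costs no header bits."""
    crc = zlib.crc32(seq_bytes(seq) + payload)
    return payload + crc.to_bytes(4, "big")


class Receiver:
    """Tracks the next expected sequence number locally."""

    def __init__(self) -> None:
        self.expected = 0

    def accept(self, flit: bytes):
        """Return the payload if the flit matches the expected sequence
        number, else None (corrupted, or an earlier flit was dropped)."""
        payload, crc = flit[:-4], int.from_bytes(flit[-4:], "big")
        if zlib.crc32(seq_bytes(self.expected) + payload) != crc:
            return None  # a real link would trigger a retry here
        self.expected = (self.expected + 1) % (1 << SEQ_BITS)
        return payload


# Usage: a switch silently drops flit 1, so flit 2 arrives early.
flits = [make_flit(bytes([65 + i]) * 8, i) for i in range(3)]
rx = Receiver()
first = rx.accept(flits[0])     # in order: accepted
skipped = rx.accept(flits[2])   # arrives after a drop: CRC mismatch
resent = rx.accept(flits[1])    # retransmitted flit 1: accepted
```

In this sketch a dropped flit shows up as a CRC mismatch on the next flit, because the receiver's counter no longer matches the one the sender used. The receiver cannot tell corruption from a drop by the mismatch alone; how the actual protocol distinguishes and recovers from the two cases is not covered by the text above.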
