Cerberus: Cross-Layer ECC Co-Design for Robust and Efficient Memory Protection
Junhwan Kim, Seunghyun Kim, Yesin Ryu, Saeid Gorgin, and Jungrae Kim

TL;DR
Cerberus introduces a cross-layer ECC co-design that unifies memory protection across device, link, and system layers, enhancing resilience and reducing redundancy in next-generation memory systems.
Contribution
Cerberus proposes an Encode-Once, Decode-Many (EODM) architecture for unified ECC: the controller encodes once, and the link, on-die, and system layers each reuse that redundancy for their own detection or correction role.
Findings
Improved resilience to clustered and peripheral faults.
Reduced redundant overhead in memory protection.
Effective coordination of ECC layers enhances system robustness.
Abstract
As DRAM scales to higher density and I/O speeds, ensuring data correctness becomes increasingly difficult. Industry has responded with a three-layer stack: on-die ECC (O-ECC), link ECC (L-ECC), and system ECC (S-ECC). However, these layers have evolved independently, often duplicating redundancy, leaving coverage gaps, and occasionally interfering. We propose Cerberus, a cross-layer ECC co-design that unifies protection across device, link, and system while preserving the native role of each layer. At its core is an Encode-Once, Decode-Many (EODM) architecture: the controller performs a single encoding whose redundancy is reused by L-ECC for immediate write-path detection and retry, by O-ECC for in-device repair on reads, and by S-ECC for strong end-to-end recovery. Cerberus jointly designs complementary parity and syndrome structures, orders decoders, and allocates the correction…
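The Encode-Once, Decode-Many idea can be illustrated with a toy sketch. This is not the paper's code construction: it assumes a textbook extended Hamming (8,4) SECDED code as a stand-in, and the function names (`encode`, `link_check`, `on_die_repair`, `system_decode`) are hypothetical. It only shows how one codeword, produced by a single encoding, might be consumed by three decoders with different policies: detect-and-retry at the link, single-bit repair in the device, and correct-or-report at the system level.

```python
# Toy illustration only: extended Hamming (8,4) stands in for the real code.
# All names here are hypothetical and not taken from the paper.

def encode(nibble):
    """Encode 4 data bits once; every layer reuses this codeword."""
    d = [(nibble >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]
    p2 = d[0] ^ d[2] ^ d[3]
    p4 = d[1] ^ d[2] ^ d[3]
    # Positions 1..7 hold p1, p2, d0, p4, d1, d2, d3; index 7 is overall parity.
    word = [p1, p2, d[0], p4, d[1], d[2], d[3]]
    word.append(sum(word) % 2)
    return word

def syndrome(word):
    """Return (Hamming syndrome over positions 1..7, overall parity)."""
    s = 0
    for pos in range(1, 8):
        if word[pos - 1]:
            s ^= pos
    return s, sum(word) % 2

def link_check(word):
    """Link-layer policy: detect only, so the sender can retry the write."""
    return syndrome(word) == (0, 0)

def on_die_repair(word):
    """Device-layer policy: fix a single-bit error, pass anything else on."""
    s, parity = syndrome(word)
    fixed = list(word)
    if parity == 1:  # odd number of flips, assumed to be exactly one
        fixed[(s - 1) if s else 7] ^= 1
    return fixed

def system_decode(word):
    """System-layer policy: correct single errors, report double errors."""
    s, parity = syndrome(word)
    if s != 0 and parity == 0:
        return None, "uncorrectable"
    fixed = on_die_repair(word)
    data = fixed[2] | (fixed[4] << 1) | (fixed[5] << 2) | (fixed[6] << 3)
    status = "ok" if (s, parity) == (0, 0) else "corrected"
    return data, status
```

In this sketch the three layers share a single syndrome computation and differ only in what they do with it. A real cross-layer design would presumably give each layer a different correction strength, which a code this small cannot show.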
