Chip Guard ECC: An Efficient, Low Latency Method
Tanj Bennett

TL;DR
Chip Guard ECC is a scalable, low-latency symbol-correcting error correction method. Its DDR5 version supports whole-chip correction with a failure chance below 1 in 10^12, and lets the number of metadata bits be traded against reliability and probity (reporting of uncorrectable faults).
Contribution
The paper presents a new symbol-correcting ECC approach that scales to various data burst sizes and reliability levels, and describes a specific version for DDR5 that supports whole-chip correction.
Findings
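Should correct all bounded faults of a single chip
Less than 1 in 10^12 chance of failing to correct unbounded faults in one chip
Less than 1 in 10^12 chance of failing to detect an uncorrected fault that affects multiple chips
Supports various numbers of metadata bits, with defined tradeoffs for reliability and probity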
Abstract
Chip Guard is a new approach to symbol-correcting error correction codes. It can be scaled to various data burst sizes and reliability levels. A specific version for DDR5 is described. It uses the usual DDR5 configuration of 8 data chips, plus 2 chips for ECC and metadata, with 64-bit bursts per chip, to support whole-chip correction reliably and with high probity (reporting of uncorrectable faults). Various numbers of metadata bits may be supported with defined tradeoffs for reliability and probity. The method should correct all bounded faults of a single chip, with less than 1 in 10^12 chance of failing to correct unbounded faults in one chip, or less than 1 in 10^12 chance of failure to detect an uncorrected fault which affects multiple chips.
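The bit budget implied by the DDR5 configuration in the abstract can be sketched with some simple arithmetic. The Python below is an illustration only, not the paper's construction: the function names `burst_budget` and `min_check_bits_for` are hypothetical, and the 2^-r estimate is a generic counting heuristic (a random corrupted word passes an r-bit check with probability about 2^-r), assumed here rather than taken from the paper's analysis.

```python
import math

# Figures from the abstract: 8 data chips, 2 chips for ECC and metadata,
# 64-bit bursts per chip.
DATA_CHIPS = 8
EXTRA_CHIPS = 2
BITS_PER_CHIP_BURST = 64


def burst_budget(metadata_bits: int) -> dict:
    """Split one burst into data, metadata, and remaining check bits.

    Hypothetical helper: the abstract does not say how the two extra chips
    are divided between ECC and metadata, so metadata_bits is a free input.
    """
    data_bits = DATA_CHIPS * BITS_PER_CHIP_BURST
    extra_bits = EXTRA_CHIPS * BITS_PER_CHIP_BURST
    if not 0 <= metadata_bits <= extra_bits:
        raise ValueError("metadata_bits must fit in the two extra chips")
    return {
        "data_bits": data_bits,
        "extra_bits": extra_bits,
        "metadata_bits": metadata_bits,
        "check_bits": extra_bits - metadata_bits,
    }


def min_check_bits_for(target_probability: float) -> int:
    """Smallest r with 2**-r <= target_probability (generic heuristic)."""
    return math.ceil(-math.log2(target_probability))


if __name__ == "__main__":
    print(burst_budget(metadata_bits=16))
    print(min_check_bits_for(1e-12))
```

Under this heuristic, each metadata bit carried in the extra chips is one bit fewer available for checking, which is one way to picture the reliability and probity tradeoff the abstract mentions. How Chip Guard actually allocates those bits is defined in the paper, not here.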
Taxonomy
Topics: VLSI and Analog Circuit Testing · Radiation Effects in Electronics · Integrated Circuits and Semiconductor Failure Analysis
