Managed-Retention Memory: A New Class of Memory for the AI Era
Sergey Legtchenko, Ioan Stefanovici, Richard Black, Antony Rowstron, Junyi Liu, Paolo Costa, Burcu Canakci, Dushyanth Narayanan, Xingbo Wu

TL;DR
This paper introduces Managed-Retention Memory (MRM), a novel memory class optimized for AI inference workloads, addressing limitations of High Bandwidth Memory (HBM) by balancing density, read bandwidth, and energy efficiency.
Contribution
The paper proposes MRM as a new memory solution tailored for AI workloads, leveraging workload-specific trade-offs to improve performance and cost-effectiveness over traditional memory types.
Findings
MRM offers improved read bandwidth and energy efficiency for AI inference.
MRM may provide a path to viability for technologies originally proposed for Storage Class Memory (SCM).
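Analysis shows HBM is overprovisioned on write performance but underprovisioned on density and read bandwidth, the areas where MRM is positioned to do better for AI inference.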
Abstract
AI clusters today are one of the major uses of High Bandwidth Memory (HBM). However, HBM is suboptimal for AI workloads for several reasons. Analysis shows HBM is overprovisioned on write performance, but underprovisioned on density and read bandwidth, and also has significant energy per bit overheads. It is also expensive, with lower yield than DRAM due to manufacturing complexity. We propose a new memory class: Managed-Retention Memory (MRM), which is more optimized to store key data structures for AI inference workloads. We believe that MRM may finally provide a path to viability for technologies that were originally proposed to support Storage Class Memory (SCM). These technologies traditionally offered long-term persistence (10+ years) but provided poor IO performance and/or endurance. MRM makes different trade-offs, and by understanding the workload IO patterns, MRM foregoes…
Taxonomy
Topics: Ferroelectric and Negative Capacitance Devices · Parallel Computing and Optimization Techniques
