A System-Level Framework for Analytical and Empirical Reliability Exploration of STT-MRAM Caches
Elham Cheshmikhani, Hamed Farbeh, Hossein Asadi

TL;DR
This paper introduces a system-level framework for analyzing the overall reliability of STT-MRAM caches. It jointly considers the three main error types (retention failure, read disturbance, and write failure), workload behavior, and process variations, and shows that cache vulnerability fluctuates widely across workloads and under process variations.
Contribution
It presents the first integrated model for STT-MRAM cache vulnerability that accounts for error interdependencies, workload effects, and process variations, enhancing reliability assessment accuracy.
Findings
Total error rate varies by 32x across workloads.
Vulnerability increases by 6.5x due to process variations.
Error contributions differ significantly with cache access patterns.
Abstract
Spin-Transfer Torque Magnetic RAM (STT-MRAM) is known as the most promising replacement for SRAM technology in large Last-Level Caches (LLCs). Despite its major advantages of high density, non-volatility, near-zero leakage power, and immunity to radiation, STT-MRAM-based caches suffer from high error rates, mainly due to retention failure, read disturbance, and write failure. Existing studies are limited to estimating the rate of only one or two of these error types for STT-MRAM caches. However, the overall vulnerability of STT-MRAM caches, whose estimation is essential for designing cost-efficient reliable caches, has not been reported in any previous study. In this paper, we propose a system-level framework for reliability exploration and characterization of error behavior in STT-MRAM caches. To this end, we formulate the cache vulnerability considering the inter-correlation of…
Taxonomy
Topics: Semiconductor materials and devices · Advanced Data Storage Technologies · Parallel Computing and Optimization Techniques
