Designing Efficient and High-performance AI Accelerators with Customized STT-MRAM
Kaniz Mishty, Mehdi Sadi

TL;DR
This paper presents a novel AI accelerator design using customized STT-MRAM technology, optimizing performance, energy, and area through model-driven design and process-aware adjustments, demonstrating significant improvements over SRAM-based systems.
Contribution
The paper introduces a reconfigurable, scratchpad-assisted STT-MRAM based buffer system and a design methodology for high-performance AI accelerators with process and temperature-aware optimization.
Findings
75% area savings over SRAM-based accelerators
3% power savings over SRAM-based accelerators at iso-accuracy
Maintains high performance when the bit error rate is relaxed
Abstract
In this paper, we demonstrate the design of efficient and high-performance AI/Deep Learning accelerators with customized STT-MRAM and a reconfigurable core. Based on model-driven detailed design space exploration, we present the design methodology of an innovative scratchpad-assisted on-chip STT-MRAM-based buffer system for high-performance accelerators. Using an analytically derived expression for the memory occupancy time of AI model weights and activation maps, the volatility of STT-MRAM is adjusted with process- and temperature-variation-aware scaling of the thermal stability factor to optimize the retention time, energy, read/write latency, and area of STT-MRAM. From the analysis of modern AI workloads and an accelerator implementation in 14nm technology, we verify the efficacy of our designed AI accelerator with STT-MRAM (STT-AI). Compared to an SRAM-based implementation, the STT-AI accelerator…
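The link between retention time and thermal stability factor mentioned in the abstract can be illustrated with the standard Néel-Arrhenius relaxation model. This is a minimal sketch, not the paper's own expression: the 1 ns attempt period, the function names, and the example occupancy times are all assumptions, and process and temperature variation are not modeled.

```python
import math

TAU0_S = 1e-9  # assumed attempt period (~1 ns); a commonly used textbook value


def retention_failure_prob(delta, t_s, tau0_s=TAU0_S):
    """Probability that a stored bit flips within t_s seconds.

    Néel-Arrhenius model: P = 1 - exp(-t / (tau0 * exp(delta))).
    """
    return -math.expm1(-t_s / (tau0_s * math.exp(delta)))


def min_delta(t_s, p_fail, tau0_s=TAU0_S):
    """Smallest thermal stability factor keeping the flip probability
    at or below p_fail over a data occupancy time of t_s seconds."""
    return math.log(t_s / (tau0_s * -math.log1p(-p_fail)))


if __name__ == "__main__":
    # Hypothetical occupancy times, for illustration only.
    for t_s in (1e-3, 1.0, 3.15e8):  # 1 ms, 1 s, roughly 10 years
        print(f"t = {t_s:g} s -> delta >= {min_delta(t_s, 1e-9):.1f}")
```

Under this model, data that only needs to survive for a short occupancy time tolerates a much smaller thermal stability factor than long-term storage does, which is the kind of headroom that retention-time scaling relies on.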
Taxonomy
Methods: Attentive Walk-Aggregating Graph Neural Network
