DeepNVM++: Cross-Layer Modeling and Optimization Framework of Non-Volatile Memories for Deep Learning
Ahmet Inci, Mehmet Meric Isgenc, Diana Marculescu

TL;DR
DeepNVM++ is a framework that models and optimizes non-volatile memory (NVM) caches in GPU architectures for deep learning workloads, improving energy efficiency and cache capacity relative to conventional SRAM.
Contribution
It introduces a cross-layer modeling framework for NVM-based GPU caches, combining circuit-level models with workload behavior for deep learning applications.
Findings
STT-MRAM and SOT-MRAM reduce energy-delay product by up to 4.7x and 2.3x compared to SRAM.
In iso-capacity scenarios, these NVM technologies offer up to 3.8x energy-delay product reduction and 2.8x area reduction.
The framework enables scalable analysis of NVM benefits for large cache capacities.
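The energy-delay product (EDP) used in the findings above is the product of energy consumed and execution time, and a reduction factor is the ratio of the baseline's EDP to the candidate's. A minimal sketch of that arithmetic follows; the `CacheProfile` type, the `edp_reduction` helper, and the numeric values are hypothetical placeholders chosen for illustration, not measurements or code from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheProfile:
    """Hypothetical per-workload totals for one cache technology."""
    name: str
    energy_j: float  # total cache energy in joules (placeholder value)
    delay_s: float   # total execution time in seconds (placeholder value)

    @property
    def edp(self) -> float:
        # Energy-delay product: lower is better.
        return self.energy_j * self.delay_s


def edp_reduction(baseline: CacheProfile, candidate: CacheProfile) -> float:
    """Return how many times smaller the candidate's EDP is than the baseline's."""
    return baseline.edp / candidate.edp


# Illustrative numbers only; they are NOT taken from the paper.
sram = CacheProfile("SRAM", energy_j=2.0, delay_s=1.0)
nvm = CacheProfile("NVM", energy_j=0.5, delay_s=1.25)

print(f"{nvm.name} vs {sram.name}: {edp_reduction(sram, nvm):.2f}x EDP reduction")
```

With these placeholder values the candidate is slower but uses much less energy, so its EDP is still lower. This is why EDP, rather than energy or latency alone, is a convenient single figure for comparing cache technologies.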
Abstract
Non-volatile memory (NVM) technologies such as spin-transfer torque magnetic random access memory (STT-MRAM) and spin-orbit torque magnetic random access memory (SOT-MRAM) have significant advantages compared to conventional SRAM due to their non-volatility, higher cell density, and scalability features. While previous work has investigated several architectural implications of NVM for generic applications, in this work we present DeepNVM++, a framework to characterize, model, and analyze NVM-based caches in GPU architectures for deep learning (DL) applications by combining technology-specific circuit-level models and the actual memory behavior of various DL workloads. We present both iso-capacity and iso-area performance and energy analysis for systems whose last-level caches rely on conventional SRAM and emerging STT-MRAM and SOT-MRAM technologies. In the iso-capacity case, STT-MRAM…
