DeepNVM++: Cross-Layer Modeling and Optimization Framework of Non-Volatile Memories for Deep Learning

Ahmet Inci; Mehmet Meric Isgenc; Diana Marculescu

arXiv:2012.04559 · cs.AR · May 23, 2022

TL;DR

DeepNVM++ is a comprehensive framework that models and optimizes non-volatile memory caches in GPU architectures, significantly improving energy efficiency and capacity for deep learning workloads compared to traditional SRAM.

Contribution

It introduces a cross-layer modeling framework for NVM-based GPU caches, combining circuit-level models with workload behavior for deep learning applications.

Findings

01

STT-MRAM and SOT-MRAM reduce energy-delay product by up to 4.7x and 2.3x compared to SRAM.

02

These NVM technologies offer up to 3.8x energy-delay reduction and 2.8x area reduction in iso-capacity scenarios.

03

The framework enables scalable analysis of NVM benefits for large cache capacities.
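
The findings above are stated in terms of energy-delay product (EDP), which is energy multiplied by execution time; a reduction factor is the baseline EDP divided by the candidate EDP. A minimal sketch of that arithmetic follows. The function names and the numeric inputs are hypothetical placeholders chosen for illustration, not values or code from the paper or the DeepNVM++ framework.

```python
def edp(energy_j: float, delay_s: float) -> float:
    """Energy-delay product: energy consumed times execution time."""
    return energy_j * delay_s


def edp_reduction(baseline_edp: float, candidate_edp: float) -> float:
    """How many times smaller the candidate's EDP is than the baseline's."""
    return baseline_edp / candidate_edp


# Hypothetical placeholder inputs (NOT measurements from the paper):
# a baseline cache and a candidate cache that uses less energy
# but makes the workload run somewhat longer.
baseline = edp(energy_j=2.0, delay_s=1.0)
candidate = edp(energy_j=0.5, delay_s=1.25)

reduction = edp_reduction(baseline, candidate)
print(f"EDP reduction: {reduction:.1f}x")
```

Because EDP multiplies the two quantities, a technology can come out ahead even when it is slower, as long as its energy savings outweigh the added delay.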

Abstract

Non-volatile memory (NVM) technologies such as spin-transfer torque magnetic random access memory (STT-MRAM) and spin-orbit torque magnetic random access memory (SOT-MRAM) have significant advantages compared to conventional SRAM due to their non-volatility, higher cell density, and scalability features. While previous work has investigated several architectural implications of NVM for generic applications, in this work we present DeepNVM++, a framework to characterize, model, and analyze NVM-based caches in GPU architectures for deep learning (DL) applications by combining technology-specific circuit-level models and the actual memory behavior of various DL workloads. We present both iso-capacity and iso-area performance and energy analysis for systems whose last-level caches rely on conventional SRAM and emerging STT-MRAM and SOT-MRAM technologies. In the iso-capacity case, STT-MRAM…
