Accelerating Neural Network Inference with Processing-in-DRAM: From the Edge to the Cloud

Geraldo F. Oliveira; Juan Gómez-Luna; Saugata Ghose; Amirali Boroumand; Onur Mutlu

arXiv:2209.08938·cs.AR·March 28, 2023·1 cites

Open Access

TL;DR

This paper compares DRAM-based processing-in-memory architectures to evaluate their performance and energy efficiency in accelerating neural network inference from edge devices to the cloud.

Contribution

It provides a comprehensive analysis of three state-of-the-art DRAM PIM architectures, highlighting their trade-offs and suitability for different neural network workloads.

Findings

01 UPMEM outperforms a high-end GPU by 23x on memory-bound tasks

02 Mensa improves energy efficiency and throughput by over 3x compared to the Google Edge TPU

03 SIMDRAM outperforms a CPU/GPU by 16.7x/1.4x on binary neural networks

Abstract

Neural networks (NNs) are growing in importance and complexity. A neural network's performance (and energy efficiency) can be bound either by computation or memory resources. The processing-in-memory (PIM) paradigm, where computation is placed near or within memory arrays, is a viable solution to accelerate memory-bound NNs. However, PIM architectures vary in form, where different PIM approaches lead to different trade-offs. Our goal is to analyze, discuss, and contrast DRAM-based PIM architectures for NN performance and energy efficiency. To do so, we analyze three state-of-the-art PIM architectures: (1) UPMEM, which integrates processors and DRAM arrays into a single 2D chip; (2) Mensa, a 3D-stack-based PIM architecture tailored for edge devices; and (3) SIMDRAM, which uses the analog principles of DRAM to execute bit-serial operations. Our analysis reveals that PIM greatly benefits…
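The distinction between compute-bound and memory-bound workloads mentioned in the abstract can be illustrated with a simple roofline-style estimate. The sketch below is not taken from the paper: the function names, the cost formulas (which assume every operand is moved to or from memory exactly once), and the machine parameters (10 TFLOP/s peak compute, 500 GB/s memory bandwidth) are all hypothetical, chosen only to show why a matrix-vector product tends to be limited by memory while a large matrix-matrix product tends to be limited by computation.

```python
def gemv_cost(m, n, bytes_per_elem=4):
    """FLOPs and bytes moved for y = A @ x with A of shape (m, n).

    Assumes (for illustration) that A, x, and y each cross the
    memory interface exactly once.
    """
    flops = 2 * m * n                      # one multiply + one add per element of A
    bytes_moved = bytes_per_elem * (m * n + n + m)
    return flops, bytes_moved


def gemm_cost(m, n, k, bytes_per_elem=4):
    """FLOPs and bytes moved for C = A @ B, A (m, k), B (k, n), same assumption."""
    flops = 2 * m * n * k
    bytes_moved = bytes_per_elem * (m * k + k * n + m * n)
    return flops, bytes_moved


def classify(flops, bytes_moved, peak_flops, peak_bandwidth):
    """Roofline rule of thumb: compare arithmetic intensity to the ridge point."""
    intensity = flops / bytes_moved        # FLOPs per byte
    ridge = peak_flops / peak_bandwidth    # FLOPs per byte needed to saturate compute
    return "memory-bound" if intensity < ridge else "compute-bound"


# Hypothetical machine, not a real device specification.
PEAK_FLOPS = 10e12       # 10 TFLOP/s
PEAK_BANDWIDTH = 500e9   # 500 GB/s

print("GEMV:", classify(*gemv_cost(4096, 4096), PEAK_FLOPS, PEAK_BANDWIDTH))
print("GEMM:", classify(*gemm_cost(4096, 4096, 4096), PEAK_FLOPS, PEAK_BANDWIDTH))
```

Under these assumptions the matrix-vector product performs roughly half a FLOP per byte moved, far below the ridge point of 20 FLOPs per byte, so it is classified as memory-bound. This is the kind of workload that PIM architectures target.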

Taxonomy

Topics: Advanced Memory and Neural Computing · Ferroelectric and Negative Capacitance Devices · Advanced Neural Network Applications