Accelerating Neural Network Inference with Processing-in-DRAM: From the Edge to the Cloud
Geraldo F. Oliveira, Juan Gómez-Luna, Saugata Ghose, Amirali Boroumand, Onur Mutlu

TL;DR
This paper compares DRAM-based processing-in-memory architectures to evaluate their performance and energy efficiency in accelerating neural network inference from edge devices to the cloud.
Contribution
It provides a comprehensive analysis of three state-of-the-art DRAM PIM architectures, highlighting their trade-offs and suitability for different neural network workloads.
Findings
UPMEM outperforms a high-end GPU by 23x on memory-bound tasks
Mensa improves energy efficiency and throughput by over 3x compared to the Google Edge TPU
SIMDRAM outperforms a CPU/GPU by 16.7x/1.4x on binary neural networks
Abstract
Neural networks (NNs) are growing in importance and complexity. A neural network's performance (and energy efficiency) can be bound either by computation or memory resources. The processing-in-memory (PIM) paradigm, where computation is placed near or within memory arrays, is a viable solution to accelerate memory-bound NNs. However, PIM architectures vary in form, where different PIM approaches lead to different trade-offs. Our goal is to analyze, discuss, and contrast DRAM-based PIM architectures for NN performance and energy efficiency. To do so, we analyze three state-of-the-art PIM architectures: (1) UPMEM, which integrates processors and DRAM arrays into a single 2D chip; (2) Mensa, a 3D-stack-based PIM architecture tailored for edge devices; and (3) SIMDRAM, which uses the analog principles of DRAM to execute bit-serial operations. Our analysis reveals that PIM greatly benefits…
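The abstract's distinction between compute-bound and memory-bound workloads can be illustrated with the standard roofline model, which is not taken from the paper itself. The sketch below is a minimal illustration under assumed numbers: the function names, the peak throughput, and the memory bandwidth are all hypothetical and do not describe UPMEM, Mensa, SIMDRAM, or any measured system.

```python
# Roofline sketch: a kernel's attainable throughput is capped either by the
# compute roof (peak ops/s) or by the memory roof (bandwidth * ops per byte).
# All numbers used here are hypothetical and chosen only for illustration.

def attainable_throughput(peak_ops_per_s, mem_bytes_per_s, ops_per_byte):
    """Return the roofline bound in ops/s for a given arithmetic intensity."""
    return min(peak_ops_per_s, mem_bytes_per_s * ops_per_byte)


def bound_type(peak_ops_per_s, mem_bytes_per_s, ops_per_byte):
    """Classify a kernel as memory-bound or compute-bound."""
    # The ridge point is the arithmetic intensity at which both roofs meet.
    ridge = peak_ops_per_s / mem_bytes_per_s
    return "memory-bound" if ops_per_byte < ridge else "compute-bound"


if __name__ == "__main__":
    peak = 1e12        # hypothetical compute roof: 1 Tops/s
    bandwidth = 1e11   # hypothetical memory roof: 100 GB/s
    for intensity in (2.0, 50.0):  # ops per byte moved from memory
        print(intensity,
              bound_type(peak, bandwidth, intensity),
              attainable_throughput(peak, bandwidth, intensity))
```

Under this model, a layer with low arithmetic intensity sits below the ridge point, so extra compute does not help it while extra effective memory bandwidth does. That is the regime in which placing computation near or within memory is expected to pay off.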
Taxonomy
Topics: Advanced Memory and Neural Computing · Ferroelectric and Negative Capacitance Devices · Advanced Neural Network Applications
