DARTH-PUM: A Hybrid Processing-Using-Memory Architecture
Ryan Wong, Ben Feinberg, Saugata Ghose

TL;DR
DARTH-PUM is a hybrid processing-using-memory architecture that combines analog and digital PUM to enable efficient, scalable execution of diverse kernels, including cryptography and neural networks.
Contribution
The paper introduces a practical hybrid PUM architecture with optimized hardware and software, enabling general-purpose in-memory computation beyond machine learning inference.
Findings
Achieves 59.4x speedup for AES encryption
Provides 14.8x speedup for CNNs
Delivers 40.8x speedup for large language models
Abstract
Analog processing-using-memory (PUM; a.k.a. in-memory computing) makes use of electrical interactions inside memory arrays to perform bulk matrix-vector multiplication (MVM) operations. However, many popular matrix-based kernels need to execute non-MVM operations, which analog PUM cannot directly perform. To retain its energy efficiency, analog PUM architectures augment memory arrays with CMOS-based domain-specific fixed-function hardware to provide complete kernel functionality, but the difficulty of integrating such specialized CMOS logic with memory arrays has largely limited analog PUM to being an accelerator for machine learning inference, or for closely related kernels. An opportunity exists to harness analog PUM for general-purpose computation: recent works have shown that memory arrays can also perform Boolean PUM operations, albeit with very different supporting hardware and…
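To make the distinction in the abstract concrete, the following is a minimal Python sketch of the two kinds of in-memory operation it mentions: an idealized analog MVM and a bulk Boolean operation. It is an illustrative model only, not the DARTH-PUM design. The function names (analog_mvm, digital_nor, digital_xor) are hypothetical, the crossbar is assumed to be ideal (no noise, no quantization), and NOR is assumed as the Boolean primitive, as in some memristive logic families; the source does not state which primitive is used.

```python
# Illustrative model only; names and the choice of NOR are assumptions,
# not details taken from the paper.

def analog_mvm(conductances, voltages):
    """Idealized crossbar MVM: column current j = sum over rows i of G[i][j] * V[i]."""
    rows, cols = len(conductances), len(conductances[0])
    if len(voltages) != rows:
        raise ValueError("one input voltage is needed per crossbar row")
    return [
        sum(conductances[i][j] * voltages[i] for i in range(rows))
        for j in range(cols)
    ]


def digital_nor(row_a, row_b):
    """Bulk bitwise NOR over two equally long rows of stored bits."""
    return [1 - (a | b) for a, b in zip(row_a, row_b)]


def digital_xor(row_a, row_b):
    """XOR built only from NOR steps, a non-MVM operation that MVM alone cannot express."""
    nor_ab = digital_nor(row_a, row_b)
    left = digital_nor(row_a, nor_ab)
    right = digital_nor(row_b, nor_ab)
    xnor = digital_nor(left, right)
    return digital_nor(xnor, xnor)  # NOR(x, x) is NOT x


if __name__ == "__main__":
    # Analog side: a 2x2 weight matrix applied to an input vector in one step.
    print(analog_mvm([[1, 2], [3, 4]], [1, 1]))
    # Digital side: bitwise XOR of two stored rows, composed from five NOR steps.
    print(digital_xor([0, 1, 1, 0], [0, 0, 1, 1]))
```

In this sketch the MVM completes in a single step regardless of matrix size, while the XOR takes several sequential NOR steps. That contrast is one way to see why a hybrid design might route each operation to the array type that suits it.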
