GradPIM: A Practical Processing-in-DRAM Architecture for Gradient Descent
Heesu Kim, Hanmin Park, Taehyun Kim, Kwanheum Cho, Eojin Lee, Soojung Ryu, Hyuk-Jae Lee, Kiyoung Choi, Jinho Lee

TL;DR
GradPIM is a practical processing-in-DRAM architecture that accelerates the parameter-update step of deep neural network training, improving training performance and reducing memory bandwidth requirements with minimal protocol and area overhead.
Contribution
It proposes a simple, incremental PIM architecture that extends DDR4 SDRAM and exploits bank-group parallelism to improve DNN training efficiency without disrupting the existing memory protocol.
Findings
Significant performance improvement in DNN training.
Substantial reduction in memory bandwidth usage.
Minimal overhead to memory protocol and DRAM area.
Abstract
In this paper, we present GradPIM, a processing-in-memory architecture which accelerates parameter updates of deep neural network training. As one of the processing-in-memory techniques that could be realized in the near future, we propose an incremental, simple architectural design that does not invade the existing memory protocol. Extending DDR4 SDRAM to utilize bank-group parallelism makes our operation designs in the processing-in-memory (PIM) module efficient in terms of hardware cost and performance. Our experimental results show that the proposed architecture can improve the performance of DNN training and greatly reduce the memory bandwidth requirement while posing only a minimal amount of overhead to the protocol and DRAM area.
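To illustrate why the parameter-update step is a natural target for processing in memory, the sketch below shows a generic element-wise update and counts the values it moves per parameter. Momentum SGD is used here only as an illustrative optimizer; the abstract does not say which update rules GradPIM supports, and the function names, hyperparameters, and the 4-byte value size are assumptions made for this example rather than details from the paper.

```python
def sgd_momentum_update(weights, grads, velocity, lr=0.01, momentum=0.9):
    """Illustrative element-wise momentum SGD step, updating lists in place.

    Assumption: this optimizer is chosen for illustration only.
    """
    for i in range(len(weights)):
        # Each element needs only a few arithmetic operations...
        velocity[i] = momentum * velocity[i] + grads[i]
        weights[i] -= lr * velocity[i]
    return weights, velocity


def bytes_moved_per_param(bytes_per_value=4):
    """Memory traffic per parameter if the update runs outside the DRAM.

    ...but it reads three values (weight, gradient, velocity) and writes
    two (weight, velocity). bytes_per_value=4 assumes 32-bit floats.
    """
    reads, writes = 3, 2
    return (reads + writes) * bytes_per_value


if __name__ == "__main__":
    w, g, v = [1.0, 2.0], [0.5, -1.0], [0.0, 0.0]
    sgd_momentum_update(w, g, v, lr=0.1, momentum=0.9)
    print(w, v)
    print(bytes_moved_per_param(), "bytes moved per parameter")
```

Because every parameter is touched once with very little arithmetic, an update of this shape is limited by memory bandwidth rather than compute when it runs on the host, which is the kind of traffic that performing the update inside the DRAM is intended to avoid.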
Taxonomy
Topics: Advanced Memory and Neural Computing · Ferroelectric and Negative Capacitance Devices · Advanced Neural Network Applications
