Inclusive-PIM: Hardware-Software Co-design for Broad Acceleration on Commercial PIM Architectures
Johnathan Alsop, Shaizeen Aga, Mohamed Ibrahim, Mahzabeen Islam, Andrew Mccrabb, Nuwan Jayasena

TL;DR
This paper evaluates commercial PIM architectures for accelerating a broad range of primitives across domains, identifies bottlenecks, and proposes hardware-software co-design optimizations to enhance their performance.
Contribution
It introduces a PIM-amenability-test for assessing primitives and proposes co-design strategies to improve PIM performance beyond ML-focused designs.
Findings
Commercial PIMs do not fully realize their potential for diverse primitives.
Hardware-software co-design can significantly improve PIM speedups.
Average PIM speedup increased from 1.12x to 2.49x with proposed optimizations.
Abstract
Continual demand for memory bandwidth has made it worthwhile for memory vendors to reassess processing in memory (PIM), which enables higher bandwidth by placing compute units in/near-memory. As such, memory vendors have recently proposed commercially viable PIM designs. However, these proposals are largely driven by the needs of (a narrow set of) machine learning (ML) primitives. While such proposals are reasonable given the growing importance of ML, as memory is a pervasive component, there is a case for a more inclusive PIM design that can accelerate primitives across domains. In this work, we ascertain the capabilities of commercial PIM proposals to accelerate various primitives across domains. We first begin with outlining a set of characteristics, termed PIM-amenability-test, which aid in assessing if a given primitive is likely to be accelerated by…
Taxonomy
Topics: Ferroelectric and Negative Capacitance Devices · Advanced Data Storage Technologies · Parallel Computing and Optimization Techniques
