PIM-malloc: A Fast and Scalable Dynamic Memory Allocator for Processing-In-Memory (PIM) Architectures
Dongjae Lee, Bongjoon Hyun, Youngjin Kwon, Minsoo Rhu

TL;DR
This paper introduces PIM-malloc, a fast and scalable dynamic memory allocator for general-purpose processing-in-memory (PIM) architectures. It runs on real PIM hardware, speeds up memory allocation by 66x, and makes PIM devices easier to program.
Contribution
PIM-malloc is the first scalable dynamic memory allocator designed specifically for general-purpose PIM architectures. Its design follows from a design space exploration of allocator metadata placement and management, and is extended with a lightweight, per-PIM-core hardware cache.
Findings
Achieves 66x faster memory allocation on real PIM hardware
Gains an additional 31% in performance from a lightweight, per-PIM-core hardware cache
Supports several representative PIM workloads, improving programmability
Abstract
The ability to dynamically allocate memory is fundamental in modern programming languages. However, this feature is not adequately supported in current general-purpose PIM devices. To identify key design principles that PIM must consider, we conduct a design space exploration of PIM memory allocators, examining various strategies for metadata placement and management of the allocator. Based on this exploration, we introduce PIM-malloc, a fast and scalable memory allocator for general-purpose PIM that operates on real PIM hardware, achieving a 66x improvement in memory allocation performance. This design is further enhanced with a lightweight, per-PIM-core hardware cache, specifically designed for dynamic memory allocation, achieving an additional 31% performance improvement. Finally, we demonstrate the applicability of PIM-malloc by developing several representative PIM workloads,…
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Embedded Systems Design Techniques · Network Packet Processing and Optimization
