CD-PIM: A High-Bandwidth and Compute-Efficient LPDDR5-Based PIM for Low-Batch LLM Acceleration on Edge-Device
Ye Lin, Chao Fang, Xiaoyong Song, Qi Wu, Anying Jiang, Yichuan Bai, Li Du

TL;DR
This paper introduces CD-PIM, an LPDDR5-based processing-in-memory (PIM) architecture for edge devices. It accelerates low-batch LLM inference by raising memory bandwidth, component utilization, and compute efficiency, which speeds up GEMV operations.
Contribution
The paper presents three innovations for PIM-based LLM acceleration: a high-bandwidth compute-efficient mode (HBCEM), a low-batch interleaving mode (LBIM), and a pipelined compute-efficient core.
Findings
Achieves an 11.42x speedup over the GPU baseline in HBCEM.
Attains a 4.25x speedup over state-of-the-art PIM designs.
Provides a 1.12x speedup in low-batch scenarios with LBIM.
Abstract
Edge deployment of low-batch large language models (LLMs) faces critical memory bandwidth bottlenecks when executing memory-intensive general matrix-vector multiplication (GEMV) operations. While digital processing-in-memory (PIM) architectures promise to accelerate GEMV operations, existing PIM-equipped edge devices still suffer from three key limitations: limited bandwidth improvement, component under-utilization in mixed workloads, and low compute capacity of computing units (CUs). In this paper, we propose CD-PIM to address these challenges through three key innovations. First, we introduce a high-bandwidth compute-efficient mode (HBCEM) that enhances bandwidth by dividing each bank into four pseudo-banks through segmented global bitlines. Second, we propose a low-batch interleaving mode (LBIM) to improve component utilization by overlapping GEMV operations with GEMM operations.…
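The abstract's claim that low-batch GEMV is memory-bound can be illustrated with a rough arithmetic-intensity estimate. The sketch below is not from the paper: the function name, the 4096 x 4096 layer size, and the 2-byte (FP16) element width are all assumptions chosen for illustration. It counts the FLOPs per byte moved for Y = W @ X as the batch size grows.

```python
def arithmetic_intensity(rows: int, cols: int, batch: int,
                         bytes_per_elem: int = 2) -> float:
    """FLOPs per byte moved for Y = W @ X.

    W is rows x cols (weights), X is cols x batch (activations).
    bytes_per_elem = 2 assumes FP16 storage (an assumption, not from the paper).
    """
    # One multiply and one add per weight element, per batch column.
    flops = 2 * rows * cols * batch
    # Every weight is read once, regardless of batch size.
    weight_bytes = rows * cols * bytes_per_elem
    # Inputs are read and outputs are written once per batch column.
    activation_bytes = (cols + rows) * batch * bytes_per_elem
    return flops / (weight_bytes + activation_bytes)


if __name__ == "__main__":
    # Hypothetical layer size; batch = 1 is the GEMV case.
    for batch in (1, 4, 64):
        ai = arithmetic_intensity(4096, 4096, batch)
        print(f"batch={batch:3d}  FLOPs/byte={ai:.2f}")
```

Under these assumptions, batch 1 yields just under 1 FLOP per byte, so throughput is set by how fast weights can be streamed from memory rather than by compute. Intensity grows roughly in proportion to batch size, which is why GEMM at larger batches is far less bandwidth-sensitive than GEMV.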
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Big Data and Digital Economy · Ferroelectric and Negative Capacitance Devices
