RCW-CIM: A Digital CIM-based LLM Accelerator with Read-Compute/Write

Yan-Cheng Guo, Tian-Sheuan Chang, and Jian-Wei Su

arXiv:2604.27384·cs.AR·May 1, 2026

TL;DR

This paper introduces RCW-CIM, a digital CIM-based LLM accelerator whose read-compute/write architecture minimizes weight update latency, with nonlinear operator fusion and a new dataflow further reducing latency and DRAM access.

Contribution

It proposes a read-compute/write architecture with nonlinear operator fusion and a new dataflow, significantly reducing latency and DRAM access for LLM acceleration.

Findings

01

Decoding latency reduced by 21.59% on Llama2-7B.

02

Latency reduced by 69.17% through nonlinear operator fusion.

03

DRAM access and CIM weight updates reduced by 51.6% and 87.6%, respectively.
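
The percentages above are reductions, which can be easier to compare once converted to improvement factors. A minimal sketch of that arithmetic follows; the function name `reduction_to_factor` and the dictionary labels are illustrative and do not come from the paper, and the only inputs are the four figures quoted in the findings.

```python
def reduction_to_factor(reduction_pct: float) -> float:
    """Convert a percentage reduction into the equivalent improvement factor.

    A reduction of r% leaves (1 - r/100) of the original quantity,
    so the factor is 1 / (1 - r/100).
    """
    if not 0.0 <= reduction_pct < 100.0:
        raise ValueError("reduction must be in the range [0, 100)")
    return 1.0 / (1.0 - reduction_pct / 100.0)


# Figures quoted in the findings; the labels are paraphrases.
reported = {
    "decoding latency on Llama2-7B": 21.59,
    "latency with nonlinear operator fusion": 69.17,
    "DRAM access": 51.6,
    "CIM weight updates": 87.6,
}

for name, pct in reported.items():
    print(f"{name}: {pct}% lower, about {reduction_to_factor(pct):.2f}x")
```

Because the factor grows nonlinearly with the percentage, the 87.6% reduction in weight updates corresponds to a far larger factor than the 51.6% reduction in DRAM access.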

Abstract

Digital computing-in-memory (DCIM) has emerged as a promising solution for large language model (LLM) acceleration by minimizing data transfers between external DRAM and on-chip accelerators while maintaining high precision for superior accuracy. However, existing CIM architectures often overlook weight update latency, which becomes critical as LLM weights are far larger than a single CIM macro capacity. To address this issue, this paper proposes a read-compute/write (RCW) architecture that effectively minimizes weight update latency, along with a nonlinear operator fusion that further mitigates dependency-induced latency. The proposed RCW reduces decoding computing latency by 21.59% on the Llama2-7B model. In addition, the nonlinear operator fusion mechanism achieves a 69.17% latency reduction through efficient partial accumulation and group-based approximation. Furthermore, a…
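
To illustrate why weight update latency matters when the weights span many macro-sized tiles, here is a toy timing model. It is a generic two-stage pipelining sketch, not the paper's architecture or its measured behavior: it assumes (hypothetically) that the next weight tile can be written while the current tile is being computed on, and the function names, tile count, and cycle counts are all made up for illustration.

```python
def serial_latency(n_tiles: int, t_write: int, t_compute: int) -> int:
    """Each tile is written, then computed on, with no overlap."""
    return n_tiles * (t_write + t_compute)


def overlapped_latency(n_tiles: int, t_write: int, t_compute: int) -> int:
    """Writing tile i+1 overlaps with computing on tile i (assumed).

    The first write and the last compute cannot be hidden; every step
    in between is bounded by the slower of the two stages.
    """
    if n_tiles == 0:
        return 0
    return t_write + (n_tiles - 1) * max(t_write, t_compute) + t_compute


# Hypothetical numbers: 4 weight tiles, 2 cycles to write, 3 to compute.
n, tw, tc = 4, 2, 3
print("serial:", serial_latency(n, tw, tc))
print("overlapped:", overlapped_latency(n, tw, tc))
```

In this model the benefit of overlap grows with the number of tiles, which is consistent with the abstract's point that the problem becomes critical when weights far exceed a single macro's capacity.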
