In-Pipeline Integration of Digital In-Memory-Computing into RISC-V Vector Architecture to Accelerate Deep Learning

Tommaso Spagnolo; Cristina Silvano; Riccardo Massa; Filippo Grillotti; Thomas Boesch; Giuseppe Desoli

arXiv:2602.01827 · cs.AR · February 3, 2026

Open Access

TL;DR

This paper presents a RISC-V vector architecture extension that integrates digital in-memory computing (DIMC) into the pipeline to accelerate deep learning inference at the edge with high throughput and energy efficiency, reporting a peak of 137 GOP/s on ResNet-50 and a 217x speedup over the baseline core.

Contribution

It introduces a DIMC unit tightly coupled to the execution stage of a RISC-V vector pipeline and controlled by four custom instructions for data loading, computation, and write-back, enabling efficient deep learning inference at the edge.

Findings

01. Peak performance of 137 GOP/s on ResNet-50

02. Speedup of 217x over baseline core

03. 50x area-normalized speedup near hardware limits

Abstract

Expanding Deep Learning applications toward edge computing demands architectures capable of delivering high computational performance and efficiency while adhering to tight power and memory constraints. Digital In-Memory Computing (DIMC) addresses this need by moving part of the computation directly within memory arrays, significantly reducing data movement and improving energy efficiency. This paper introduces a novel architecture that extends the Vector RISC-V Instruction Set Architecture (ISA) to integrate a tightly coupled DIMC unit directly into the execution stage of the pipeline, to accelerate Deep Learning inference at the edge. Specifically, the proposed approach adds four custom instructions dedicated to data loading, computation, and write-back, enabling flexible and optimal control of the inference execution on the target architecture. Experimental results demonstrate high…
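
The load, compute, and write-back flow described in the abstract can be illustrated with a toy functional model. This is only a sketch under stated assumptions: the abstract does not name the four custom instructions or give the array dimensions, so the class `ToyDimcUnit`, its four methods (`load_weights`, `load_inputs`, `compute`, `write_back`), and the helper `matvec_on_dimc` are hypothetical names chosen for illustration, not the paper's ISA extension. The model shows the general idea of keeping weights resident in a memory array and accumulating dot products in place, so that only activations and results move through the pipeline.

```python
# Toy functional model of a DIMC unit driven by four HYPOTHETICAL operations.
# Names, sizes, and semantics are illustrative assumptions, not the paper's design.


class ToyDimcUnit:
    def __init__(self, rows, cols):
        self.rows = rows  # number of weight vectors held in the array
        self.cols = cols  # length of each weight vector
        self.weights = [[0] * cols for _ in range(rows)]
        self.inputs = [0] * cols
        self.acc = [0] * rows

    def load_weights(self, row, values):
        """Hypothetical op 1: store one weight vector in a row of the array."""
        assert len(values) == self.cols
        self.weights[row] = list(values)

    def load_inputs(self, values):
        """Hypothetical op 2: latch an activation vector at the array inputs."""
        assert len(values) == self.cols
        self.inputs = list(values)

    def compute(self):
        """Hypothetical op 3: each row accumulates its dot product in place."""
        for r in range(self.rows):
            self.acc[r] += sum(w * x for w, x in zip(self.weights[r], self.inputs))

    def write_back(self):
        """Hypothetical op 4: return the accumulators and clear them."""
        out = list(self.acc)
        self.acc = [0] * self.rows
        return out


def matvec_on_dimc(weight_rows, activations):
    """Run one matrix-vector product through the toy unit."""
    unit = ToyDimcUnit(len(weight_rows), len(activations))
    for r, row in enumerate(weight_rows):
        unit.load_weights(r, row)
    unit.load_inputs(activations)
    unit.compute()
    return unit.write_back()
```

In this sketch the weights are loaded once and could be reused across many `load_inputs` and `compute` calls before a single `write_back`, which is the data-movement saving that in-memory computing aims for; how the actual instructions split these responsibilities is not stated in the abstract.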

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Big Data and Digital Economy · Advanced Memory and Neural Computing