Joint Hardware-Workload Co-Optimization for In-Memory Computing Accelerators

Olga Krestinskaya; Mohammed E. Fouda; Ahmed Eltawil; Khaled N. Salama

arXiv:2603.03880·cs.AR·March 5, 2026

Open Access

TL;DR

This paper introduces a co-optimization framework for designing in-memory computing accelerators that efficiently support multiple neural network workloads, balancing performance and generality through an evolutionary algorithm.

Contribution

The paper presents a joint hardware-workload co-optimization method that improves the generality of IMC accelerators across diverse neural network models.

Findings

01. Achieves up to 76.2% EDAP reduction across 4 workloads.

02. Achieves up to 95.5% EDAP reduction across 9 workloads.

03. Demonstrates robustness on RRAM and SRAM-based architectures.
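
For orientation, EDAP is commonly read as the energy-delay-area product, and a percentage reduction is usually taken relative to a baseline design. This summary does not state the baseline, so the symbols below (E, D, A, and the "opt" and "base" subscripts) are assumptions for illustration only.

```latex
\mathrm{EDAP} = E \cdot D \cdot A,
\qquad
\text{reduction} = \left(1 - \frac{\mathrm{EDAP}_{\text{opt}}}{\mathrm{EDAP}_{\text{base}}}\right) \times 100\%
```

Under that reading, a 76.2% reduction would correspond to an optimized EDAP of about 0.238 times the baseline, and a 95.5% reduction to about 0.045 times the baseline.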

Abstract

Software-hardware co-design is essential for optimizing in-memory computing (IMC) hardware accelerators for neural networks. However, most existing optimization frameworks target a single workload, leading to highly specialized hardware designs that do not generalize well across models and applications. In contrast, practical deployment scenarios require a single IMC platform that can efficiently support multiple neural network workloads. This work presents a joint hardware-workload co-optimization framework based on an optimized evolutionary algorithm for designing generalized IMC accelerator architectures. By explicitly capturing cross-workload trade-offs rather than optimizing for a single model, the proposed approach significantly reduces the performance gap between workload-specific and generalized IMC designs. The framework is evaluated on both RRAM- and SRAM-based IMC…
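
The idea of searching for one hardware configuration that serves several workloads at once can be sketched as a small evolutionary loop. This is a minimal illustration and not the authors' framework: the design space, the workload descriptions, the toy cost model in `edap`, and the choice of a geometric mean as the joint objective are all hypothetical assumptions made only to keep the example self-contained.

```python
import math
import random

# Hypothetical design space: every name and value is an illustrative assumption.
SPACE = {
    "crossbar_size": [64, 128, 256, 512],
    "bits_per_cell": [1, 2, 4],
    "num_tiles": [4, 8, 16, 32],
}

# Hypothetical workloads, described only by weights and MACs (in millions).
WORKLOADS = {
    "small_cnn": {"weights_m": 1.0, "macs_m": 50.0},
    "mid_cnn": {"weights_m": 10.0, "macs_m": 600.0},
    "transformer": {"weights_m": 60.0, "macs_m": 4000.0},
}


def edap(cfg, wl):
    """Toy energy * delay * area model; NOT a calibrated IMC simulator."""
    cells = cfg["crossbar_size"] ** 2 * cfg["num_tiles"]
    # Capacity in millions of 8-bit weights (assumed weight precision).
    capacity_m = cells * cfg["bits_per_cell"] / 8e6
    # Weights must be reloaded when the model does not fit on chip.
    reloads = max(1.0, wl["weights_m"] / capacity_m)
    area = cells * (1.0 + 0.2 * cfg["bits_per_cell"])
    delay = wl["macs_m"] * reloads / (cfg["num_tiles"] * cfg["crossbar_size"])
    energy = wl["macs_m"] * (1.0 + 0.5 * cfg["bits_per_cell"]) + 100.0 * reloads
    return energy * delay * area


def joint_fitness(cfg):
    """Geometric mean of per-workload EDAP, so no single workload dominates."""
    logs = [math.log(edap(cfg, wl)) for wl in WORKLOADS.values()]
    return math.exp(sum(logs) / len(logs))


def mutate(cfg, rng):
    """Resample one randomly chosen parameter from the design space."""
    child = dict(cfg)
    key = rng.choice(sorted(SPACE))
    child[key] = rng.choice(SPACE[key])
    return child


def evolve(fitness, generations=30, pop_size=12, seed=0):
    """Elitist evolutionary search: keep the best half, refill with mutants."""
    rng = random.Random(seed)
    pop = [{k: rng.choice(v) for k, v in SPACE.items()} for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        parents = pop[: pop_size // 2]
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return min(pop, key=fitness)


best_joint = evolve(joint_fitness)
```

Passing a single-workload objective instead, for example `evolve(lambda c: edap(c, WORKLOADS["small_cnn"]))`, gives a workload-specific design under the same toy model, which makes it possible to compare specialized and generalized configurations side by side.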

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Embedded Systems Design Techniques · Advanced Memory and Neural Computing