Joint Hardware-Workload Co-Optimization for In-Memory Computing Accelerators
Olga Krestinskaya, Mohammed E. Fouda, Ahmed Eltawil, Khaled N. Salama

TL;DR
This paper introduces a co-optimization framework for designing in-memory computing accelerators that efficiently support multiple neural network workloads, balancing performance and generality through an evolutionary algorithm.
Contribution
The paper presents a novel joint hardware-workload co-optimization method that improves the generality of in-memory computing (IMC) accelerators across diverse neural network models.
Findings
Achieves up to 76.2% reduction in EDAP (energy-delay-area product) across 4 workloads.
Achieves up to 95.5% reduction in EDAP across 9 workloads.
Demonstrates robustness on both RRAM- and SRAM-based architectures.
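For readers unfamiliar with the metric, EDAP stands for energy-delay-area product. The sketch below shows one way a percentage reduction of this kind could be computed, assuming EDAP is the plain product of the three quantities and the reduction is measured against some baseline design. The function names and all numbers are illustrative assumptions and are not taken from the paper.

```python
def edap(energy_j, delay_s, area_mm2):
    """Energy-delay-area product (lower is better)."""
    return energy_j * delay_s * area_mm2


def edap_reduction_percent(baseline, optimized):
    """Percentage by which `optimized` EDAP is lower than `baseline` EDAP."""
    return 100.0 * (1.0 - optimized / baseline)


# Made-up numbers for illustration only (not results from the paper).
baseline = edap(energy_j=2.0e-3, delay_s=5.0e-3, area_mm2=40.0)
optimized = edap(energy_j=1.0e-3, delay_s=4.0e-3, area_mm2=25.0)
reduction = edap_reduction_percent(baseline, optimized)
```

Because the metric is a product, a modest improvement in each of the three factors compounds into a large overall reduction.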
Abstract
Software-hardware co-design is essential for optimizing in-memory computing (IMC) hardware accelerators for neural networks. However, most existing optimization frameworks target a single workload, leading to highly specialized hardware designs that do not generalize well across models and applications. In contrast, practical deployment scenarios require a single IMC platform that can efficiently support multiple neural network workloads. This work presents a joint hardware-workload co-optimization framework based on an optimized evolutionary algorithm for designing generalized IMC accelerator architectures. By explicitly capturing cross-workload trade-offs rather than optimizing for a single model, the proposed approach significantly reduces the performance gap between workload-specific and generalized IMC designs. The framework is evaluated on both RRAM- and SRAM-based IMC…
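The idea of searching for one hardware configuration that serves several workloads at once can be illustrated with a toy evolutionary loop. This is a minimal sketch under stated assumptions, not the authors' framework: the search space (`CROSSBAR_SIZES`, `TILE_COUNTS`), the workload descriptions, the analytical cost model `toy_edap`, and the mean-log-EDAP objective in `joint_fitness` are all hypothetical stand-ins chosen for brevity.

```python
import math
import random

# Hypothetical search space: crossbar dimension and number of tiles.
CROSSBAR_SIZES = [64, 128, 256, 512]
TILE_COUNTS = [4, 8, 16, 32, 64]

# Hypothetical workloads, described only by weight count and MAC count.
WORKLOADS = {
    "small_cnn": {"weights": 2e5, "macs": 1e7},
    "large_cnn": {"weights": 5e6, "macs": 6e8},
    "mlp": {"weights": 1e6, "macs": 1e6},
}


def toy_edap(config, workload):
    """Toy analytical cost model (an assumption, not a real IMC simulator)."""
    size, tiles = config
    capacity = size * size * tiles
    # Weights that do not fit on chip force the arrays to be reloaded.
    passes = math.ceil(workload["weights"] / capacity)
    delay = passes * workload["macs"] / (size * tiles)
    energy = workload["macs"] * (1.0 + 0.002 * size) * passes
    area = capacity * 1e-6 + tiles * 0.05
    return energy * delay * area


def joint_fitness(config):
    """Mean log-EDAP over all workloads (lower is better)."""
    logs = [math.log(toy_edap(config, w)) for w in WORKLOADS.values()]
    return sum(logs) / len(logs)


def mutate(config, rng):
    """Resample one of the two design parameters."""
    size, tiles = config
    if rng.random() < 0.5:
        size = rng.choice(CROSSBAR_SIZES)
    else:
        tiles = rng.choice(TILE_COUNTS)
    return (size, tiles)


def evolve(generations=30, population_size=8, seed=0):
    rng = random.Random(seed)
    population = [(rng.choice(CROSSBAR_SIZES), rng.choice(TILE_COUNTS))
                  for _ in range(population_size)]
    for _ in range(generations):
        # Keep the better half (elitism) and refill with mutated survivors.
        population.sort(key=joint_fitness)
        survivors = population[: population_size // 2]
        children = [mutate(rng.choice(survivors), rng) for _ in survivors]
        population = survivors + children
    return min(population, key=joint_fitness)


best = evolve()
```

Scoring each candidate on every workload, rather than on one, is what makes the search joint: a configuration that is excellent for one model but poor for the others is penalized by the aggregate objective. Averaging log-EDAP is just one possible aggregation, picked here so that no single large workload dominates the score.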
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Embedded Systems Design Techniques · Advanced Memory and Neural Computing
