Towards Efficient IMC Accelerator Design Through Joint Hardware-Workload Co-optimization
Olga Krestinskaya, Mohammed E. Fouda, Ahmed Eltawil, Khaled N. Salama

TL;DR
This paper introduces a joint hardware-workload optimization framework for in-memory computing (IMC) accelerators. It searches architecture parameters across several workloads at once, yielding more efficient, workload-flexible hardware than a search that targets only the largest workload.
Contribution
It presents a joint optimization approach that outperforms an architecture parameter search targeting only the single largest workload, enabling more efficient generalized IMC hardware.
Findings
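Achieves 36%, 36%, 20%, and 69% better energy-latency-area scores for VGG16, ResNet18, AlexNet, and MobileNetV3, respectively, compared to optimizing for the single largest workload.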
Quantifies performance trade-offs of generalized versus workload-specific IMC hardware.
Demonstrates the effectiveness of joint optimization in designing flexible IMC accelerators.
Abstract
Designing generalized in-memory computing (IMC) hardware that efficiently supports a variety of workloads requires extensive design space exploration, which is infeasible to perform manually. Optimizing hardware individually for each workload or solely for the largest workload often fails to yield the most efficient generalized solutions. To address this, we propose a joint hardware-workload optimization framework that identifies optimized IMC chip architecture parameters, enabling more efficient, workload-flexible hardware. We show that joint optimization achieves 36%, 36%, 20%, and 69% better energy-latency-area scores for VGG16, ResNet18, AlexNet, and MobileNetV3, respectively, compared to a separate architecture parameter search that optimizes for the single largest workload. Additionally, we quantify the performance trade-offs and losses of the resulting generalized IMC hardware…
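The contrast the abstract draws between a joint search and a search for the single largest workload can be illustrated with a toy example. The sketch below is not the paper's framework: the workloads, the design space (crossbar size and tile count), the cost model, and the use of an energy × latency × area product as the score are all assumptions made purely for illustration, and exhaustive search stands in for whatever search method the authors use.

```python
from itertools import product

# Hypothetical workloads (name -> weight count) and design space
# (crossbar size, tile count). None of these values come from the paper.
WORKLOADS = {"small": 4, "medium": 16, "large": 64}
DESIGNS = list(product([2, 4, 8], [1, 2, 4, 8]))


def ela_score(design, weights):
    """Toy energy * latency * area product (assumed metric); lower is better."""
    xbar, tiles = design
    capacity = xbar * tiles
    passes = -(-weights // capacity)  # ceiling division: weight reloads needed
    energy = weights + passes * capacity
    latency = passes * xbar
    area = capacity
    return energy * latency * area


# Best achievable score per workload, used to normalize the joint objective.
BEST = {w: min(ela_score(d, n) for d in DESIGNS) for w, n in WORKLOADS.items()}


def joint_objective(design):
    """Sum of per-workload scores, each relative to that workload's own optimum."""
    return sum(ela_score(design, n) / BEST[w] for w, n in WORKLOADS.items())


# Baseline: pick the design that is best for the largest workload alone.
largest = max(WORKLOADS, key=WORKLOADS.get)
largest_only = min(DESIGNS, key=lambda d: ela_score(d, WORKLOADS[largest]))

# Joint search: pick the design that is best across all workloads together.
joint = min(DESIGNS, key=joint_objective)

for name, d in [("largest-only", largest_only), ("joint", joint)]:
    print(name, d, round(joint_objective(d), 2))
```

By construction the joint design can never score worse than the largest-only design on the aggregated objective, although it may lose to a workload-specific design on any individual workload, which is the trade-off the abstract says the paper quantifies.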
Taxonomy
Topics: Advanced Wireless Network Optimization · Advanced MIMO Systems Optimization · Advanced DC-DC Converters
Methods: Average Pooling · ReLU6 · Depthwise Convolution · Global Average Pooling · Batch Normalization · Hard Swish · 1x1 Convolution · Dense Connections · Convolution
