Pack my weights and run! Minimizing overheads for in-memory computing accelerators
Pouya Houshmand, Marian Verhelst

TL;DR
This paper introduces a novel weight mapping algorithm for in-memory computing accelerators that reduces loading overhead and maximizes resource utilization, leading to significant efficiency improvements in neural network processing.
Contribution
The paper presents a new weight packing algorithm that minimizes weight loading times and increases parallelism in in-memory computing (IMC) accelerators, improving performance for neural network workloads.
Findings
Achieves 10-100x energy-delay product (EDP) improvements on the MLPerf Tiny benchmark
Reduces weight loading times significantly
Enhances resource utilization and parallelism
Abstract
In-memory computing (IMC) hardware accelerators allow more than 10x improvements in peak efficiency and performance for matrix-vector multiplications (MVM) compared to conventional digital designs. For this reason, they have gained great interest for the acceleration of neural network workloads. Nevertheless, these potential gains are only achieved when the utilization of the computational resources is maximized and the overhead from loading operands into the memory array is minimized. To this end, this paper proposes a novel mapping algorithm for the weights in the IMC macro, based on efficient packing of the weights of network layers in the available memory. The algorithm realizes 1) minimization of weight loading times while at the same time 2) maximally exploiting the parallelism of the IMC computational fabric. A set of case studies is carried out to show achievable trade-offs for the MLPerf Tiny…
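To make the idea of packing layer weights into the available memory concrete, here is a minimal sketch. It is not the paper's algorithm: it uses a generic first-fit-decreasing bin-packing heuristic, treats the IMC array as a flat pool of weight cells, and ignores the array's row/column geometry and parallelism. The function `pack_layers`, the layer names, the layer sizes, and the capacity are all hypothetical values chosen for illustration.

```python
from typing import Dict, List


def pack_layers(layer_sizes: Dict[str, int], capacity: int) -> List[List[str]]:
    """Group layers so that each group fits in the array at the same time.

    Generic first-fit-decreasing heuristic (an assumption, not the paper's
    method). Each returned group corresponds to one weight-loading round.
    """
    groups: List[List[str]] = []   # layer names per loading round
    free: List[int] = []           # remaining cells per loading round
    # Place the largest layers first.
    for name, size in sorted(layer_sizes.items(), key=lambda kv: -kv[1]):
        if size > capacity:
            raise ValueError(f"layer {name!r} does not fit in the array")
        for i, room in enumerate(free):
            if size <= room:
                groups[i].append(name)
                free[i] -= size
                break
        else:
            # No existing round has room: open a new one.
            groups.append([name])
            free.append(capacity - size)
    return groups


# Hypothetical layer sizes (number of weights) and array capacity (cells).
layers = {"conv1": 400, "conv2": 600, "fc1": 300, "fc2": 100}
groups = pack_layers(layers, capacity=1024)
print(groups)
```

With these made-up numbers, the four layers fit into two loading rounds instead of one round per layer, which is the kind of loading overhead reduction the abstract refers to.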
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Cloud Computing and Resource Management · Ferroelectric and Negative Capacitance Devices
