TL;DR
PACiM introduces a probabilistic approximate computation method that leverages sparsity and statistical techniques to significantly improve energy efficiency and reduce data transfer in compute-in-memory systems for neural networks.
Contribution
The paper presents PAC, a novel probabilistic approximation technique that reduces computation error and memory access, enabling a sparsity-centric architecture with high efficiency and accuracy.
Findings
Achieves a 4X reduction in approximation error compared to existing methods.
Reduces memory accesses by 50%, boosting system efficiency.
Attains 14.63 TOPS/W peak efficiency in 65 nm CMOS.
Abstract
Approximate computing has emerged as a promising approach to enhance the efficiency of compute-in-memory (CiM) systems in deep neural network processing. However, traditional approximate techniques often trade away significant accuracy for power efficiency, and they fail to reduce data transfer between main memory and CiM banks, which dominates power consumption. This paper introduces a novel probabilistic approximate computation (PAC) method that leverages statistical techniques to approximate multiply-and-accumulate (MAC) operations, reducing approximation error by 4X compared to existing approaches. PAC enables efficient sparsity-based computation in CiM systems by simplifying complex MAC vector computations into scalar calculations. Moreover, PAC enables sparsity encoding and eliminates the transmission of LSB activations, significantly reducing data reads and writes. This sets PAC apart…
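The abstract does not give PAC's formula, so the following is only a minimal sketch of the general idea of replacing a vector MAC with scalar arithmetic on sparsity counts. It assumes unsigned 8-bit operands and a simple estimator: the dot product of two binary bit planes of length N with popcounts S_x and S_w is approximated by S_x * S_w / N. The function names (`bit_planes`, `approx_mac`) and the `exact_msbs` parameter are hypothetical and chosen for illustration; the paper's actual estimator and MSB/LSB split may differ.

```python
import numpy as np

def bit_planes(x, bits):
    """Split non-negative integers into bit planes, LSB first: shape (bits, N)."""
    return np.stack([(x >> b) & 1 for b in range(bits)])

def exact_mac(x, w):
    """Reference multiply-and-accumulate over two integer vectors."""
    return int(np.dot(x.astype(np.int64), w.astype(np.int64)))

def approx_mac(x, w, bits=8, exact_msbs=4):
    """Hybrid MAC (illustrative assumption, not the paper's exact scheme).

    Bit-plane pairs where both planes are among the top `exact_msbs` bits
    are computed exactly; every other pair is estimated from the two
    popcounts alone, so only scalars are needed for those planes.
    """
    n = x.size
    xp, wp = bit_planes(x, bits), bit_planes(w, bits)
    sx, sw = xp.sum(axis=1), wp.sum(axis=1)  # popcount of each bit plane
    first_exact = bits - exact_msbs
    total = 0.0
    for i in range(bits):
        for j in range(bits):
            if i >= first_exact and j >= first_exact:
                partial = int(np.dot(xp[i], wp[j]))   # exact binary dot product
            else:
                partial = sx[i] * sw[j] / n           # scalar sparsity estimate
            total += partial * 2 ** (i + j)
    return total

# Usage: compare the estimate with the exact result on random data.
rng = np.random.default_rng(0)
x = rng.integers(0, 256, size=256)
w = rng.integers(0, 256, size=256)
print(exact_mac(x, w), approx_mac(x, w))
```

In this sketch, only the popcounts of the low-order planes are needed to form their contribution, which is the sense in which a vector computation collapses into scalar calculations and the LSB data itself need not be moved.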
