Hardware Efficient Approximate Convolution with Tunable Error Tolerance for CNNs
Vishal Shashidhar, Anupam Kumari, Roy P Paily

TL;DR
This paper introduces a hardware-efficient approximate convolution method based on soft sparsity and a custom RISC-V instruction. On LeNet-5 (MNIST), it cuts MACs by up to 88.42% and estimated power by up to 35.2% with no accuracy loss.
Contribution
It proposes a soft sparsity paradigm that uses a hardware-efficient Most Significant Bit (MSB) proxy, integrated as a custom RISC-V instruction, to skip negligible non-zero multiplications in CNNs.
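The idea of skipping negligible products with an MSB proxy can be sketched in software. The summary does not spell out the exact skip rule, so the rule below is an assumption made for illustration: operands are treated as signed integers, the bit length of each magnitude stands in for its MSB position, and a product is skipped when the two bit lengths sum to at most a tunable `threshold`. The names `msb_position`, `approx_dot`, and `threshold` are hypothetical and do not come from the paper.

```python
def msb_position(x: int) -> int:
    """Bit length of |x|: index of the highest set bit plus one, 0 for x == 0."""
    return abs(x).bit_length()


def approx_dot(activations, weights, threshold):
    """Dot product that skips products judged negligible by an MSB proxy.

    Assumed skip rule (illustrative, not taken from the paper): a product
    is skipped if either operand is zero (hard sparsity) or if the bit
    lengths of the two magnitudes sum to at most `threshold` (soft
    sparsity). Since |a| < 2**p and |w| < 2**q, every skipped product
    satisfies |a * w| < 2**threshold.
    """
    acc = 0
    skipped = 0
    for a, w in zip(activations, weights):
        if a == 0 or w == 0 or msb_position(a) + msb_position(w) <= threshold:
            skipped += 1      # a hardware version would gate the multiplier here
            continue
        acc += a * w          # only significant products reach the multiplier
    return acc, skipped


activations = [0, 1, 3, 64, -20]
weights = [50, 2, -1, 3, 5]
approx, skipped = approx_dot(activations, weights, threshold=4)
exact = sum(a * w for a, w in zip(activations, weights))
print(approx, exact, skipped)
```

The check needs only the leading-bit positions and one small addition, which is cheap next to a full multiply. Raising `threshold` skips more products at the cost of a larger worst-case error per skipped term, which is the sense in which the error tolerance is tunable in this sketch.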
Findings
Reduced ReLU MACs by 88.42% with no accuracy loss
Reduced Tanh MACs by 74.87% with no accuracy loss
Estimated power savings of 35.2% (ReLU) and 29.96% (Tanh) through clock-gating
Abstract
The high computational demands of modern CNNs hinder edge deployment, and traditional "hard" sparsity (skipping mathematical zeros) loses effectiveness in deep layers or with smooth activations such as Tanh. We propose a "soft sparsity" paradigm that uses a hardware-efficient Most Significant Bit (MSB) proxy to skip negligible non-zero multiplications. Integrated as a custom RISC-V instruction and evaluated on LeNet-5 (MNIST), the method reduces ReLU MACs by 88.42% and Tanh MACs by 74.87% with zero accuracy loss, outperforming zero-skipping by 5x. By clock-gating inactive multipliers, we estimate power savings of 35.2% for ReLU and 29.96% for Tanh. Although memory access makes the power reduction sub-linear in the operation savings, the approach significantly reduces the cost of inference on resource-constrained devices.
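The sub-linear relationship between operation savings and power savings can be checked directly from the reported figures. The short calculation below uses only the four percentages stated in the abstract; the helper name `power_to_mac_ratio` is hypothetical, and reading the ratio as a share of total power is an assumption of a simple linear model, not a claim from the paper.

```python
# Reported figures from the abstract: (MAC reduction %, estimated power saving %)
reported = {
    "ReLU": (88.42, 35.2),
    "Tanh": (74.87, 29.96),
}


def power_to_mac_ratio(mac_reduction_pct: float, power_saving_pct: float) -> float:
    """Power saved per unit of MAC reduction (1.0 would mean linear scaling)."""
    return power_saving_pct / mac_reduction_pct


ratios = {name: power_to_mac_ratio(mac, power) for name, (mac, power) in reported.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.3f}")
```

Both ratios come out close to 0.40, well below 1.0. Under the assumed linear model, that is consistent with the gated multipliers accounting for roughly 40% of the estimated power, with the remainder (memory access among it) unaffected by skipping.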
