Q-SpiNN: A Framework for Quantizing Spiking Neural Networks
Rachmad Vidya Wicaksana Putra, Muhammad Shafique

TL;DR
Q-SpiNN is a quantization framework for Spiking Neural Networks that reduces memory footprint while maintaining high accuracy, by exploring various quantization strategies and selecting Pareto-optimal models.
Contribution
It introduces a novel framework that quantizes multiple SNN parameters, explores diverse quantization combinations, and efficiently selects models with optimal memory-accuracy trade-offs.
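To make the idea of "quantization combinations" concrete, here is a minimal sketch, not the authors' implementation. It assumes a signed fixed-point format with a chosen number of fractional bits and three common rounding schemes (truncation, round-to-nearest, stochastic rounding). The function name `quantize`, the scheme labels, and the precision levels listed are illustrative assumptions, not values taken from the paper.

```python
import itertools
import math
import random


def quantize(value, frac_bits, rounding="nearest", rng=random):
    """Map a real value onto a fixed-point grid with `frac_bits` fractional bits.

    Hypothetical helper for illustration; the rounding names are assumptions.
    """
    scale = 2 ** frac_bits
    scaled = value * scale
    low = math.floor(scaled)
    if rounding == "truncate":
        q = low                              # drop the extra fractional bits
    elif rounding == "nearest":
        q = math.floor(scaled + 0.5)         # round half up
    elif rounding == "stochastic":
        # round up with probability equal to the discarded fraction
        q = low + (1 if rng.random() < scaled - low else 0)
    else:
        raise ValueError(f"unknown rounding scheme: {rounding}")
    return q / scale


# Enumerate a small search space: scheme x precision x rounding.
# The precision levels below are placeholders, not the paper's settings.
schemes = ["PTQ", "ITQ"]
frac_bits_options = [4, 8]
roundings = ["truncate", "nearest", "stochastic"]
combinations = list(itertools.product(schemes, frac_bits_options, roundings))
```

The same `quantize` call could be applied to weights or to other parameters such as the neuron membrane potential; each entry in `combinations` would then correspond to one candidate model to evaluate for memory and accuracy.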
Findings
4x memory reduction on unsupervised SNNs with 1% accuracy loss
2x memory reduction on supervised SNNs with 2% accuracy loss
Effective model selection algorithm for memory-accuracy balance
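The memory-accuracy selection step can be pictured with a generic Pareto filter. The sketch below is not the paper's selection algorithm: it assumes each candidate model is summarized by a memory footprint and an accuracy, keeps the non-dominated candidates, and then picks the smallest one that meets an accuracy target. The names `Candidate`, `pareto_front`, and `select`, and all numbers, are made up for illustration.

```python
from typing import List, NamedTuple, Optional


class Candidate(NamedTuple):
    name: str
    memory_kb: float
    accuracy: float


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep candidates that no other candidate beats on both memory and accuracy."""
    front = []
    for c in candidates:
        dominated = any(
            o.memory_kb <= c.memory_kb
            and o.accuracy >= c.accuracy
            and (o.memory_kb < c.memory_kb or o.accuracy > c.accuracy)
            for o in candidates
        )
        if not dominated:
            front.append(c)
    return sorted(front, key=lambda c: c.memory_kb)


def select(candidates: List[Candidate], min_accuracy: float) -> Optional[Candidate]:
    """Return the smallest Pareto-optimal model that meets the accuracy target."""
    feasible = [c for c in pareto_front(candidates) if c.accuracy >= min_accuracy]
    return min(feasible, key=lambda c: c.memory_kb) if feasible else None


# Made-up example values, not results from the paper.
models = [
    Candidate("fp32-baseline", 400.0, 0.920),
    Candidate("q16-nearest", 200.0, 0.915),
    Candidate("q16-truncate", 200.0, 0.900),
    Candidate("q8-stochastic", 100.0, 0.900),
    Candidate("q8-truncate", 100.0, 0.850),
]
best = select(models, min_accuracy=0.91)
```

With these placeholder numbers, `q16-truncate` and `q8-truncate` are dominated and drop out of the front, and `select` returns `None` when no candidate reaches the requested accuracy.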
Abstract
Quantization is a prominent technique for reducing the memory footprint of Spiking Neural Networks (SNNs) without significantly decreasing accuracy. However, state-of-the-art works focus only on applying weight quantization directly from a specific quantization scheme, i.e., either post-training quantization (PTQ) or in-training quantization (ITQ), and do not consider (1) quantizing other SNN parameters (e.g., neuron membrane potential), (2) exploring different combinations of quantization approaches (i.e., quantization schemes, precision levels, and rounding schemes), and (3) selecting the SNN model with a good memory-accuracy trade-off at the end. Therefore, the memory savings these works offer at a targeted accuracy are limited, which hinders processing SNNs on resource-constrained systems (e.g., IoT-Edge devices). Towards this, we…
