QP-SNN: Quantized and Pruned Spiking Neural Networks
Wenjie Wei, Malu Zhang, Zijian Zhou, Ammar Belatreche, Yimeng Shan, Yu Liang, Honglin Cao, Jieyuan Zhang, Yang Yang

TL;DR
This paper introduces QP-SNN, a lightweight, hardware-friendly spiking neural network that combines quantization and pruning techniques to improve efficiency and performance for resource-limited edge devices.
Contribution
The paper combines uniform quantization and structured pruning in a single SNN baseline, then adds a weight rescaling strategy for quantization and a pruning criterion based on the singular values of spike activity, targeting deployment in resource-constrained environments.
Findings
Achieves state-of-the-art performance with reduced storage and computation.
Introduces a weight rescaling strategy for better quantization.
Develops a pruning criterion based on spike activity singular values.
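To make the weight rescaling idea concrete, here is a minimal NumPy sketch. It is not the paper's formulation, which this summary does not state. It assumes a symmetric uniform quantizer on [-1, 1] and a scaling factor `gamma` equal to the mean absolute weight (the "1-norm mean" mentioned in one of the reviews); the function names, bit-width, and toy weight distribution are all hypothetical.

```python
import numpy as np

def uniform_quantize(w, bits=4):
    """Symmetric uniform quantization onto a fixed grid over [-1, 1]."""
    levels = 2 ** (bits - 1) - 1              # 7 positive levels for 4 bits
    return np.round(np.clip(w, -1.0, 1.0) * levels) / levels

def rescaled_quantize(w, bits=4, eps=1e-8):
    """Divide by gamma, quantize, multiply gamma back (assumed scheme)."""
    gamma = np.mean(np.abs(w)) + eps          # assumption: 1-norm mean of w
    return gamma * uniform_quantize(w / gamma, bits)

# Toy weights concentrated in a narrow range around zero (illustrative only).
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.05, size=10_000)

q_plain = uniform_quantize(w)
q_rescaled = rescaled_quantize(w)

print("grid levels used:", np.unique(q_plain).size, "vs", np.unique(q_rescaled).size)
print("MSE:", np.mean((w - q_plain) ** 2), "vs", np.mean((w - q_rescaled) ** 2))
```

On narrowly distributed weights like these, the unscaled quantizer collapses most values onto a few grid points, while the rescaled version spreads them over more of the grid. Whether mean-absolute scaling is the best choice of `gamma` is exactly what one reviewer questions below.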
Abstract
Brain-inspired Spiking Neural Networks (SNNs) leverage sparse spikes to encode information and operate in an asynchronous event-driven manner, offering a highly energy-efficient paradigm for machine intelligence. However, the current SNN community focuses primarily on performance improvement by developing large-scale models, which limits the applicability of SNNs in resource-limited edge devices. In this paper, we propose a hardware-friendly and lightweight SNN, aimed at effectively deploying high-performance SNNs in resource-limited scenarios. Specifically, we first develop a baseline model that integrates uniform quantization and structured pruning, called QP-SNN baseline. While this baseline significantly reduces storage demands and computational costs, it suffers from performance decline. To address this, we conduct an in-depth analysis of the challenges in quantization and pruning…
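As a rough illustration of structured pruning driven by singular values of spike activity, the sketch below scores each output channel by the average sum of singular values of its binary spike maps and keeps the highest-scoring channels. The scoring rule, tensor layout `(T, B, C, H, W)`, function names, and `keep_ratio` are assumptions made for illustration; the truncated abstract does not give the paper's exact criterion.

```python
import numpy as np

def channel_scores(spikes):
    """Score channels from spike maps of shape (T, B, C, H, W).

    Assumed criterion: the sum of singular values of each H x W spike map,
    averaged over time steps T and batch samples B.
    """
    # np.linalg.svd acts on the last two axes and broadcasts over the rest.
    s = np.linalg.svd(spikes.astype(np.float64), compute_uv=False)
    return s.sum(axis=-1).mean(axis=(0, 1))   # shape (C,)

def prune_channels(weight, scores, keep_ratio=0.5):
    """Keep the highest-scoring output channels of a (C_out, C_in, kH, kW) weight."""
    n_keep = max(1, int(round(keep_ratio * len(scores))))
    keep = np.sort(np.argsort(scores)[-n_keep:])
    return weight[keep], keep

# Toy spike activity: four channels with different (hypothetical) firing rates.
rng = np.random.default_rng(0)
rates = np.array([0.0, 0.5, 0.05, 0.3]).reshape(1, 1, 4, 1, 1)
spikes = rng.random((4, 8, 4, 8, 8)) < rates
weight = rng.normal(size=(4, 3, 3, 3))

scores = channel_scores(spikes)
pruned_weight, kept = prune_channels(weight, scores, keep_ratio=0.5)
print("scores:", scores)
print("kept channels:", kept, "pruned weight shape:", pruned_weight.shape)
```

Because whole output channels are removed, the pruned weight stays a dense tensor, which is what makes structured pruning hardware-friendly. In a full network the input channels of the following layer would need to be sliced with the same `kept` indices.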
Peer Reviews
Decision: ICLR 2025 Poster
1. The illustrations are thoughtfully crafted. 2. The manuscript presents a clear structure, and maintains logical consistency. 3. A detailed analysis of the weight distribution in SNNs provides the motivation for the re-scaling method. 4. The experiments demonstrate the effectiveness of this method on both static and dynamic datasets.
1. Regarding the selection of the quantization scaling factor $\gamma$, this work employs three different sampling methods. Although experiments indicate that scaling with the 1-norm mean yields the best results, I believe there is still room for improvement, such as using a piecewise $\gamma$. I encourage the authors to provide further explanation. 2. To my knowledge, when two or more model lightweight techniques are employed, compatibility issues often arise, such as the order of applying thes
1. The writing of this paper is good and easy to read. 2. The experiments in the paper demonstrate the effectiveness of the proposed ReScaW algorithm and SVD-based pruning.
1. The ReScaW-based uniform quantization lacks innovation, as it is merely a combination of uniform quantization and weight scaling [1]. 2. The SVD-based pruning appears to be an application of SVD pruning in SNNs. There are many methods utilizing SVD decomposition or selecting important singular values for pruning [2][3]. Please clarify the specific advantages of applying this in SNNs. 3. The SVD decomposition method introduces additional time complexity. How much does it specifically affect
1. The paper develops a hardware-efficient and lightweight QP-SNN baseline by integrating uniform quantization and structured pruning. 2. The weight-rescaling strategy and the SVS-based pruning criterion work well on SNN benchmarks.
1. The proposed technologies including the weight-rescaling strategy and the SVS-based pruning criterion do not fully utilize the unique characteristics of SNN, such as the sparsity to reduce synaptic operation and temporal information to maintain the SNN accuracy. The technologies can also be used in ANN models. 2. The comparison with SNNs that are not compressed is missing, such as SEW-ResNet[1] and MS-ResNet[2]. Compared to these works, the accuracy degradation is still large. 3. The results
Taxonomy
Topics: Advanced Memory and Neural Computing · Neural Networks and Applications · Neural dynamics and brain function
