Spiking Brain Compression: Post-Training Second-order Compression for Spiking Neural Networks
Lianfeng Shi, Ao Li, Benjamin Ward-Cherrier

TL;DR
This paper introduces Spiking Brain Compression (SBC), a novel one-shot post-training compression method for SNNs that extends the Optimal Brain Surgeon technique, enabling efficient pruning and quantization with minimal retraining.
Contribution
The paper presents SBC, a new one-shot compression framework for SNNs that significantly reduces training costs and improves compression performance compared to existing iterative methods.
Findings
Achieves state-of-the-art one-shot compression for SNNs on multiple datasets.
Provides single- to double-digit accuracy gains over ANN baselines.
Demonstrates robust performance with minimal calibration data.
Abstract
Spiking Neural Networks (SNNs) have emerged as a new generation of energy-efficient neural networks suitable for implementation on neuromorphic hardware. As neuromorphic hardware has limited memory and computational resources, parameter pruning and quantization have recently been explored to improve the efficiency of SNNs. State-of-the-art SNN pruning/quantization methods employ multiple compression and training iterations, increasing the cost for pre-trained or very large SNNs. In this paper, we propose a novel one-shot post-training compression framework, Spiking Brain Compression (SBC), that extends the classical Optimal Brain Surgeon method to SNNs. SBC replaces the current-based objective found in the common layer-wise compression method with a spike-train-based objective whose Hessian is cheaply computable, allowing a single backward pass to compress parameters and analytically…
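For readers unfamiliar with the classical Optimal Brain Surgeon (OBS) step that SBC builds on, the following is a minimal NumPy sketch of pruning one weight under a generic layer-wise quadratic objective. It is not the SBC algorithm itself: the spike-train-based objective and its Hessian are not specified in this summary, so the Hessian here is simply assumed to be given. The function name `obs_prune_one` and all variable names are hypothetical and chosen for illustration only.

```python
import numpy as np


def obs_prune_one(w, H_inv):
    """Prune the single weight with the lowest OBS saliency (illustrative sketch).

    w     : (d,) weight vector of one output neuron
    H_inv : (d, d) inverse Hessian of an assumed layer-wise quadratic objective
    Returns the compensated weights, the pruned index, and the updated inverse Hessian.
    """
    diag = np.diag(H_inv)
    # OBS saliency: estimated loss increase if weight q is removed
    # and the remaining weights are optimally adjusted.
    saliency = w ** 2 / (2.0 * diag)
    q = int(np.argmin(saliency))
    # Closed-form compensation of the remaining weights.
    w_new = w - (w[q] / diag[q]) * H_inv[:, q]
    w_new[q] = 0.0  # enforce an exact zero despite floating-point rounding
    # Rank-one update that zeroes row and column q of the inverse Hessian.
    H_inv_new = H_inv - np.outer(H_inv[:, q], H_inv[q, :]) / diag[q]
    return w_new, q, H_inv_new
```

As a usage note, for a layer-wise squared-error objective on calibration inputs `X` of shape (d, n), one would typically take `H = 2 * X @ X.T` (often with a small damping term added to the diagonal) and pass `np.linalg.inv(H)` as `H_inv`. The sketch handles a single pruning step; repeated calls would additionally need to skip indices that are already pruned.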
Peer Reviews
Decision · Submitted to ICLR 2026
1. The paper introduces a highly innovative approach by adapting a second-order compression framework to SNNs. The use of a spike-train-based objective function derived from the Van Rossum Distance, coupled with the efficient Surrogate Membrane Potential (SMP) Hessian approximation, provides a theoretically sound and effective solution for this challenging problem. 2. The method demonstrates SOTA results for one-shot, post-training SNN compression. By achieving high accuracy while drastically red
1. Mismatch between claims and experimental validation on Spiking Transformers: The paper repeatedly claims its applicability to complex SNNs, including Spiking Transformers (e.g., line 053, line 089, and a detailed derivation in Appendix B.2). However, the experimental section provides no results on any Spiking Transformer architecture; validation is limited to CNN and ResNet-based models. 2. The motivation for SNN quantization is unclear: The introduction (line 062) states that PTQ for SNN
Originality: 1. Extends second-order compression theory (OBS) into the spiking domain. 2. Introduces a VRD-based spike-train similarity loss and surrogate Hessian (SMP), aligning compression with temporal spike dynamics. 3. Demonstrates scalable one-shot compression for large SNNs such as Spiking ResNets and Spiking Transformers. Quality: Theoretical derivations are mathematically rigorous and well-grounded in prior compression literature. Experimental results are thorough and convincing: multiple da
1. Limited novelty. SBC extends second-order post-training compression (OBS/OBC) to the spiking domain by introducing a spike-train–based loss (VRD) and an efficient surrogate Hessian (SMP), enabling accurate one-shot pruning and quantization of large SNNs without retraining. However, this novelty is incremental rather than fundamental: it is an adaptation of existing ANN second-order frameworks to SNNs, not a new theoretical paradigm. The main creative step is the inte
1. Comprehensive biologically inspired design: The hierarchical compression strategy (structure, dynamics, and learning) provides an elegant conceptual alignment with real neural systems. 2. Strong empirical performance: The method achieves ultra-high sparsity with minimal accuracy loss and demonstrates superior energy efficiency and spike sparsity compared to state-of-the-art baselines on diverse datasets. 3. Ablation studies isolate the contribution of each compression level, which is easy for readers to und
1. While biologically motivated, the paper lacks formal theoretical proofs regarding convergence, optimality, or stability of the multi-level compression process. 2. The multi-level, multi-phase training pipeline is computationally expensive, involving multiple forward passes and adaptation stages. 3. There are no discussions of wall-clock training time or scalability to large models (e.g., ImageNet-scale SNNs).
Taxonomy
Topics: Advanced Memory and Neural Computing · Neural dynamics and brain function · Ferroelectric and Negative Capacitance Devices
