Can Deep Neural Networks be Converted to Ultra Low-Latency Spiking Neural Networks?
Gourav Datta, Peter A. Beerel

TL;DR
This paper introduces a new training method for converting deep neural networks into ultra low-latency spiking neural networks, significantly reducing energy consumption and inference time while maintaining accuracy.
Contribution
The authors develop an accurate distribution-aware training algorithm that enables ultra low-latency SNNs with high sparsity, outperforming existing conversion strategies.
Findings
Achieved 64.19% top-1 accuracy with 2 time steps on CIFAR-100.
Reduced compute energy by approximately 159.2x compared to DNNs.
Inference speed improved by 2.5-8x over other SOTA SNN models.
Abstract
Spiking neural networks (SNNs), which operate via binary spikes distributed over time, have emerged as a promising energy-efficient ML paradigm for resource-constrained devices. However, the current state-of-the-art (SOTA) SNNs require multiple time steps for acceptable inference accuracy, increasing spiking activity and, consequently, energy consumption. SOTA training strategies for SNNs involve conversion from a non-spiking deep neural network (DNN). In this paper, we determine that SOTA conversion strategies cannot yield ultra low latency because they incorrectly assume that the DNN and SNN pre-activation values are uniformly distributed. We propose a new training algorithm that accurately captures these distributions, minimizing the error between the DNN and converted SNN. The resulting SNNs have ultra low latency and high activation sparsity, yielding significant improvements in…
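To make the latency problem in the abstract concrete, here is a minimal sketch of the gap between a DNN activation and the spike rate of a converted neuron. It is not the authors' training algorithm. It assumes a single integrate-and-fire neuron with a constant input, a soft reset (subtracting the threshold after a spike), and a clipped ReLU as the DNN activation; the function names `relu_clipped`, `if_neuron_rate`, and `mean_abs_error` are hypothetical and chosen only for this illustration.

```python
def relu_clipped(x, v_th=1.0):
    """Clipped ReLU: the DNN activation the spiking neuron should approximate."""
    return min(max(x, 0.0), v_th)


def if_neuron_rate(x, time_steps, v_th=1.0):
    """Simulate one integrate-and-fire neuron driven by a constant input x.

    Returns the spike rate scaled by the threshold, which stands in for
    the DNN activation after conversion (assumption: soft reset).
    """
    membrane = 0.0
    spikes = 0
    for _ in range(time_steps):
        membrane += x            # integrate the input
        if membrane >= v_th:     # emit a binary spike
            spikes += 1
            membrane -= v_th     # soft reset: subtract the threshold
    return spikes * v_th / time_steps


def mean_abs_error(inputs, time_steps, v_th=1.0):
    """Average gap between the DNN activation and the spike-rate output."""
    errors = [
        abs(relu_clipped(x, v_th) - if_neuron_rate(x, time_steps, v_th))
        for x in inputs
    ]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    # Inputs on a grid of eighths between 0 and 1 (illustrative values only).
    grid = [k / 8 for k in range(9)]
    for t in (2, 4, 8):
        print(t, mean_abs_error(grid, t))
```

With only 2 time steps the neuron can output just three levels (0, 0.5, and 1 times the threshold), so most inputs on the grid are rounded down. How large the average error is depends on where the pre-activation values actually fall, which is why the assumed distribution matters at very low latency.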
Taxonomy
Topics: Advanced Memory and Neural Computing · Ferroelectric and Negative Capacitance Devices · Neural dynamics and brain function
Methods: Average Pooling · Residual Connection · Global Average Pooling · Batch Normalization · 1x1 Convolution · Residual Block · Bottleneck Residual Block · Kaiming Initialization · Convolution
