L-SPINE: A Low-Precision SIMD Spiking Neural Compute Engine for Resource-efficient Edge Inference
Sonu Kumar, Mukul Lokhande, and Santosh Kumar Vishvakarma

TL;DR
L-SPINE is a low-precision SIMD hardware engine for spiking neural networks, enabling energy-efficient, real-time edge inference with significant latency and power improvements over traditional platforms.
Contribution
This work introduces a multi-precision, multiplier-less compute engine for SNNs, optimized for FPGA implementation to enhance efficiency and scalability for edge applications.
Findings
At the system level, uses 46.37K LUTs and 30.4K FFs on an AMD VC707 FPGA, with 2.38 ms latency and 0.54 W power.
Reduces inference latency from seconds to milliseconds compared to CPU/GPU.
Quantization to INT2/INT4 maintains accuracy while reducing memory footprint.
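To illustrate why INT2/INT4 weights shrink the memory footprint, here is a minimal sketch of symmetric quantization followed by bit-packing several low-precision values into one byte. The function names (`quantize_symmetric`, `pack`, `unpack`), the per-tensor scale, and the lane ordering are assumptions made for this example; the summary does not specify the quantization scheme the authors use.

```python
import numpy as np

def quantize_symmetric(weights, bits):
    """Map float weights to signed integers in [-(2**(bits-1)), 2**(bits-1) - 1].

    Assumption: one scale per tensor, chosen from the largest magnitude.
    """
    qmax = 2 ** (bits - 1) - 1
    qmin = -(2 ** (bits - 1))
    max_abs = float(np.max(np.abs(weights)))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), qmin, qmax).astype(np.int8)
    return q, scale

def pack(q, bits):
    """Pack signed `bits`-wide values into bytes, lowest lane first."""
    per_byte = 8 // bits
    mask = (1 << bits) - 1
    # Keep only the low `bits` bits (two's complement) of each value.
    u = (q.astype(np.int16) & mask).astype(np.uint8)
    pad = (-len(u)) % per_byte
    u = np.concatenate([u, np.zeros(pad, dtype=np.uint8)])
    lanes = u.reshape(-1, per_byte)
    packed = np.zeros(lanes.shape[0], dtype=np.uint8)
    for i in range(per_byte):
        packed |= lanes[:, i] << np.uint8(bits * i)
    return packed

def unpack(packed, bits, count):
    """Recover `count` signed values from bytes produced by `pack`."""
    per_byte = 8 // bits
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    lanes = np.stack(
        [(packed >> np.uint8(bits * i)) & mask for i in range(per_byte)], axis=1
    )
    u = lanes.reshape(-1)[:count].astype(np.int16)
    # Sign-extend: values with the top bit set are negative.
    return np.where(u >= sign, u - (1 << bits), u).astype(np.int8)

q2 = np.array([-2, -1, 0, 1, 1, 0, -1, -2], dtype=np.int8)
packed2 = pack(q2, bits=2)   # 8 INT2 weights fit in 2 bytes
restored2 = unpack(packed2, bits=2, count=len(q2))
```

With 2-bit weights, four values share one byte, so storage drops to a quarter of an INT8 layout; with 4-bit weights, it drops to half.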
Abstract
Spiking Neural Networks (SNNs) offer a promising solution for energy-efficient edge intelligence; however, their hardware deployment is constrained by memory overhead, inefficient scaling operations, and limited parallelism. This work proposes L-SPINE, a low-precision SIMD-enabled spiking neural compute engine for efficient edge inference. The architecture features a unified multi-precision datapath supporting 2-bit, 4-bit, and 8-bit operations, leveraging a multiplier-less shift-add model for neuron dynamics and synaptic accumulation. Implemented on an AMD VC707 FPGA, the proposed neuron requires only 459 LUTs and 408 FFs, achieving a critical delay of 0.39 ns and 4.2 mW power. At the system level, L-SPINE achieves 46.37K LUTs, 30.4K FFs, 2.38 ms latency, and 0.54 W power. Compared to CPU and GPU platforms, it reduces inference latency from seconds to milliseconds, achieving an up to…
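The multiplier-less shift-add idea mentioned in the abstract can be sketched with a leaky integrate-and-fire update in integer arithmetic. This is an illustrative assumption, not the authors' neuron: the function `lif_step`, the power-of-two leak, the reset-to-zero rule, and the default `leak_shift` and `threshold` values are all hypothetical choices for this example.

```python
def lif_step(v, spikes_in, weights, leak_shift=3, threshold=64):
    """One timestep of an integer leaky integrate-and-fire neuron.

    Assumptions (not taken from the paper): the leak factor is
    1 - 2**-leak_shift, and the membrane potential resets to zero on a spike.
    """
    # Leak: v * (1 - 2**-leak_shift), computed as a shift and a subtract.
    v = v - (v >> leak_shift)
    # Synaptic accumulation: input spikes are binary, so each active
    # synapse contributes its weight through an add, with no multiply.
    for s, w in zip(spikes_in, weights):
        if s:
            v += w
    fired = v >= threshold
    if fired:
        v = 0
    return v, fired

v_next, fired = lif_step(40, [1, 0, 1], [20, 5, 10])
```

In this call the leak reduces 40 to 35, the two active synapses add 20 and 10, and the result of 65 crosses the threshold of 64, so the neuron fires and resets.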
