ELSA: An ELastic SNN Inference Architecture for Efficient Neuromorphic Computing

Kang You; Chen Nie; Lee Jun Yan; Ziling Wei; Cheng Zou; Zekai Xu; Yu Feng; Honglan Jiang; Zhezhi He

arXiv:2605.20802 · cs.AR · May 21, 2026

TL;DR

ELSA is a hardware architecture that enables true elastic inference in spiking neural networks (SNNs): it streams each output onward as soon as it is produced, which reduces latency and improves energy efficiency.

Contribution

ELSA introduces a fine-grained spine/token-wise pipeline and hardware optimizations for SNNs, enabling early responses and efficient event-driven computation.

Findings

01 ELSA achieves a 3.4× speedup over the SOTA QANN accelerator.

02 ELSA attains 13.6× higher energy efficiency than the SOTA QANN accelerator.

03 ELSA outperforms the SOTA SNN accelerator with a 2.9× speedup and a 22.1× gain in energy efficiency.

Abstract

Spiking neural networks (SNNs) exploit event-driven and addition-only computation to substantially improve efficiency for intelligent computation. A key temporal property of SNNs, elastic inference, allows outputs to emerge progressively, enabling responses to salient inputs much earlier than full evaluation. However, existing SNN-specific accelerators cannot capitalize on this property. Layer-by-layer designs emit outputs only after all layers are complete, while time-step-by-time-step designs rely on coarse-grained, layer-wise pipelines that require synchronizing all spines/tokens within a layer. This barrier prevents results from being forwarded immediately, delaying the earliest possible response and forfeiting the benefits of elastic inference. To address these challenges, we propose ELSA, a near-SRAM dataflow architecture that realizes true elastic inference through a…
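To make the synchronization barrier described above concrete, here is a toy timing model in Python. It is only an illustrative sketch and is not taken from the paper. The function names, the unit cost per token, and the assumption that a token in one layer depends only on the same token in the previous layer are all hypothetical simplifications. Real convolution and attention layers have wider dependencies, so the numbers show the trend, not ELSA's measured behavior.

```python
def first_output_latency_barrier(num_layers: int, num_tokens: int, t: int = 1) -> int:
    """Toy model of a coarse-grained, layer-wise pipeline.

    Assumption: each layer processes its tokens one after another (cost t each)
    and may start only after the previous layer has finished ALL of its tokens.
    """
    # Every layer before the last must finish all tokens; the last layer
    # then needs one more token-time to emit its first output.
    return (num_layers - 1) * num_tokens * t + t


def first_output_latency_streaming(num_layers: int, num_tokens: int, t: int = 1) -> int:
    """Toy model of a fine-grained, token-wise pipeline.

    Assumption: a token is forwarded to the next layer as soon as it is
    produced, so the first token only pays one token-time per layer.
    num_tokens is kept for a symmetric signature; it does not affect the result.
    """
    return num_layers * t


if __name__ == "__main__":
    layers, tokens = 4, 16  # hypothetical sizes, chosen only for illustration
    print("barrier  :", first_output_latency_barrier(layers, tokens))
    print("streaming:", first_output_latency_streaming(layers, tokens))
```

Under these assumptions, the time to the first output grows with the number of tokens per layer in the barrier case but stays independent of it in the streaming case. That is the gap the abstract attributes to synchronizing all spines/tokens within a layer.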
