Spartus: A 9.4 TOp/s FPGA-based LSTM Accelerator Exploiting Spatio-Temporal Sparsity
Chang Gao, Tobi Delbruck, Shih-Chii Liu

TL;DR
Spartus is an FPGA-based LSTM accelerator that exploits spatio-temporal sparsity through structured pruning and temporal sparsity extension, achieving high throughput and low latency for speech recognition tasks.
Contribution
This paper introduces Spartus, the first FPGA accelerator to exploit combined spatio-temporal sparsity in LSTMs, with novel pruning and sparsity extension methods.
Findings
Achieves weight sparsity of up to 96% on TIMIT and 94% on Librispeech with negligible accuracy loss.
Reaches 9.4 TOp/s effective throughput with 1.1 TOp/s/W efficiency.
Latency of 1 microsecond per sample for a 1024-neuron LSTM layer.
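To put the headline numbers in perspective, the short calculation below derives two quantities from the figures quoted above. It is back-of-the-envelope arithmetic only, not a result reported by the authors. The function names are made up for illustration, and the implied power assumes the throughput and efficiency figures refer to the same operating point.

```python
# Back-of-the-envelope arithmetic using only the figures quoted in Findings.
# Function names are illustrative, not taken from the paper.

def compression_factor(sparsity: float) -> float:
    """Ratio of dense weight count to remaining nonzero weight count."""
    return 1.0 / (1.0 - sparsity)


def implied_power_watts(throughput_tops: float, efficiency_tops_per_watt: float) -> float:
    """Power implied by dividing throughput by power efficiency.

    Assumes both figures describe the same operating point.
    """
    return throughput_tops / efficiency_tops_per_watt


# 96% weight sparsity leaves 4% of the weights, i.e. roughly a 25x reduction.
factor_96 = compression_factor(0.96)
# 94% weight sparsity leaves 6% of the weights, i.e. roughly a 16.7x reduction.
factor_94 = compression_factor(0.94)
# 9.4 TOp/s at 1.1 TOp/s/W implies roughly 8.5 W under the stated assumption.
power = implied_power_watts(9.4, 1.1)

print(f"{factor_96:.1f}x, {factor_94:.1f}x, {power:.1f} W")
```

Because the 9.4 TOp/s figure is an effective throughput, it counts operations that sparsity lets the hardware skip, so it should not be read as a dense peak rate.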
Abstract
Long Short-Term Memory (LSTM) recurrent networks are frequently used for tasks involving time-sequential data such as speech recognition. Unlike previous LSTM accelerators that either exploit spatial weight sparsity or temporal activation sparsity, this paper proposes a new accelerator called "Spartus" that exploits spatio-temporal sparsity to achieve ultra-low latency inference. Spatial sparsity is induced using a new Column-Balanced Targeted Dropout (CBTD) structured pruning method, producing structured sparse weight matrices for a balanced workload. The pruned networks running on Spartus hardware achieve weight sparsity levels of up to 96% and 94% with negligible accuracy loss on the TIMIT and the Librispeech datasets. To induce temporal sparsity in LSTM, we extend the previous DeltaGRU method to the DeltaLSTM method. Combining spatio-temporal sparsity with CBTD and DeltaLSTM saves…
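The two sparsity sources described in the abstract can be illustrated with a small NumPy sketch. This is a simplified illustration under stated assumptions, not the authors' CBTD or DeltaLSTM implementation: the pruning here is one-shot magnitude pruning with an equal nonzero budget per column (the paper's method applies targeted dropout during training), and the delta step follows the generic thresholded-difference rule used by delta networks. All function names, shapes, and the threshold value are hypothetical.

```python
import numpy as np


def column_balanced_prune(W, sparsity):
    """Keep the same number of largest-magnitude weights in every column.

    Simplified stand-in for column-balanced structured pruning: each column
    ends up with an identical nonzero count, so each column costs equal work.
    """
    rows, cols = W.shape
    keep = max(1, int(round(rows * (1.0 - sparsity))))
    # Row indices of the `keep` largest magnitudes in each column.
    top = np.argsort(-np.abs(W), axis=0)[:keep, :]
    mask = np.zeros(W.shape, dtype=bool)
    mask[top, np.arange(cols)] = True
    return W * mask


def delta_encode(x, x_ref, threshold):
    """Emit only changes larger than `threshold`; otherwise emit zero.

    The reference state is updated only for elements that fired.
    """
    diff = x - x_ref
    fire = np.abs(diff) > threshold
    delta = np.where(fire, diff, 0.0)
    new_ref = np.where(fire, x, x_ref)
    return delta, new_ref


rng = np.random.default_rng(0)
W = rng.standard_normal((64, 8))            # 64 outputs, 8 inputs (toy sizes)
Wp = column_balanced_prune(W, sparsity=0.75)
nnz_per_col = np.count_nonzero(Wp, axis=0)  # same count in every column

# Slowly varying input sequence, so many deltas fall below the threshold.
x = rng.standard_normal(8)
x_ref = np.zeros(8)
m = np.zeros(64)                            # accumulated pre-activation
skipped = 0
for _ in range(20):
    x = x + 0.05 * rng.standard_normal(8)
    delta, x_ref = delta_encode(x, x_ref, threshold=0.1)
    active = np.flatnonzero(delta)
    skipped += delta.size - active.size
    # Only columns with a nonzero delta are fetched and multiplied.
    m += Wp[:, active] @ delta[active]

print(nnz_per_col, skipped)
```

In this sketch the accumulated vector `m` always equals `Wp @ x_ref`, which shows that the delta update is an exact computation on the thresholded reference state rather than on the raw input. Skipped columns and pruned weights both reduce the work per step, which is the combined spatio-temporal saving the abstract refers to.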
Taxonomy
Methods: Pruning · Tanh Activation · Targeted Dropout · Sigmoid Activation · Long Short-Term Memory · Dropout
