C-LSTM: Enabling Efficient LSTM using Structured Compression Techniques on FPGAs
Shuo Wang, Zhe Li, Caiwen Ding, Bo Yuan, Yanzhi Wang, Qinru Qiu, Yun Liang

TL;DR
This paper introduces C-LSTM, a structured compression framework for LSTM models on FPGAs that significantly improves performance and energy efficiency by using block-circulant matrices and FFT-based acceleration, with minimal accuracy loss.
Contribution
The paper proposes a structured compression technique that represents LSTM weight matrices as block-circulant matrices and uses FFT to accelerate LSTM inference on FPGAs, outperforming previous unstructured pruning methods.
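To make the block-circulant idea concrete, here is a minimal NumPy sketch of a block-circulant matrix-vector product computed with FFTs. It is an illustration under stated assumptions, not the paper's FPGA implementation: the function names (`circulant`, `block_circulant_matvec`, `dense_from_blocks`), the first-column convention for each circulant block, and the block size `k` are all chosen here for the example. Each k x k block is stored as a single length-k vector, so storage drops by a factor of k relative to the dense matrix, and each block product becomes an element-wise multiplication in the frequency domain.

```python
import numpy as np

def circulant(w):
    """Dense k x k circulant matrix whose first column is w (assumed convention)."""
    k = len(w)
    # Column j is w cyclically shifted down by j positions.
    return np.stack([np.roll(w, j) for j in range(k)], axis=1)

def dense_from_blocks(W):
    """Expand defining vectors W of shape (p, q, k) into a dense (p*k, q*k) matrix."""
    p, q, _ = W.shape
    return np.block([[circulant(W[i, j]) for j in range(q)] for i in range(p)])

def block_circulant_matvec(W, x):
    """Multiply the block-circulant matrix defined by W with x using FFTs.

    W: (p, q, k) array, one defining vector per k x k circulant block.
    x: vector of length q*k. Returns a vector of length p*k.
    """
    p, q, k = W.shape
    X = np.fft.rfft(x.reshape(q, k), axis=1)   # FFT of each input sub-vector
    Wf = np.fft.rfft(W, axis=2)                # FFT of each defining vector
    # Circular convolution is an element-wise product in the frequency
    # domain; sum over the q blocks in each block row before inverting.
    Y = np.einsum("pqf,qf->pf", Wf, X)
    return np.fft.irfft(Y, n=k, axis=1).reshape(p * k)

# Hypothetical sizes for illustration only.
rng = np.random.default_rng(0)
p, q, k = 2, 3, 4
W = rng.standard_normal((p, q, k))
x = rng.standard_normal(q * k)

y_fft = block_circulant_matvec(W, x)
y_dense = dense_from_blocks(W) @ x
```

In this sketch `y_fft` and `y_dense` agree up to floating-point error, while `W` holds only `p*q*k` values instead of the `p*q*k*k` entries of the dense matrix. Summing in the frequency domain means only one inverse FFT is needed per block row.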
Findings
Achieves up to 18.8X performance gain
Achieves up to 33.5X energy efficiency improvement
Maintains minimal accuracy degradation
Abstract
Recently, significant accuracy improvement has been achieved for acoustic recognition systems by increasing the model size of Long Short-Term Memory (LSTM) networks. Unfortunately, the ever-increasing size of LSTM models leads to inefficient designs on FPGAs due to the limited on-chip resources. Previous work proposes to use a pruning-based compression technique to reduce the model size and thus speed up inference on FPGAs. However, the random nature of the pruning technique transforms the dense matrices of the model into highly unstructured sparse ones, which leads to unbalanced computation and irregular memory accesses and thus hurts the overall performance and energy efficiency. In contrast, we propose to use a structured compression technique which could not only reduce the LSTM model size but also eliminate the irregularities of computation and memory accesses. This approach…
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis
Methods: Pruning · Sigmoid Activation · Tanh Activation · Long Short-Term Memory
