FINN-GL: Generalized Mixed-Precision Extensions for FPGA-Accelerated LSTMs

Shashwat Khandelwal; Jakoba Petri-Koenig; Thomas B. Preußer; Michaela Blott; Shreejith Shanker

arXiv:2506.20810 · cs.LG · June 27, 2025

Open Access

TL;DR

This paper introduces a generalized FPGA-based deployment framework for LSTMs, enabling mixed-precision quantization and efficient hardware implementation, thus facilitating resource-efficient real-time RNN inference.

Contribution

It extends the FINN framework to support LSTMs using the ONNX Scan operator, enabling mixed-precision quantization and hardware mapping for FPGA acceleration.

Findings

01. Achieves a balance between latency and resource use in FPGA LSTM accelerators.

02. Maintains or improves inference accuracy with reduced precision.

03. Demonstrates effectiveness on a stock prediction task with an FPGA implementation.

Abstract

Recurrent neural networks (RNNs), particularly LSTMs, are effective for time-series tasks like sentiment analysis and short-term stock prediction. However, their computational complexity poses challenges for real-time deployment in resource-constrained environments. While FPGAs offer a promising platform for energy-efficient AI acceleration, existing tools mainly target feed-forward networks, and LSTM acceleration typically requires full custom implementation. In this paper, we address this gap by leveraging the open-source and extensible FINN framework to enable the generalized deployment of LSTMs on FPGAs. Specifically, we leverage the Scan operator from the Open Neural Network Exchange (ONNX) specification to model the recurrent nature of LSTM computations, enabling support for mixed quantisation within them and functional verification of LSTM-based models. Furthermore, we introduce…
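To make the role of the Scan operator concrete, the following is a minimal NumPy sketch of its semantics applied to an LSTM cell: a loop body receives the carried state and one time step of input, and returns the updated state plus one per-step output. It is an illustration only, not the FINN-GL implementation or any FINN API. The helper names (`quantize`, `make_lstm_body`, `scan`), the bit-widths, and the scales in `cfg` are all assumptions chosen for the example, and the gate nonlinearities are left in floating point for brevity.

```python
import numpy as np

def quantize(x, bits, scale):
    """Uniform signed fake-quantization: snap to a grid of step `scale`, clip to `bits` bits."""
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return np.clip(np.round(x / scale), qmin, qmax) * scale

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def make_lstm_body(W, U, b, cfg):
    """Build the loop body: (h, c, x_t) -> (h_new, c_new, y_t)."""
    def body(h, c, x):
        z = W @ x + U @ h + b                 # stacked pre-activations
        i, f, g, o = np.split(z, 4)           # input, forget, candidate, output
        # Cell and hidden state use different (hypothetical) precisions.
        c_new = quantize(sigmoid(f) * c + sigmoid(i) * np.tanh(g), *cfg["cell"])
        h_new = quantize(sigmoid(o) * np.tanh(c_new), *cfg["hidden"])
        return h_new, c_new, h_new            # state, state, per-step output
    return body

def scan(body, init_state, xs):
    """Scan-style loop: thread state through `body`, stack the per-step outputs."""
    state, outputs = tuple(init_state), []
    for x in xs:                              # iterate over the sequence axis
        *state, y = body(*state, x)
        outputs.append(y)
    return tuple(state), np.stack(outputs)

# Hypothetical mixed-precision configuration: name -> (bits, scale).
cfg = {"weight": (4, 2.0 ** -3), "cell": (8, 2.0 ** -5), "hidden": (6, 2.0 ** -5)}

rng = np.random.default_rng(0)
T, D, H = 5, 3, 4                             # sequence length, input size, hidden size
W = quantize(rng.normal(size=(4 * H, D)), *cfg["weight"])
U = quantize(rng.normal(size=(4 * H, H)), *cfg["weight"])
b = np.zeros(4 * H)
xs = rng.normal(size=(T, D))

(h_T, c_T), ys = scan(make_lstm_body(W, U, b, cfg), (np.zeros(H), np.zeros(H)), xs)
print(ys.shape)  # one hidden-state vector per time step
```

Keeping the recurrence in an explicit loop body means the per-step computation is an ordinary feed-forward graph, which is the kind of graph a dataflow compiler can already handle; the carried state is the only recurrent dependency.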

Taxonomy

Topics: Advanced Numerical Analysis Techniques · Advanced Surface Polishing Techniques · Digital Filter Design and Implementation

Methods: Convolution · ConvLSTM · Long Short-Term Memory