Exploiting temporal parallelism for LSTM Autoencoder acceleration on FPGA

Aimilios Leftheriotis, Dimosthenis Masouros, Dimitrios Soudris, and George Theodoridis

arXiv:2603.13982 · cs.AR · March 17, 2026

Open Access

TL;DR

This paper presents an FPGA-based accelerator that exploits temporal parallelism to process multiple LSTM autoencoder layers concurrently, achieving a latency speedup of up to 79.6x and an energy reduction of up to 1722x over a CPU for real-time anomaly detection.

Contribution

The paper introduces a dataflow FPGA architecture that leverages temporal parallelism for multi-layer LSTM processing, surpassing prior single-layer optimization approaches.

Findings

01 Latency speedup up to 79.6x over CPU

02 Energy reduction up to 1722x compared to CPU

03 Superior scalability for deeper networks

Abstract

Recurrent Neural Networks (RNNs) are vital for sequential data processing. Long Short-Term Memory Autoencoders (LSTM-AEs) are particularly effective for unsupervised anomaly detection in time-series data. However, inherent sequential dependencies limit parallel computation. While previous work has explored FPGA-based acceleration for LSTM networks, efforts have typically focused on optimizing a single LSTM layer at a time. We introduce a novel FPGA-based accelerator using a dataflow architecture that exploits temporal parallelism for concurrent multi-layer processing of different timesteps within sequences. Experimental evaluations on four representative LSTM-AE models with varying widths and depths, implemented on a Zynq UltraScale+ MPSoC FPGA, demonstrate significant advantages over CPU (Intel Xeon Gold 5218R) and GPU (NVIDIA V100) implementations. Our accelerator achieves latency…
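The idea of processing different timesteps in different layers at the same time can be illustrated with a small scheduling sketch. This is not the authors' hardware design; it is an idealized model that assumes every LSTM cell evaluation takes one unit step, and the names `wavefront_schedule` and `layer_by_layer_steps` are hypothetical. Layer `l` can start timestep `t` once layer `l - 1` has finished timestep `t` and layer `l` has finished timestep `t - 1`, so in this model the pair `(l, t)` runs at step `l + t`.

```python
def wavefront_schedule(num_layers, num_timesteps):
    """Idealized pipelined schedule (an assumption, not the paper's design).

    Returns a list of steps; each step is the list of (layer, timestep)
    pairs that can run concurrently under a one-unit-per-cell cost model.
    """
    steps = []
    for step in range(num_timesteps + num_layers - 1):
        # Layer l works on timestep (step - l), if that timestep exists.
        active = [
            (layer, step - layer)
            for layer in range(num_layers)
            if 0 <= step - layer < num_timesteps
        ]
        steps.append(active)
    return steps


def layer_by_layer_steps(num_layers, num_timesteps):
    """Step count when each layer finishes the whole sequence before the next starts."""
    return num_layers * num_timesteps


# Hypothetical sizes chosen only for illustration.
schedule = wavefront_schedule(num_layers=4, num_timesteps=8)
pipelined = len(schedule)
sequential = layer_by_layer_steps(num_layers=4, num_timesteps=8)

for step, active in enumerate(schedule):
    print(step, active)
print("pipelined steps:", pipelined, "layer-by-layer steps:", sequential)
```

Under these assumptions the pipelined schedule needs `T + L - 1` steps instead of `L * T`, so the gap widens as depth `L` grows. Real speedups depend on per-layer cost, resource limits, and memory bandwidth, none of which this sketch models.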

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Time Series Analysis and Forecasting · Adversarial Robustness in Machine Learning