FPGA-Accelerated SpeckleNN with SNL for Real-time X-ray Single-Particle Imaging
Abhilasha Dave, Cong Wang, James Russell, Ryan Herbst, and Jana Thayer

TL;DR
This paper presents an FPGA-accelerated SpeckleNN model optimized for real-time speckle pattern classification in X-ray Single-Particle Imaging, achieving an 8.9x speedup and a 7.8x power reduction over an NVIDIA A100 GPU implementation.
Contribution
The paper introduces a specialized, highly compressed SpeckleNN model (64.6K parameters, down from 5.6M) with dynamic weight loading via SNL, enabling real-time inference without FPGA re-synthesis.
Findings
8.9x faster inference than an NVIDIA A100 GPU (45.015 µs vs. 400 µs latency)
7.8x lower power consumption than the GPU (9.4 W vs. ~73 W)
45.015 µs inference latency at a 200 MHz clock on a KCU1500 FPGA
Abstract
We implement a specialized version of our SpeckleNN model for real-time speckle pattern classification in X-ray Single-Particle Imaging (SPI) using the SLAC Neural Network Library (SNL) on an FPGA. This hardware is optimized for inference near detectors in high-throughput X-ray free-electron laser (XFEL) facilities like the Linac Coherent Light Source (LCLS). To fit FPGA constraints, we optimized SpeckleNN, reducing parameters from 5.6M to 64.6K (98.8% reduction) with 90% accuracy. We also compressed the latent space from 128 to 50 dimensions. Deployed on a KCU1500 FPGA, the model used 71% of DSPs, 75% of LUTs, and 48% of FFs, with an average power consumption of 9.4 W. The FPGA achieved 45.015 µs inference latency at 200 MHz. On an NVIDIA A100 GPU, the same inference consumed ~73 W and had a 400 µs latency. Our FPGA version achieved an 8.9x speedup and 7.8x power reduction over the GPU.…
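The headline ratios in the abstract can be re-derived from the raw figures it quotes. The short Python sketch below does that arithmetic. The variable names are illustrative only. The energy-per-inference estimate is not a number from the paper: it rests on the assumption that the quoted average power applies over the whole latency window.

```python
# Figures quoted in the abstract.
FPGA_LATENCY_US = 45.015   # FPGA inference latency, microseconds
GPU_LATENCY_US = 400.0     # NVIDIA A100 inference latency, microseconds
FPGA_POWER_W = 9.4         # average FPGA power, watts
GPU_POWER_W = 73.0         # GPU power, watts (abstract gives "~73 W")
PARAMS_BEFORE = 5.6e6      # original SpeckleNN parameter count
PARAMS_AFTER = 64.6e3      # compressed parameter count
CLOCK_MHZ = 200.0          # FPGA clock frequency

# Ratios reported in the abstract.
speedup = GPU_LATENCY_US / FPGA_LATENCY_US
power_ratio = GPU_POWER_W / FPGA_POWER_W
param_reduction_pct = 100.0 * (1.0 - PARAMS_AFTER / PARAMS_BEFORE)

# Latency expressed in clock cycles: microseconds * MHz = cycles.
latency_cycles = round(FPGA_LATENCY_US * CLOCK_MHZ)

# ASSUMPTION (not from the paper): energy per inference is approximated
# as average power times latency. W * microseconds = microjoules.
fpga_energy_uj = FPGA_POWER_W * FPGA_LATENCY_US
gpu_energy_uj = GPU_POWER_W * GPU_LATENCY_US
energy_ratio = gpu_energy_uj / fpga_energy_uj

print(f"speedup:              {speedup:.1f}x")
print(f"power reduction:      {power_ratio:.1f}x")
print(f"parameter reduction:  {param_reduction_pct:.1f}%")
print(f"latency in cycles:    {latency_cycles}")
print(f"energy ratio (est.):  {energy_ratio:.0f}x")
```

The first three values agree with the 8.9x, 7.8x, and 98.8% stated in the abstract, and the latency corresponds to about 9003 cycles at 200 MHz. The energy ratio is simply the product of the speedup and the power ratio, so it should be read as a rough estimate under the stated assumption rather than a measured result.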
Taxonomy
Topics: Advanced X-ray Imaging Techniques · Astrophysical Phenomena and Observations · Particle Detector Development and Performance
Methods: Lib
