Design Rules for Extreme-Edge Scientific Computing on AI Engines
Zhenghua Ma, G Abarajithan, Dimitrios Danopoulos, Olivia Weng, Francesco Restuccia, Ryan Kastner

TL;DR
This paper evaluates AI Engines against programmable logic for extreme-edge scientific neural networks. It provides architectural insights and optimization strategies, and demonstrates successful deployments on FPGA SoCs.
Contribution
It introduces the LARE metric for comparing AI Engines and programmable logic, and offers dataflow optimizations for low-latency scientific inference.
Findings
AI Engines outperform programmable logic for larger models, as measured by the LARE metric.
Spatial and API-level dataflow optimizations improve inference latency.
End-to-end neural networks that do not fit on programmable logic can be deployed on AI Engines.
Abstract
Extreme-edge scientific applications use machine learning models to analyze sensor data and make real-time decisions. Their stringent latency and throughput requirements demand small batch sizes and require that model weights remain fully on-chip. Spatial dataflow implementations are common for extreme-edge applications. Spatial dataflow works well for small networks, but it fails to scale to larger models due to inherent resource scaling limitations. AI Engines on modern FPGA SoCs offer a promising alternative with high compute density and additional on-chip memory. However, the architecture, programming model, and performance-scaling behavior of AI Engines differ fundamentally from those of the programmable logic, making direct comparison non-trivial and the benefits of using AI Engines unclear. This work addresses how and when extreme-edge scientific neural networks should be…
