Open-source FPGA-ML codesign for the MLPerf Tiny Benchmark
Hendrik Borras, Giuseppe Di Guglielmo, Javier Duarte, Nicolò Ghielmetti, Ben Hawks, Scott Hauck, Shih-Chieh Hsu, Ryan Kastner, Jason Liang, Andres Meza, Jules Muhizi, Tai Nguyen, Rushil Roy, Nhan Tran, Yaman Umuroglu, Olivia Weng, Aidan Yokuda, and Michaela Blott

TL;DR
This paper details the development and implementation of open-source FPGA-based neural network solutions for MLPerf Tiny benchmarks, emphasizing democratization, optimization, and performance on various FPGA platforms.
Contribution
It introduces a comprehensive FPGA design workflow using open-source tools for MLPerf Tiny tasks, with new optimizations and adaptable architectures for improved speed and efficiency.
Findings
Achieved inference latencies as low as 20 microseconds.
Energy consumption as low as 30 microjoules per inference.
Deployed solutions on Pynq-Z2 and Arty A7-100T FPGA platforms.
Abstract
We present our development experience and recent results for the MLPerf Tiny Inference Benchmark on field-programmable gate array (FPGA) platforms. We use the open-source hls4ml and FINN workflows, which aim to democratize AI-hardware codesign of optimized neural networks on FPGAs. We present the design and implementation process for the keyword spotting, anomaly detection, and image classification benchmark tasks. The resulting hardware implementations are quantized, configurable, spatial dataflow architectures tailored for speed and efficiency and introduce new generic optimizations and common workflows developed as a part of this work. The full workflow is presented from quantization-aware training to FPGA implementation. The solutions are deployed on system-on-chip (Pynq-Z2) and pure FPGA (Arty A7-100T) platforms. The resulting submissions achieve latencies as low as 20 μs and…
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Neural Networks and Applications · Image Processing Techniques and Applications
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
