Exploring Parallelism in FPGA-Based Accelerators for Machine Learning Applications
Sed Centeno, Christopher Sprague, Arnab A Purkayastha, Ray Simar, Neeraj Magotra

TL;DR
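This paper investigates speculative backpropagation for neural network training as a candidate for FPGA acceleration. An OpenMP implementation on CPU runs the forward pass and a speculative backward pass in parallel, applies threshold-based weight updates, and reports up to 24% faster execution with accuracy within 3-4% of baseline; FPGA synthesis is planned.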
Contribution
It introduces an FPGA-oriented implementation of speculative backpropagation, showcasing its potential for hardware acceleration and improved training efficiency.
Findings
Maximum 24% speedup in execution time on CPU
Accuracy within 3-4% of baseline
35% speedup in step execution time
Abstract
Speculative backpropagation has emerged as a promising technique to accelerate the training of neural networks by overlapping the forward and backward passes. Leveraging speculative weight updates when error gradients fall within a specific threshold reduces training time without substantially compromising accuracy. In this work, we implement speculative backpropagation on the MNIST dataset using OpenMP as the parallel programming platform. OpenMP's multi-threading capabilities enable simultaneous execution of forward and speculative backpropagation steps, significantly improving training speed. The application is planned for synthesis on a state-of-the-art FPGA to demonstrate its potential for hardware acceleration. Our CPU-based experimental results demonstrate that speculative backpropagation achieves a maximum speedup of 24% in execution time when using a threshold of 0.25, and…
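The threshold test described in the abstract can be illustrated with a minimal sketch. This is not the authors' OpenMP code: it assumes a single linear neuron with squared-error loss, and the names `forward`, `gradient`, `speculative_step`, and `cached_error` (the error remembered from an earlier step) are hypothetical. The default threshold of 0.25 is the value quoted in the abstract; the learning rate is an arbitrary choice.

```python
from concurrent.futures import ThreadPoolExecutor


def forward(weights, x):
    """Forward pass of a single linear neuron: y = w . x."""
    return sum(w * xi for w, xi in zip(weights, x))


def gradient(error, x):
    """Gradient of 0.5 * error**2 with respect to the weights."""
    return [error * xi for xi in x]


def speculative_step(weights, x, target, cached_error, threshold=0.25, lr=0.1):
    """Run one training step; return (new_weights, actual_error, hit).

    cached_error is an assumed stand-in for the error observed on an
    earlier, similar sample. It lets the backward pass start before the
    forward pass has finished.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Launch the forward pass and the speculative backward pass together.
        fwd = pool.submit(forward, weights, x)
        spec = pool.submit(gradient, cached_error, x)
        error = fwd.result() - target
        spec_grad = spec.result()

    # Accept the speculative gradient only if the real error is close
    # enough to the speculated one; otherwise redo the backward pass.
    hit = abs(error - cached_error) <= threshold
    grad = spec_grad if hit else gradient(error, x)
    new_weights = [w - lr * g for w, g in zip(weights, grad)]
    return new_weights, error, hit


if __name__ == "__main__":
    print(speculative_step([0.5, -0.5], [1.0, 2.0], 0.0, cached_error=-0.4))
    print(speculative_step([0.5, -0.5], [1.0, 2.0], 0.0, cached_error=0.5))
```

The first call accepts the speculative gradient because the actual and cached errors differ by less than the threshold; the second falls back to a regular backward pass. Python threads only illustrate the control flow here; real overlap of the two passes would come from OpenMP threads or FPGA logic, as the paper proposes.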
Taxonomy
Topics: Advanced Neural Network Applications · Numerical Methods and Algorithms · Wireless Signal Modulation Classification
