# Out-of-Order Dataflow Scheduling for FPGA Overlays

**Authors:** Siddhartha, Nachiket Kapre

arXiv: 1705.02734 · 2017-05-09

## TL;DR

This paper presents an out-of-order dataflow scheduling method for FPGA overlays that combines node tagging with static criticality sorting, delivering up to 50% better performance than FIFO-based first-come-first-serve scheduling.

## Contribution

The paper introduces an out-of-order node scheduling technique for FPGA-based dataflow soft processors. It eliminates the scheduling FIFOs used in earlier designs at the cost of a ~6% memory overhead for node labels, and improves performance on large graph workloads.

## Key findings

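- Up to 50% performance improvement over FIFO-based first-come-first-serve scheduling, while eliminating the cost of the FIFOs.
- Overlay designs of up to 300 processors, connected by a Hoplite NoC, at frequencies up to 250MHz on the Arria10 10AX115S board.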
- Small ~6% memory overhead due to node labeling scheme.

## Abstract

We exploit floating-point DSPs in the Arria10 FPGA and multi-pumping feature of the M20K RAMs to build a dataflow-driven soft processor fabric for large graph workloads. In this paper, we introduce the idea of out-of-order node scheduling across a large number of local nodes (thousands) per processor by combining an efficient node tagging scheme along with leading-one detector circuits. We use a static one-time node labeling algorithm to sort nodes based on criticality to organize local memory inside each soft processor. This translates to a small ~6% memory overhead. When compared to a memory-expensive FIFO-based first-come-first-serve approach used in previous studies, we deliver up to 50% performance improvement while eliminating the cost of the FIFOs. On the Arria10 10AX115S board, we can create an overlay design of up to 300 processors connected by high bandwidth Hoplite NoC at frequencies up to 250MHz.
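
The scheduling idea in the abstract can be illustrated with a small software model. The sketch below is not the authors' implementation: it assumes criticality is the longest path from a node to a sink of the dataflow graph (the summary does not state the exact metric), assigns local memory addresses in criticality order, and models the leading-one detector with Python's `int.bit_length`. The function names `criticality`, `label_nodes`, and `leading_one` are hypothetical.

```python
from collections import defaultdict


def criticality(nodes, edges):
    """Assumed metric: length of the longest path from each node to a sink."""
    succ = defaultdict(list)
    for u, v in edges:
        succ[u].append(v)
    memo = {}

    def depth(n):
        # A sink has depth 1; otherwise one more than its deepest successor.
        if n not in memo:
            memo[n] = 1 + max((depth(s) for s in succ[n]), default=0)
        return memo[n]

    return {n: depth(n) for n in nodes}


def label_nodes(nodes, edges):
    """Static one-time labeling: more critical nodes get higher addresses."""
    crit = criticality(nodes, edges)
    order = sorted(nodes, key=lambda n: (crit[n], n))  # ties broken by name
    return {n: addr for addr, n in enumerate(order)}


def leading_one(ready_bits):
    """Index of the most significant set bit, or -1 if no node is ready."""
    return ready_bits.bit_length() - 1


# Toy graph: a -> c -> d -> e is the long chain, b -> e is a short side path.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "c"), ("c", "d"), ("d", "e"), ("b", "e")]
addr = label_nodes(nodes, edges)

# One ready flag per local node, stored at the node's address.
ready_bits = (1 << addr["a"]) | (1 << addr["b"])  # a and b have no inputs

# A FIFO would issue whichever node arrived first; the leading-one detector
# always issues the ready node with the highest (most critical) address.
picked = leading_one(ready_bits)
name_of = {v: k for k, v in addr.items()}
print("issue:", name_of[picked])
```

Because the ready flags sit in criticality order, picking the next node reduces to a single priority-encode over a bit vector, which is why no per-processor FIFO of ready nodes is needed in this model.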

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1705.02734/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1705.02734/full.md

## References

2 references — full list in the complete paper: https://tomesphere.com/paper/1705.02734/full.md

---
Source: https://tomesphere.com/paper/1705.02734