DYNAMAP: Dynamic Algorithm Mapping Framework for Low Latency CNN Inference
Yuan Meng, Sanmukh Kuppannagari, Rajgopal Kannan, Viktor Prasanna

TL;DR
DYNAMAP is an FPGA-based framework that selects a convolution algorithm for each CNN layer to reduce end-to-end inference latency, using a unified hardware overlay and a novel design space exploration method.
Contribution
DYNAMAP introduces a unified hardware overlay that is reused across layers and supports multiple convolution algorithms, together with a polynomial-time optimization method for dynamic algorithm mapping in CNNs.
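To make the idea of per-layer algorithm mapping concrete, here is a minimal sketch, not the authors' method. It assumes a simplified setting where the layers form a single chain, each layer has a known latency under each algorithm, and switching algorithms between adjacent layers adds a fixed overhead. The labels "A", "B", "C", the function name map_algorithms, and both cost tables are hypothetical. Under these assumptions a dynamic program finds the lowest-latency assignment in time polynomial in the number of layers and algorithms.

```python
from typing import Dict, List, Tuple

# Hypothetical labels for the candidate algorithms (illustrative only).
ALGOS = ["A", "B", "C"]


def map_algorithms(
    layer_cost: List[Dict[str, float]],
    switch_cost: Dict[Tuple[str, str], float],
) -> Tuple[List[str], float]:
    """Pick one algorithm per layer of a chain to minimize total latency.

    layer_cost[i][a]    : assumed latency of layer i under algorithm a
    switch_cost[(p, a)] : assumed overhead when layer i-1 uses p and layer i uses a
    Runs in O(n * k^2) for n layers and k algorithms.
    """
    n = len(layer_cost)
    best = [dict() for _ in range(n)]  # best[i][a]: min latency of layers 0..i ending in a
    back = [dict() for _ in range(n)]  # back[i][a]: algorithm chosen for layer i-1

    for a in ALGOS:
        best[0][a] = layer_cost[0][a]

    for i in range(1, n):
        for a in ALGOS:
            prev, total = min(
                ((p, best[i - 1][p] + switch_cost[(p, a)]) for p in ALGOS),
                key=lambda t: t[1],
            )
            best[i][a] = total + layer_cost[i][a]
            back[i][a] = prev

    # Walk the back-pointers from the cheapest final state.
    last = min(ALGOS, key=lambda a: best[n - 1][a])
    total = best[n - 1][last]
    plan = [last]
    for i in range(n - 1, 0, -1):
        last = back[i][last]
        plan.append(last)
    plan.reverse()
    return plan, total
```

The chain assumption is the main simplification: networks such as GoogleNet contain branching modules, which this sketch does not handle.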
Findings
Achieves up to 2.8x speedup on GoogleNet
Achieves up to 1.4x speedup on Inception-V4
Supports diverse CNN layer characteristics efficiently
Abstract
Most of the existing work on FPGA acceleration of Convolutional Neural Networks (CNNs) focuses on employing a single strategy (algorithm, dataflow, etc.) across all the layers. Such an approach does not achieve optimal latency on complex and deep CNNs. Emerging CNNs have diverse per-layer computation characteristics, including parallelism, arithmetic intensity, locality, and memory footprint. Per-layer strategy selection and fine-grained tuning are required to achieve low end-to-end latency. However, specialized hardware modules dedicated to each layer limit the per-layer utilization and adversely affect end-to-end latency. In this paper, we address these problems with an algorithm-architecture co-optimization framework, DYNAMAP, consisting of (1) a unified hardware overlay that can be reused across layers, supporting dynamic mapping of all three families of popular convolution algorithms, and…
