CNN-MERP: An FPGA-Based Memory-Efficient Reconfigurable Processor for Forward and Backward Propagation of Convolutional Neural Networks
Xushen Han, Dajiang Zhou, Shihao Wang, and Shinji Kimura

TL;DR
This paper introduces CNN-MERP, an FPGA-based reconfigurable CNN processor that lowers external memory bandwidth requirements and raises processing throughput through an optimized memory hierarchy and reconfigurability.
Contribution
The paper presents a novel FPGA-based CNN processor with an efficient memory hierarchy and dual reconfigurability, achieving lower bandwidth requirements and higher throughput than previous solutions.
Findings
Achieves a 55% lower external memory bandwidth requirement than previous solutions.
Attains 1244 GFlop/s throughput on a Virtex UltraScale FPGA.
Outperforms state-of-the-art FPGA CNN implementations by a factor of 5.48 in throughput.
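The reported figures can be related to one another with a little arithmetic. The sketch below is illustrative only: it assumes that "5.48 times" is a plain throughput ratio and that "55% lower" means the requirement falls to 45% of the comparison point. The function names are hypothetical and the derived values are implied by the summary, not quoted from the paper.

```python
# Figures as stated in the Findings list.
THROUGHPUT_GFLOPS = 1244.0   # reported system throughput
SPEEDUP = 5.48               # reported ratio over prior FPGA implementations
BANDWIDTH_REDUCTION = 0.55   # reported reduction in bandwidth requirement


def implied_baseline_throughput(throughput: float, speedup: float) -> float:
    """Throughput of the comparison design, assuming a plain ratio."""
    return throughput / speedup


def remaining_bandwidth_fraction(reduction: float) -> float:
    """Fraction of the comparison design's bandwidth requirement still needed."""
    return 1.0 - reduction


baseline = implied_baseline_throughput(THROUGHPUT_GFLOPS, SPEEDUP)
fraction = remaining_bandwidth_fraction(BANDWIDTH_REDUCTION)

print(f"Implied baseline throughput: {baseline:.1f} GFlop/s")
print(f"Remaining bandwidth requirement: {fraction:.0%} of the baseline")
```

Under these assumptions, the comparison design would run at roughly 227 GFlop/s, and CNN-MERP would need a little under half of its external memory traffic per unit of computation.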
Abstract
Large-scale deep convolutional neural networks (CNNs) are widely used in machine learning applications. While CNNs involve huge complexity, VLSI (ASIC and FPGA) chips that deliver high-density integration of computational resources are regarded as a promising platform for CNN implementation. With massively parallel computational units, however, the external memory bandwidth, which is constrained by the pin count of the VLSI chip, becomes the system bottleneck. Moreover, VLSI solutions are usually regarded as lacking the flexibility to be reconfigured for the various parameters of CNNs. This paper presents CNN-MERP to address these issues. CNN-MERP incorporates an efficient memory hierarchy that significantly reduces the bandwidth requirements through multiple optimizations, including on/off-chip data allocation, data flow optimization, and data reuse. The proposed 2-level…
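The bandwidth bottleneck described in the abstract can be illustrated with a simple roofline-style bound: sustained throughput cannot exceed either the compute peak or the rate at which external memory can feed the computation. This is a generic model, not the paper's own analysis. The function name and every number below are hypothetical, chosen only to show how a lower traffic requirement per unit of computation lifts the bandwidth-imposed ceiling.

```python
def attainable_gflops(peak_gflops: float,
                      dram_bandwidth_gb_s: float,
                      traffic_mb_per_gflop: float) -> float:
    """Roofline-style bound on sustained throughput.

    peak_gflops          -- compute peak of the chip (GFlop/s)
    dram_bandwidth_gb_s  -- external memory bandwidth (GB/s, 1 GB = 1000 MB)
    traffic_mb_per_gflop -- external memory traffic needed per GFlop of work
    """
    bandwidth_mb_s = dram_bandwidth_gb_s * 1000.0
    # (MB/s) / (MB/GFlop) = GFlop/s that the memory system can sustain.
    bandwidth_bound = bandwidth_mb_s / traffic_mb_per_gflop
    return min(peak_gflops, bandwidth_bound)


# Hypothetical platform: 1000 GFlop/s peak, 10 GB/s of DRAM bandwidth.
before = attainable_gflops(1000.0, 10.0, 20.0)  # heavy off-chip traffic
after = attainable_gflops(1000.0, 10.0, 9.0)    # 55% less traffic per GFlop

print(f"Before: {before:.0f} GFlop/s, after: {after:.0f} GFlop/s")
```

With these made-up numbers the design is bandwidth-bound at 500 GFlop/s before the traffic reduction and compute-bound at the 1000 GFlop/s peak after it, which is the kind of shift that on-chip data reuse aims for.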
Taxonomy
Topics: Advanced Neural Network Applications · CCD and CMOS Imaging Sensors · Embedded Systems Design Techniques
