A Unified Hardware Architecture for Convolutions and Deconvolutions in CNN
Lin Bai, Yecheng Lyu, Xinming Huang

TL;DR
This paper introduces a scalable, unified hardware architecture for CNNs that efficiently handles both convolution and deconvolution operations using shared resources, optimizing memory access and improving performance on FPGA.
Contribution
It presents a novel unified hardware design that supports both convolution and deconvolution in CNNs, reducing resource duplication and enhancing efficiency.
Findings
Achieves 151.5 GOPS for convolution and 94.3 GOPS for deconvolution.
SegNet-Basic was implemented on a Xilinx ZC706 FPGA using the unified architecture.
Applicable to various CNNs with deconvolution layers.
Abstract
In this paper, a scalable neural network hardware architecture for image segmentation is proposed. Both convolution and deconvolution operations are handled by the same processing element array, so the two share computing resources. In addition, access to on-chip and off-chip memories is optimized to alleviate the burden introduced by partial sums. As an example, SegNet-Basic has been implemented with the proposed unified architecture on a Xilinx ZC706 FPGA, achieving 151.5 GOPS for convolution and 94.3 GOPS for deconvolution. This unified convolution/deconvolution design is applicable to other CNNs with deconvolution layers.
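To illustrate why one compute array can serve both operations, the sketch below uses a standard identity: a transposed convolution (deconvolution) gives the same result as an ordinary convolution applied to a zero-inserted, padded input with a flipped kernel. This is not the paper's dataflow, which the abstract does not describe; the function names `conv2d_valid`, `deconv2d_scatter`, and `deconv2d_via_conv` are hypothetical, and the example assumes a single channel, a square stride, and no output cropping.

```python
import numpy as np

def conv2d_valid(x, w):
    """Plain 2-D cross-correlation with stride 1 and no padding."""
    kh, kw = w.shape
    oh = x.shape[0] - kh + 1
    ow = x.shape[1] - kw + 1
    out = np.zeros((oh, ow), dtype=x.dtype)
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out

def deconv2d_scatter(x, w, stride):
    """Reference transposed convolution: every input pixel scatters a
    scaled copy of the kernel into the output, and overlaps accumulate."""
    kh, kw = w.shape
    oh = (x.shape[0] - 1) * stride + kh
    ow = (x.shape[1] - 1) * stride + kw
    out = np.zeros((oh, ow), dtype=x.dtype)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i * stride:i * stride + kh,
                j * stride:j * stride + kw] += x[i, j] * w
    return out

def deconv2d_via_conv(x, w, stride):
    """Same result computed with conv2d_valid (illustrative assumption,
    not the paper's method): insert zeros between input pixels, pad by
    kernel size minus one, and convolve with the flipped kernel."""
    kh, kw = w.shape
    h, wd = x.shape
    up = np.zeros(((h - 1) * stride + 1, (wd - 1) * stride + 1), dtype=x.dtype)
    up[::stride, ::stride] = x  # zero insertion
    padded = np.pad(up, ((kh - 1, kh - 1), (kw - 1, kw - 1)))
    return conv2d_valid(padded, w[::-1, ::-1])

x = np.arange(1, 7).reshape(2, 3)    # 2x3 input feature map
w = np.arange(1, 10).reshape(3, 3)   # 3x3 kernel
a = deconv2d_scatter(x, w, stride=2)
b = deconv2d_via_conv(x, w, stride=2)
```

Here `a` and `b` are identical arrays, so the same multiply-accumulate routine covers both layer types. The zero-insertion form spends many multiplications on zeros, which is one reason a hardware design may prefer a different schedule for deconvolution.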
Taxonomy
Topics: Advanced Neural Network Applications · CCD and CMOS Imaging Sensors · Advanced Image Processing Techniques
Methods: Convolution
