Fast 2D Convolutions and Cross-Correlations Using Scalable Architectures
Cesar Carranza, Daniel Llamocca, and Marios Pattichis

TL;DR
This paper introduces scalable architectures and algorithms for fast 2D convolutions and cross-correlations by transforming them into 1D operations using DPRT and SVD-LU, optimized for FPGA and Zynq-SOC devices.
Contribution
It presents a novel approach to accelerate 2D convolutions and cross-correlations through transform-based architectures suitable for modern programmable hardware.
Findings
Achieves computation in O(P) to O(P^2) clock cycles depending on resources
Significantly outperforms existing methods on FPGA and Zynq-SOC implementations
Provides scalable architectures adaptable to different resource constraints
Abstract
The manuscript describes fast and scalable architectures and associated algorithms for computing convolutions and cross-correlations. The basic idea is to map 2D convolutions and cross-correlations to a collection of 1D convolutions and cross-correlations in the transform domain. This is accomplished through the use of the Discrete Periodic Radon Transform (DPRT) for general kernels and the use of SVD-LU decompositions for low-rank kernels. The approach uses scalable architectures that can be fitted into modern FPGA and Zynq-SOC devices. Depending on the types of resources available, for P × P blocks, 2D convolutions and cross-correlations can be computed in as few as O(P) clock cycles and up to O(P^2) clock cycles. Thus, there is a trade-off between performance and the required numbers and types of resources. We provide implementations of the proposed architectures using modern…
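The two routes named in the abstract can be illustrated in software. The NumPy sketch below is only a numerical illustration under stated assumptions, not the authors' hardware architecture: all function names are hypothetical, the DPRT route assumes square integer inputs of prime size N and computes a circular convolution, and the low-rank route uses a plain SVD rather than the paper's SVD-LU decomposition.

```python
import numpy as np

# --- Route 1 (general kernels): DPRT, assuming N x N integer inputs, N prime ---

def dprt(f):
    """Forward DPRT: N + 1 projections, each of length N."""
    n = f.shape[0]
    i = np.arange(n)
    r = np.zeros((n + 1, n), dtype=np.int64)
    for m in range(n):
        for d in range(n):
            r[m, d] = f[i, (d + m * i) % n].sum()
    r[n] = f.sum(axis=1)  # last projection: row sums
    return r

def idprt(r):
    """Inverse DPRT; the division by N is exact for integer data."""
    n = r.shape[1]
    m = np.arange(n)
    total = r[0].sum()  # every projection sums to the image total
    f = np.zeros((n, n), dtype=np.int64)
    for i in range(n):
        for j in range(n):
            f[i, j] = r[m, (j - m * i) % n].sum() - total + r[n, i]
    return f // n

def circ_conv1d(a, b):
    """1D circular convolution of two length-N integer vectors."""
    n = len(a)
    k = np.arange(n)
    return np.array([np.dot(a, b[(d - k) % n]) for d in range(n)])

def circ_conv2d_dprt(f, g):
    """2D circular convolution as N + 1 independent 1D circular convolutions."""
    rf, rg = dprt(f), dprt(g)
    rh = np.stack([circ_conv1d(rf[m], rg[m]) for m in range(rf.shape[0])])
    return idprt(rh)

def circ_conv2d_direct(f, g):
    """Reference: 2D circular convolution straight from the definition."""
    h = np.zeros_like(f, dtype=np.int64)
    for a in range(f.shape[0]):
        for b in range(f.shape[1]):
            h += f[a, b] * np.roll(g, (a, b), axis=(0, 1))
    return h

# --- Route 2 (low-rank kernels): plain SVD (assumption: no LU step here) ---

def conv2d_direct(x, k):
    """Reference: full linear 2D convolution straight from the definition."""
    out = np.zeros((x.shape[0] + k.shape[0] - 1, x.shape[1] + k.shape[1] - 1))
    for a in range(k.shape[0]):
        for b in range(k.shape[1]):
            out[a:a + x.shape[0], b:b + x.shape[1]] += k[a, b] * x
    return out

def conv2d_lowrank(x, k, tol=1e-10):
    """Full linear 2D convolution as a sum of rank-1 (row, column) 1D passes."""
    u, s, vt = np.linalg.svd(k)
    rank = int(np.sum(s > tol * s[0]))
    out = np.zeros((x.shape[0] + k.shape[0] - 1, x.shape[1] + k.shape[1] - 1))
    for r in range(rank):
        rows = np.apply_along_axis(np.convolve, 1, x, vt[r])
        out += s[r] * np.apply_along_axis(np.convolve, 0, rows, u[:, r])
    return out
```

In the DPRT route, each of the N + 1 projections is convolved independently, which is the kind of parallelism a hardware design could exploit. A linear (non-circular) result can be obtained from the same routine by zero-padding both inputs to a prime size at least as large as the full output. In the low-rank route, the number of 1D passes grows with the kernel rank, so the approach pays off mainly when the rank is small.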
