An exact, cache-localized algorithm for the sub-quadratic convolution of hypercubes
Oliver Serang

TL;DR
This paper introduces an exact, cache-localized algorithm for hypercube convolution that runs in sub-quadratic time and outperforms FFT-based implementations (FFTPACK via numpy, FFTW via pyfftw) on hypercube convolution.
Contribution
The paper presents a novel sub-quadratic algorithm for exact hypercube convolution, improving efficiency over existing FFT-based approaches in high-dimensional settings.
Findings
Outperforms FFTPACK and FFTW in hypercube convolution tasks
Enables sub-quadratic algorithms for vector convolution variants
Efficiently handles high-dimensional hypercube data
Abstract
Multidimensional convolution can be performed naively in quadratic time and can often be performed more efficiently via the Fourier transform; however, when the dimensionality is large, these algorithms become more challenging. A method is proposed for performing exact hypercube convolution in sub-quadratic time. The method outperforms FFTPACK (called via numpy) and FFTW (called via pyfftw) for hypercube convolution. Embeddings in hypercubes can be paired with the sub-quadratic hypercube convolution method to construct sub-quadratic algorithms for variants of vector convolution.
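To make the two baselines named in the abstract concrete, the following is a minimal sketch of the naive quadratic approach and the Fourier-transform approach. It is not the paper's sub-quadratic method. It assumes a "hypercube" means a d-dimensional array with two entries per axis, and the function names `naive_hypercube_convolve` and `fft_hypercube_convolve` are hypothetical, introduced only for illustration.

```python
import itertools

import numpy as np


def naive_hypercube_convolve(x, y):
    """Direct convolution of two arrays of shape (2,)*d (assumed layout).

    Every pair of input cells is visited, so with n = 2**d cells per input
    this performs n**2 = 4**d multiplications.
    """
    d = x.ndim
    out = np.zeros((3,) * d, dtype=np.result_type(x, y))
    for i in itertools.product(range(2), repeat=d):
        for j in itertools.product(range(2), repeat=d):
            # Output index is the elementwise sum of the two input indices.
            k = tuple(a + b for a, b in zip(i, j))
            out[k] += x[i] * y[j]
    return out


def fft_hypercube_convolve(x, y):
    """Baseline using numpy's FFT: zero-pad to (3,)*d, multiply spectra."""
    d = x.ndim
    shape = (3,) * d
    axes = tuple(range(d))
    fx = np.fft.rfftn(x, s=shape, axes=axes)
    fy = np.fft.rfftn(y, s=shape, axes=axes)
    # The result is floating point, so it is numerically approximate.
    return np.fft.irfftn(fx * fy, s=shape, axes=axes)


# Usage: compare both baselines on a 4-dimensional hypercube.
rng = np.random.default_rng(0)
x = rng.integers(0, 5, size=(2,) * 4).astype(float)
y = rng.integers(0, 5, size=(2,) * 4).astype(float)
direct = naive_hypercube_convolve(x, y)
via_fft = fft_hypercube_convolve(x, y)
print(np.allclose(direct, via_fft))
```

The direct version is exact for integer inputs but quadratic in the number of cells, while the FFT version returns floating-point values that agree with it only up to rounding error.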
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Embedded Systems Design Techniques · Interconnection Networks and Systems
