RubiConv -- Efficient Boundary-Respecting Convolutions
Linda Friso, Annie Marsden, Xinyi Chen, Arushi Gupta, Peter Bartlett, Mark Braverman, Elad Hazan

TL;DR
RubiConv is a new algorithm for efficient, boundary-respecting convolutions on packed sequences, significantly improving the practical performance of convolutional models on large-scale data.
Contribution
It presents RubiConv, a novel algorithm that overcomes the incompatibility of standard FFT convolutions with document packing, making boundary-respecting convolutions practical on large, packed sequence data.
Findings
RubiConv achieves substantial speedups over attention and FFT baselines.
The method enables practical long convolutional models on large-scale, real-world data.
It closes the gap between theoretical efficiency and practical performance in sequence modeling.
Abstract
Convolutional architectures have emerged as powerful alternatives to Transformers for sequence modeling. Their primary advantage is improved theoretical complexity in sequence length, obtained by leveraging the Fast Fourier Transform (FFT). However, this theoretical improvement does not always translate into practice. One critical obstacle is that standard FFT algorithms are not easily amenable to document packing, the large-scale training practice in which data from different sources is packed into a single sequence for hardware efficiency. Existing workarounds suffer from severe inefficiencies, crippling the practical performance of convolutional architectures. We close this gap with RubiConv, a novel algorithm for performing hardware-efficient, boundary-respecting convolutions on packed sequences. Extensive experiments show that…
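To make the packing obstacle concrete, the following is a minimal NumPy sketch of the problem setting only. It is not the RubiConv algorithm, which the abstract does not describe in detail. The function names (`causal_fft_conv`, `packed_conv_reference`), the document lengths, and the random filter are all hypothetical and chosen for illustration. The sketch assumes a causal long convolution and shows that a single FFT over the packed sequence mixes information across document boundaries, while a simple per-document loop respects them at the cost of many small, hardware-unfriendly FFTs.

```python
import numpy as np

def causal_fft_conv(x, k):
    """Causal convolution y[t] = sum_{s<=t} k[s] * x[t-s], computed via FFT."""
    n = len(x)
    m = 2 * n  # zero-pad so the circular convolution matches the linear one
    y = np.fft.irfft(np.fft.rfft(x, m) * np.fft.rfft(k[:n], m), m)
    return y[:n]

def packed_conv_reference(x, k, doc_lengths):
    """Boundary-respecting baseline (an assumption, not RubiConv):
    convolve each packed document separately so no output depends on
    tokens from a different document."""
    out = np.empty_like(x)
    start = 0
    for length in doc_lengths:
        end = start + length
        out[start:end] = causal_fft_conv(x[start:end], k)
        start = end
    return out

rng = np.random.default_rng(0)
doc_lengths = [3, 5, 4]                     # three documents packed into one sequence
x = rng.standard_normal(sum(doc_lengths))   # packed input signal
k = rng.standard_normal(len(x))             # long convolution filter

naive = causal_fft_conv(x, k)                          # one FFT over the whole pack
respecting = packed_conv_reference(x, k, doc_lengths)  # one FFT per document

# The first document has no predecessor, so both methods agree there.
first_doc_matches = np.allclose(naive[:3], respecting[:3])
# Later documents differ: the naive FFT leaks earlier documents into them.
later_docs_leak = not np.allclose(naive[3:], respecting[3:])
```

In this sketch the per-document loop plays the role of a straightforward workaround: it is correct, but it replaces one large FFT with many small ones of varying size, which is the kind of inefficiency the abstract attributes to existing approaches.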
