RubiConv -- Efficient Boundary-Respecting Convolutions

Linda Friso; Annie Marsden; Xinyi Chen; Arushi Gupta; Peter Bartlett; Mark Braverman; Elad Hazan

arXiv:2605.08451·cs.LG·May 12, 2026

TL;DR

RubiConv is an algorithm for hardware-efficient, boundary-respecting convolutions on packed sequences. It makes long convolutional models practical in large-scale training pipelines, where data from different sources is packed into a single sequence.

Contribution

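The paper presents RubiConv, a novel algorithm that addresses a practical limitation of standard FFT-based convolutions: they are not easily amenable to document packing. RubiConv computes convolutions that respect the boundaries between packed documents while remaining hardware-efficient.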

Findings

01. RubiConv achieves substantial speedups over attention and FFT baselines.

02. The method enables practical long convolutional models on large-scale, real-world data.

03. It closes the gap between theoretical efficiency and practical performance in sequence modeling.

Abstract

Convolutional architectures have emerged as powerful alternatives to Transformers for sequence modeling. Their primary advantage is improved theoretical complexity in sequence length, obtained by leveraging the Fast Fourier Transform (FFT). However, this theoretical improvement does not always translate into practical gains. One critical obstacle is that standard FFT algorithms are not easily amenable to document packing, the large-scale training practice in which data from different sources is packed into a single sequence for hardware efficiency. Existing workarounds suffer from severe inefficiencies, crippling the practical performance of convolutional architectures. We close this gap with RubiConv, a novel algorithm for performing hardware-efficient, boundary-respecting convolutions on packed sequences. Extensive experiments show that…
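To make the packing obstacle concrete, the following minimal sketch contrasts a causal convolution applied to a whole packed sequence with one that restarts at every document boundary. It is not the RubiConv algorithm, which this listing does not describe. The function names and the `doc_lengths` argument are hypothetical, and a direct quadratic-time sum stands in for the FFT to keep the example dependency-free (a single zero-padded FFT over the packed sequence would produce the same whole-sequence result). The per-document loop is only a reference for the intended output, and is presumably the kind of workaround the abstract calls inefficient.

```python
from typing import List


def causal_conv(x: List[float], k: List[float]) -> List[float]:
    """Direct causal convolution: y[t] = sum_j k[j] * x[t - j]."""
    return [
        sum(k[j] * x[t - j] for j in range(min(t + 1, len(k))))
        for t in range(len(x))
    ]


def packed_conv_naive(x: List[float], k: List[float]) -> List[float]:
    """Convolve the packed sequence as if it were one document.

    Outputs near the start of a document also see the tail of the
    previous document, so information leaks across boundaries.
    """
    return causal_conv(x, k)


def packed_conv_boundary_respecting(
    x: List[float], k: List[float], doc_lengths: List[int]
) -> List[float]:
    """Reference semantics: convolve each document independently.

    doc_lengths (a hypothetical argument) lists the length of each
    document in packing order; the lengths must sum to len(x).
    """
    assert sum(doc_lengths) == len(x)
    out: List[float] = []
    start = 0
    for n in doc_lengths:
        out.extend(causal_conv(x[start:start + n], k))
        start += n
    return out


# Two documents, [1, 2, 3] and [10, 20], packed into one sequence.
packed = [1.0, 2.0, 3.0, 10.0, 20.0]
doc_lengths = [3, 2]
kernel = [1.0, 1.0, 1.0]

naive = packed_conv_naive(packed, kernel)
exact = packed_conv_boundary_respecting(packed, kernel, doc_lengths)

print(naive)  # [1.0, 3.0, 6.0, 15.0, 33.0]
print(exact)  # [1.0, 3.0, 6.0, 10.0, 30.0]
```

The two outputs agree on the first document and differ on the second: the naive result at the first position of the second document is 15.0 rather than 10.0, because the kernel reaches back into the first document.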
