Minimal Convolutional RNNs Accelerate Spatiotemporal Learning
Coşku Can Horuz, Sebastian Otte, Martin V. Butz, Matthias Karlbauer

TL;DR
This paper introduces minimal convolutional RNN architectures that enable fully parallel training and achieve superior speed and accuracy in spatiotemporal forecasting tasks, combining efficiency with effective spatial modeling.
Contribution
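The paper presents MinConvLSTM and MinConvGRU, convolutional RNNs that extend the log-domain prefix-sum formulation of MinLSTM and MinGRU to enable fully parallel training. MinConvLSTM additionally incorporates an exponential gating mechanism inspired by xLSTM, which further simplifies the log-domain computation.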
Findings
Significantly faster training compared to standard ConvRNNs.
Lower prediction errors on Navier-Stokes and geopotential data.
Reduced parameter count and enhanced scalability.
Abstract
We introduce MinConvLSTM and MinConvGRU, two novel spatiotemporal models that combine the spatial inductive biases of convolutional recurrent networks with the training efficiency of minimal, parallelizable RNNs. Our approach extends the log-domain prefix-sum formulation of MinLSTM and MinGRU to convolutional architectures, enabling fully parallel training while retaining localized spatial modeling. This eliminates the need for sequential hidden state updates during teacher forcing, a major bottleneck in conventional ConvRNN models. In addition, we incorporate an exponential gating mechanism inspired by the xLSTM architecture into the MinConvLSTM, which further simplifies the log-domain computation. Our models are structurally minimal and computationally efficient, with reduced parameter count and improved scalability. We evaluate our models on two spatiotemporal forecasting tasks:…
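The log-domain prefix-sum idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear recurrence of the form h_t = a_t * h_{t-1} + b_t in which a_t and b_t are strictly positive and depend only on the input, so that every spatial position and channel can be handled independently. In a convolutional model, a_t and b_t would presumably be produced by convolutions over the input frame; here they are plain lists of numbers for a single position. The function names `sequential_scan`, `logaddexp`, and `log_domain_scan` are hypothetical.

```python
import math
from itertools import accumulate


def sequential_scan(a, b, h0):
    """Reference: evaluate h_t = a_t * h_{t-1} + b_t one step at a time."""
    h, out = h0, []
    for a_t, b_t in zip(a, b):
        h = a_t * h + b_t
        out.append(h)
    return out


def logaddexp(x, y):
    """Numerically stable log(exp(x) + exp(y))."""
    m = max(x, y)
    return m + math.log(math.exp(x - m) + math.exp(y - m))


def log_domain_scan(a, b, h0):
    """Same recurrence expressed as two prefix operations in log space.

    Assumes a_t > 0, b_t > 0 and h0 > 0 (an assumption of this sketch).
    Uses h_t = A_t * (h0 + sum_{k<=t} b_k / A_k) with A_t = prod_{i<=t} a_i.
    """
    # Prefix sum of log a_t gives log A_t.
    log_A = list(accumulate(math.log(a_t) for a_t in a))
    # Terms inside the sum, in log space, with log h0 as the first entry.
    terms = [math.log(h0)] + [math.log(b_t) - la for b_t, la in zip(b, log_A)]
    # Running log-sum-exp over the terms (a second prefix operation).
    running = list(accumulate(terms, logaddexp))
    # Skip the first entry, which covers h0 alone, and map back from log space.
    return [math.exp(la + s) for la, s in zip(log_A, running[1:])]


a = [0.9, 0.5, 0.7, 0.2]   # hypothetical retention factors
b = [0.1, 0.5, 0.3, 0.8]   # hypothetical input contributions
seq = sequential_scan(a, b, 0.5)
par = log_domain_scan(a, b, 0.5)
assert all(math.isclose(s, p, rel_tol=1e-9) for s, p in zip(seq, par))
```

Both prefix operations are associative, which is what allows a parallel scan to replace the step-by-step loop on suitable hardware. In this pure-Python sketch `accumulate` still runs sequentially, so it only demonstrates that the two formulations agree.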
