TL;DR
WaveSFNet introduces a wavelet-based codec combined with a dual-domain gated network to improve long-range spatiotemporal prediction accuracy while maintaining low computational cost.
Contribution
It proposes a novel framework unifying wavelet-based encoding with dual-domain gating for enhanced spatiotemporal modeling.
Findings
Achieves competitive accuracy on Moving MNIST, TaxiBJ, and WeatherBench datasets.
Maintains low computational complexity compared to existing methods.
Effectively preserves high-frequency details during prediction.
Abstract
Spatiotemporal predictive learning aims to forecast future frames from historical observations in an unsupervised manner, and is critical to a wide range of applications. The key challenge is to model long-range dynamics while preserving high-frequency details for sharp multi-step predictions. Existing efficient recurrent-free frameworks typically rely on strided convolutions or pooling for sampling, which tends to discard textures and boundaries, while purely spatial operators often struggle to balance local interactions with global propagation. To address these issues, we propose WaveSFNet, an efficient framework that unifies a wavelet-based codec with a spatial-frequency dual-domain gated spatiotemporal translator. The wavelet-based codec preserves high-frequency subband cues during downsampling and reconstruction. Meanwhile, the translator first injects adjacent-frame differences…
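To illustrate why a wavelet transform can halve spatial resolution without discarding detail, the sketch below applies a one-level 2D Haar transform to a single frame and then inverts it. This is not the WaveSFNet codec: the abstract does not say which wavelet is used or how the subbands feed the network, so the Haar basis and the function names `haar_dwt2` and `haar_idwt2` are assumptions made for illustration only.

```python
import numpy as np


def haar_dwt2(x):
    """One-level 2D Haar transform of an (H, W) array with even H and W.

    Returns four (H/2, W/2) subbands: a low-pass average plus three
    detail subbands. Illustrative helper, not taken from the paper.
    """
    a = x[0::2, 0::2]  # top-left sample of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # scaled block average (low frequency)
    lh = (a - b + c - d) / 2.0  # differences across columns
    hl = (a + b - c - d) / 2.0  # differences across rows
    hh = (a - b - c + d) / 2.0  # diagonal differences
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Invert haar_dwt2, rebuilding the full-resolution array."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=ll.dtype)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.random((8, 8))  # stand-in for one video frame

    subbands = haar_dwt2(frame)
    restored = haar_idwt2(*subbands)

    # The four half-resolution subbands together hold as many values
    # as the input, so the transform is exactly invertible.
    print("subband shape:", subbands[0].shape)
    print("exact reconstruction:", np.allclose(restored, frame))

    # Plain strided sampling keeps only one value per 2x2 block.
    strided = frame[::2, ::2]
    print("values kept by stride-2 sampling:", strided.size, "of", frame.size)
```

The point of the comparison is that stride-2 sampling keeps a quarter of the values, whereas the four subbands keep all of them at half resolution, which is the property the abstract attributes to the wavelet-based codec.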
