TL;DR
PFGNet is a convolutional framework that adaptively modulates receptive fields using frequency-guided gating, achieving efficient and effective spatiotemporal prediction without recurrence or attention.
Contribution
It introduces a novel frequency-guided gating mechanism with separable large kernels for adaptive receptive fields in convolutional spatiotemporal models.
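To give a feel for why separable large kernels keep the parameter count low, here is a small arithmetic sketch. It assumes a depthwise k×k kernel is factorized into a 1×k pass followed by a k×1 pass, which is a common factorization but is not confirmed by the summary above. The kernel size (31) and channel count (64) are hypothetical values chosen only for illustration.

```python
def depthwise_weights(k: int, channels: int, separable: bool) -> int:
    """Weight count of a depthwise large-kernel layer (biases ignored).

    Assumption: the separable variant uses a 1 x k pass followed by a
    k x 1 pass, so each channel needs 2k weights instead of k * k.
    """
    per_channel = 2 * k if separable else k * k
    return per_channel * channels


# Hypothetical sizes, not taken from the paper.
K, CHANNELS = 31, 64
full = depthwise_weights(K, CHANNELS, separable=False)      # 61504
factored = depthwise_weights(K, CHANNELS, separable=True)   # 3968
ratio = full / factored                                     # 15.5
```

Under these assumptions the saving grows linearly with the kernel size (k / 2), which is why factorization matters most for the largest kernels.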
Findings
Achieves state-of-the-art or near state-of-the-art performance on multiple datasets.
Uses fewer parameters and FLOPs than existing methods.
Effectively models structure-aware spatiotemporal data without recurrence or attention.
Abstract
Spatiotemporal predictive learning (STPL) aims to forecast future frames from past observations and is essential across a wide range of applications. Compared with recurrent or hybrid architectures, pure convolutional models offer superior efficiency and full parallelism, yet their fixed receptive fields limit their ability to adaptively capture spatially varying motion patterns. Inspired by biological center-surround organization and frequency-selective signal processing, we propose PFGNet, a fully convolutional framework that dynamically modulates receptive fields through pixel-wise frequency-guided gating. The core Peripheral Frequency Gating (PFG) block extracts localized spectral cues and adaptively fuses multi-scale large-kernel peripheral responses with learnable center suppression, effectively forming spatially adaptive band-pass filters. To maintain efficiency, all large…
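The gating idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it works on a single-channel 2D frame, uses fixed box kernels instead of learned ones, takes local residual energy as a stand-in for the "localized spectral cues", and blends only two surround scales. The names `sep_conv`, `box`, and `pfg_sketch`, and the parameters `scales`, `center_weight`, and `gate_gain`, are all hypothetical.

```python
import numpy as np


def box(k: int) -> np.ndarray:
    """Normalized 1-D box kernel (stand-in for a learned 1-D kernel)."""
    return np.full(k, 1.0 / k)


def sep_conv(x: np.ndarray, k_row: np.ndarray, k_col: np.ndarray) -> np.ndarray:
    """Separable 2-D filter: a 1-D pass along width, then along height.

    Edge padding keeps the output the same shape as the input.
    """
    pw, ph = len(k_row) // 2, len(k_col) // 2
    xp = np.pad(x, ((0, 0), (pw, pw)), mode="edge")
    rows = np.stack([np.convolve(r, k_row, mode="valid") for r in xp])
    xp = np.pad(rows, ((ph, ph), (0, 0)), mode="edge")
    return np.stack([np.convolve(c, k_col, mode="valid") for c in xp.T]).T


def pfg_sketch(x, scales=(5, 9), center_weight=1.0, gate_gain=4.0):
    """Illustrative center-surround filter with a pixel-wise gate.

    Assumptions (not from the paper): two surround scales, a 3x3 center,
    and a sigmoid gate driven by local high-frequency energy.
    """
    # Frequency cue: energy of the residual after removing the local mean.
    center = sep_conv(x, box(3), box(3))
    hf_energy = sep_conv((x - center) ** 2, box(5), box(5))

    # Gate in [0, 1]: more high-frequency energy favors the smaller surround.
    gate = 1.0 / (1.0 + np.exp(-gate_gain * (hf_energy - hf_energy.mean())))

    # Multi-scale peripheral (surround) responses from separable large kernels.
    small = sep_conv(x, box(scales[0]), box(scales[0]))
    large = sep_conv(x, box(scales[1]), box(scales[1]))
    surround = gate * small + (1.0 - gate) * large

    # Center suppression; center_weight would be learnable in a real model.
    return surround - center_weight * center, gate


rng = np.random.default_rng(0)
frame = rng.random((16, 16))
response, gate = pfg_sketch(frame)
flat_response, _ = pfg_sketch(np.full((16, 16), 3.0))
```

With `center_weight=1.0` the surround and center kernels both sum to one, so a constant frame yields a zero response. In this sketch that is the sense in which the center-surround difference behaves like a band-pass filter, with the gate choosing the effective surround size per pixel.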
