TL;DR
PENGUIN introduces a novel periodic-nested group attention mechanism to improve long-term time series forecasting with Transformers by explicitly modeling periodic structures.
Contribution
The paper proposes a new attention mechanism that captures multiple coexisting periodicities and improves Transformer performance on long-term time series forecasting.
Findings
PENGUIN consistently outperforms both MLP-based and Transformer-based models across diverse benchmarks.
The periodic-aware relative attention bias directly captures periodic structures in the data.
Grouped multi-query attention handles multiple coexisting periodicities (e.g., daily and weekly cycles).
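To make the idea of a periodic-aware relative bias concrete, here is a minimal NumPy sketch. It assumes the bias added to the attention logits depends only on the offset between query and key positions taken modulo a period. The function name `periodic_relative_bias` and the lookup-table parameterization are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def periodic_relative_bias(seq_len, period, table):
    """Build a (seq_len, seq_len) bias whose entry (i, j) depends only on (i - j) mod period.

    `table` is a hypothetical array of length `period`; in a trained model
    its values would be learned. This is an assumption for illustration.
    """
    idx = np.arange(seq_len)
    offsets = (idx[:, None] - idx[None, :]) % period  # values in [0, period)
    return table[offsets]

# Toy table that favours positions a whole number of periods apart.
table = np.array([1.0, 0.0, 0.0, 0.0])
bias = periodic_relative_bias(seq_len=8, period=4, table=table)
```

In this toy setup, positions 0 and 4 receive the same bias as positions 1 and 5, because both pairs are exactly one period apart.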
Abstract
Despite advances in Transformer architectures, their effectiveness for long-term time series forecasting (LTSF) remains controversial. In this paper, we investigate the potential of integrating explicit periodicity modeling into the self-attention mechanism to enhance the performance of Transformer-based architectures for LTSF. Specifically, we propose PENGUIN, a simple yet effective periodic-nested group attention mechanism. Our approach introduces a periodic-aware relative attention bias to directly capture periodic structures and a grouped multi-query attention mechanism to handle multiple coexisting periodicities (e.g., daily and weekly cycles) within time series data. Extensive experiments across diverse benchmarks demonstrate that PENGUIN consistently outperforms both MLP-based and Transformer-based models. Code is available at https://github.com/ysygMhdxw/AISTATS2026_PENGUIN.
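The combination described in the abstract can be sketched roughly as follows. This is a hedged illustration, not the authors' implementation: it assumes query heads are split evenly into groups, each group shares one key/value projection (as in grouped-query attention), and each group adds its own period-specific bias to the attention logits. The name `grouped_periodic_attention`, the lookup-table bias, and the toy periods are all assumptions. No causal mask is applied. The linked repository is the authoritative reference.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def grouped_periodic_attention(q, k, v, periods, tables):
    """Grouped attention with one periodic bias per group (illustrative only).

    q:       (n_heads, L, d) query heads
    k, v:    (n_groups, L, d) keys/values shared by the heads of each group
    periods: one integer period per group (hypothetical hyperparameters)
    tables:  one array of length periods[g] per group (hypothetical bias values)
    """
    n_heads, L, d = q.shape
    n_groups = k.shape[0]
    assert n_heads % n_groups == 0, "heads must split evenly across groups"
    heads_per_group = n_heads // n_groups
    idx = np.arange(L)
    out = np.empty_like(q)
    for h in range(n_heads):
        g = h // heads_per_group  # group that this head belongs to
        offsets = (idx[:, None] - idx[None, :]) % periods[g]
        bias = tables[g][offsets]  # (L, L), periodic in the relative offset
        scores = q[h] @ k[g].T / np.sqrt(d) + bias
        out[h] = softmax(scores) @ v[g]
    return out

# Toy example: 4 heads in 2 groups, with assumed periods 6 and 12.
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 48, 8))
k = rng.normal(size=(2, 48, 8))
v = np.ones((2, 48, 8))
periods = [6, 12]
tables = [rng.normal(size=p) for p in periods]
out = grouped_periodic_attention(q, k, v, periods, tables)
```

Because each attention row sums to one, setting every value vector to ones yields an output of all ones, which gives a quick sanity check on the softmax and grouping logic.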
