TwinFormer: A Dual-Level Transformer for Long-Sequence Time-Series Forecasting
Mahima Kumavat, Aditya Maheshwari

TL;DR
TwinFormer introduces a hierarchical Transformer architecture with local and global attention mechanisms for efficient long-sequence time-series forecasting, achieving state-of-the-art results across diverse datasets.
Contribution
The paper proposes a novel dual-level Transformer architecture that combines local sparse attention and global inter-patch modeling for improved long-term forecasting.
Findings
Outperforms existing models on 8 real-world datasets
Achieves linear time and memory complexity
Demonstrates superior accuracy in MAE and RMSE metrics
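For reference, the two error metrics named in the findings have standard definitions. The sketch below writes them in plain Python; the function names are illustrative and not taken from the paper's code.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: the average of |y - y_hat|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: the square root of the average of (y - y_hat)^2."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))
```

RMSE penalizes large individual errors more heavily than MAE, which is why the two are usually reported together.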
Abstract
TwinFormer is a hierarchical Transformer for long-sequence time-series forecasting. It divides the input into non-overlapping temporal patches and processes them in two stages: (1) a Local Informer with top-k Sparse Attention models intra-patch dynamics, followed by mean pooling; (2) a Global Informer captures long-range inter-patch dependencies using the same top-k attention. A lightweight GRU aggregates the globally contextualized patch tokens for direct multi-horizon prediction. The resulting architecture achieves linear time and memory complexity. On eight real-world benchmarking datasets from six different domains, including weather, stock price, temperature, power consumption, electricity, and disease, and forecasting horizons , TwinFormer secures positions in the top two out of . Out of the , it achieves the best performance on MAE and RMSE at…
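The two-stage flow described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration under several assumptions, not the authors' implementation: "sparse attention" is interpreted here as keeping the k largest scores per query, the attention uses no learned projections, trailing samples that do not fill a patch are dropped, and the GRU aggregation step is omitted. All function names are hypothetical. The sketch also computes every pairwise score before masking, so it does not reproduce the linear complexity the paper reports.

```python
import numpy as np

def patchify(x, patch_len):
    """Split a (T, d) series into non-overlapping patches of shape (P, patch_len, d)."""
    T, d = x.shape
    n_patches = T // patch_len  # assumption: drop any trailing remainder
    return x[: n_patches * patch_len].reshape(n_patches, patch_len, d)

def topk_sparse_attention(x, k):
    """Self-attention over the rows of x (n, d), keeping the k largest scores per query.

    Assumes k >= 1. Exact ties at the threshold may keep more than k entries.
    """
    n, d = x.shape
    k = min(k, n)
    scores = x @ x.T / np.sqrt(d)  # (n, n); no learned Q/K/V projections here
    # The k-th largest score in each row serves as that row's threshold.
    kth = np.partition(scores, n - k, axis=-1)[:, n - k][:, None]
    masked = np.where(scores >= kth, scores, -np.inf)
    masked = masked - masked.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(masked)  # masked-out entries become exactly 0
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

def twin_level_encode(x, patch_len, k):
    """Local attention inside each patch, mean pooling, then attention across patches."""
    patches = patchify(x, patch_len)                                   # (P, L, d)
    local = np.stack([topk_sparse_attention(p, k) for p in patches])   # (P, L, d)
    tokens = local.mean(axis=1)                                        # (P, d)
    return topk_sparse_attention(tokens, k)                            # (P, d)

# Usage: 96 time steps, 4 variables, patches of length 12, k = 3.
x = np.random.default_rng(0).normal(size=(96, 4))
out = twin_level_encode(x, patch_len=12, k=3)
print(out.shape)  # (8, 4)
```

In the architecture described above, the resulting patch tokens would then be passed to a GRU to produce the multi-horizon forecast.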
Taxonomy
Topics: Traffic Prediction and Management Techniques · Time Series Analysis and Forecasting · Stock Market Forecasting Methods
