TL;DR
Gateformer is a Transformer-based forecaster that models both temporal (cross-time) and variate (cross-variate) dependencies in multivariate time series. It uses gating mechanisms to improve accuracy and efficiency, and reports state-of-the-art results on 13 datasets.
Contribution
It re-purposes the Transformer architecture with variate-wise embeddings and gating operations to effectively capture cross-time and cross-variate dependencies in multivariate time series forecasting.
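To make the idea of a gating operation concrete, here is a minimal sketch of a sigmoid gate that blends two representations of the same variate elementwise. It is illustrative only: the function name `gated_fusion`, the per-element weights `w_a`, `w_b`, and `bias`, and the exact form of the gate are assumptions, not the paper's formulation.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(a, b, w_a, w_b, bias):
    """Blend two equal-length vectors with an elementwise sigmoid gate.

    Hypothetical gate (an assumption for illustration):
        g_i   = sigmoid(w_a[i] * a[i] + w_b[i] * b[i] + bias[i])
        out_i = g_i * a[i] + (1 - g_i) * b[i]
    """
    out = []
    for a_i, b_i, wa_i, wb_i, c_i in zip(a, b, w_a, w_b, bias):
        g = sigmoid(wa_i * a_i + wb_i * b_i + c_i)
        # g near 1 keeps a_i, g near 0 keeps b_i
        out.append(g * a_i + (1.0 - g) * b_i)
    return out


# With all-zero gate parameters the gate is 0.5, so the output is the plain average.
fused = gated_fusion([1.0, 2.0], [3.0, 4.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0])
```

In a trained model the gate parameters would be learned, letting the network decide per element how much of each representation to pass through.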
Findings
Achieves state-of-the-art performance on 13 datasets.
Improves forecasting accuracy by up to 20.7%.
Seamlessly integrates with existing Transformer and LLM-based forecasters.
Abstract
There has been a recent surge of interest in time series modeling using the Transformer architecture. However, forecasting multivariate time series with Transformer presents a unique challenge as it requires modeling both temporal (cross-time) and variate (cross-variate) dependencies. While Transformer-based models have gained popularity for their flexibility in capturing both sequential and cross-variate relationships, it is unclear how to best integrate these two sources of information in the context of the Transformer architecture while optimizing for both performance and efficiency. We re-purpose the Transformer architecture to effectively model both cross-time and cross-variate dependencies. Our approach begins by embedding each variate independently into a variate-wise representation that captures its cross-time dynamics, and then models cross-variate dependencies through…
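The variate-wise embedding step described in the abstract can be sketched as follows. This is a simplified stand-in, not the paper's encoder: it assumes each variate's full lookback window is mapped to a fixed-size vector by one shared linear projection, and the names `embed_variates` and `weights` are hypothetical.

```python
def embed_variates(series, weights):
    """Embed each variate independently from its own history.

    series:  list of N variates, each a list of T past values.
    weights: d x T projection matrix shared by all variates
             (an assumed stand-in for a learned temporal encoder).
    Returns a list of N embeddings, each of length d.
    """
    return [
        [sum(w * x for w, x in zip(row, variate)) for row in weights]
        for variate in series
    ]


# Two variates with three time steps each, projected to d = 2.
series = [[1.0, 2.0, 3.0], [0.0, 0.0, 1.0]]
weights = [[1.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
tokens = embed_variates(series, weights)
```

Each resulting vector acts as one token per variate, so a later cross-variate stage sees N tokens regardless of how long the lookback window is.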
Taxonomy
Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Dense Connections · Dropout · Layer Normalization · Focus · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Softmax
