First De-Trend then Attend: Rethinking Attention for Time-Series Forecasting
Xiyuan Zhang, Xiaoyong Jin, Karthick Gopalswamy, Gaurav Gupta, Youngsuk Park, Xingjian Shi, Hao Wang, Danielle C. Maddix, Yuyang Wang

TL;DR
This paper investigates how attention mechanisms in the time and frequency domains (e.g., Fourier, wavelet) relate to one another for time-series forecasting, shows that they are equivalent under linear conditions (a linear kernel on attention scores), and proposes TDformer, a model that combines trend decomposition with domain-specific attention for improved accuracy.
Contribution
It provides a theoretical and empirical analysis of attention models across domains and introduces TDformer, a novel approach that enhances forecasting by integrating trend decomposition with domain-specific attention.
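To make "trend decomposition" concrete, here is a minimal sketch that splits a series into a trend and a de-trended remainder using a centered moving average. The moving-average choice, the function name `decompose`, and the `window` parameter are assumptions made for illustration; this summary does not specify how TDformer performs its decomposition.

```python
import numpy as np

def decompose(series, window=25):
    """Split a 1-D series into a moving-average trend and the remainder.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    # Pad with edge values so the trend has the same length as the input.
    pad_left = (window - 1) // 2
    pad_right = window - 1 - pad_left
    padded = np.pad(series, (pad_left, pad_right), mode="edge")
    kernel = np.ones(window) / window
    trend = np.convolve(padded, kernel, mode="valid")
    return trend, series - trend

# Synthetic series: linear trend plus a sine wave with period 20.
t = np.arange(200)
series = 0.05 * t + np.sin(2 * np.pi * t / 20)

# A window equal to the seasonal period averages the sine wave out.
trend, detrended = decompose(series, window=20)
```

Away from the boundaries, `trend` tracks the linear component and `detrended` carries the seasonal pattern, which is the part an attention model would then operate on under this sketch.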
Findings
TDformer outperforms existing models on benchmark datasets.
Attention models in the time and frequency domains are equivalent under linear conditions (i.e., a linear kernel on attention scores).
Decomposition-based attention improves long-term forecasting accuracy.
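The equivalence finding above can be checked numerically in a simplified setting. The sketch below assumes attention without softmax (scores are plain dot products), real-valued random inputs, and a unitary discrete Fourier transform along the sequence axis; the shapes and variable names are illustrative and not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
L, d = 32, 4  # sequence length and feature dimension (arbitrary choices)
Q, K, V = (rng.standard_normal((L, d)) for _ in range(3))

# Time-domain linear attention: no softmax on the score matrix.
out_time = (Q @ K.T) @ V

# Move queries, keys, and values to the Fourier domain along the sequence axis.
# norm="ortho" makes the transform unitary.
Qf, Kf, Vf = (np.fft.fft(X, axis=0, norm="ortho") for X in (Q, K, V))

# Fourier-domain linear attention, using the conjugate transpose of the keys.
out_freq = (Qf @ Kf.conj().T) @ Vf

# Transform the result back to the time domain and compare.
out_back = np.fft.ifft(out_freq, axis=0, norm="ortho")
max_gap = np.max(np.abs(out_back - out_time))
```

Because the unitary transform cancels inside the product, `max_gap` sits at floating-point rounding level. Inserting a softmax between the score matrix and `V` removes this cancellation, which is consistent with the abstract's emphasis on the role of the softmax operation.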
Abstract
Transformer-based models have gained large popularity and demonstrated promising results in long-term time-series forecasting in recent years. In addition to learning attention in time domain, recent works also explore learning attention in frequency domains (e.g., Fourier domain, wavelet domain), given that seasonal patterns can be better captured in these domains. In this work, we seek to understand the relationships between attention models in different time and frequency domains. Theoretically, we show that attention models in different domains are equivalent under linear conditions (i.e., linear kernel to attention scores). Empirically, we analyze how attention models of different domains show different behaviors through various synthetic experiments with seasonality, trend and noise, with emphasis on the role of softmax operation therein. Both these theoretical and empirical…
Taxonomy
Topics: Time Series Analysis and Forecasting · Stock Market Forecasting Methods · Energy Load and Power Forecasting
Methods: Softmax
