Two Steps Forward and One Behind: Rethinking Time Series Forecasting with Deep Learning
Riccardo Ughi, Eugenio Lomurno, Matteo Matteucci

TL;DR
This paper critically evaluates Transformer-based models for time series forecasting and demonstrates their limitations. It introduces simpler models that outperform more complex ones, and stresses the importance of simple baselines and of treating research trends with caution.
Contribution
It demonstrates that simplifying Transformer models improves performance, proposes shallow models without attention for long-term forecasting, and advocates for rigorous baselines and critical assessment of trends.
Findings
Simplified Transformer models outperform complex ones.
Shallow models without attention are competitive with state-of-the-art models in long-term forecasting.
Using simple baselines is essential to verify model effectiveness.
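The point about baselines can be illustrated with a small sketch. It is not the paper's method or any of its models: it assumes a synthetic noisy sine wave as data, and the helper names (`make_windows`, `persistence_forecast`, `fit_linear`, `predict_linear`) are hypothetical. It compares a naive "repeat the last value" baseline with a single linear map fitted by least squares, the kind of shallow, attention-free reference a more complex model would be expected to beat.

```python
import numpy as np

def make_windows(series, lookback, horizon):
    """Slice a 1-D series into (input window, target window) pairs."""
    n = len(series) - lookback - horizon + 1
    X = np.stack([series[i:i + lookback] for i in range(n)])
    Y = np.stack([series[i + lookback:i + lookback + horizon] for i in range(n)])
    return X, Y

def persistence_forecast(X, horizon):
    """Naive baseline: repeat the last observed value over the whole horizon."""
    return np.repeat(X[:, -1:], horizon, axis=1)

def fit_linear(X, Y):
    """Fit one linear map (plus bias) from lookback window to horizon by least squares."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    W, *_ = np.linalg.lstsq(Xb, Y, rcond=None)
    return W

def predict_linear(W, X):
    """Apply the fitted linear map to new input windows."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ W

def mse(pred, target):
    """Mean squared error over all windows and horizon steps."""
    return float(np.mean((pred - target) ** 2))

# Synthetic data (an assumption for illustration): noisy sine with period 24.
rng = np.random.default_rng(0)
t = np.arange(600)
series = np.sin(2 * np.pi * t / 24) + 0.1 * rng.standard_normal(t.size)

# Split the raw series before windowing so test targets never appear in training.
lookback, horizon = 48, 12
X_train, Y_train = make_windows(series[:480], lookback, horizon)
X_test, Y_test = make_windows(series[480:], lookback, horizon)

W = fit_linear(X_train, Y_train)
baseline_mse = mse(persistence_forecast(X_test, horizon), Y_test)
linear_mse = mse(predict_linear(W, X_test), Y_test)

print(f"persistence MSE: {baseline_mse:.4f}")
print(f"linear MSE:      {linear_mse:.4f}")
```

Reporting both numbers side by side makes it clear how much of a model's score is explained by trivial structure in the data. Any deeper model evaluated on the same split would need to improve on both to justify its added complexity.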
Abstract
The Transformer is a highly successful deep learning model that has revolutionised the world of artificial neural networks, first in natural language processing and later in computer vision. This model is based on the attention mechanism and is able to capture complex semantic relationships between a variety of patterns present in the input data. Precisely because of these characteristics, the Transformer has recently been exploited for time series forecasting problems, assuming a natural adaptability to the domain of continuous numerical series. Despite the acclaimed results in the literature, some works have raised doubts about the robustness and effectiveness of this approach. In this paper, we further investigate the effectiveness of Transformer-based models applied to the domain of time series forecasting, demonstrate their limitations, and propose a set of alternative models that…
Taxonomy
Topics: Time Series Analysis and Forecasting · Forecasting Techniques and Applications · Stock Market Forecasting Methods
Methods: Attention Is All You Need · Softmax · Absolute Position Encodings · Linear Layer · Byte Pair Encoding · Layer Normalization · Residual Connection · Dense Connections · Multi-Head Attention · Dropout
