SAMformer: Unlocking the Potential of Transformers in Time Series Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention
Romain Ilbert, Ambroise Odonnat, Vasilii Feofanov, Aladin Virmaux, Giuseppe Paolo, Themis Palpanas, Ievgen Redko

TL;DR
SAMformer introduces a lightweight transformer model optimized with sharpness-aware minimization, significantly improving multivariate long-term forecasting accuracy and generalization over existing methods.
Contribution
The paper presents a novel shallow transformer architecture combined with sharpness-aware optimization, addressing convergence issues and enhancing forecasting performance.
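The title refers to channel-wise attention, which this summary does not spell out. The sketch below is an assumption about what that means: attention scores are computed between channels (variables) of a multivariate series rather than between time steps, so the attention matrix has shape D x D for D channels. The function name `channel_attention`, the projection matrices `w_q` and `w_k`, and the toy numbers are all hypothetical, and value and output projections are omitted for brevity.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def softmax(row):
    """Numerically stable softmax over one row."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def channel_attention(x, w_q, w_k):
    """Hypothetical channel-wise attention (an assumption, not the paper's code).

    x   : D x L matrix, one row per channel, one column per time step
    w_q : L x d query projection
    w_k : L x d key projection
    Returns the D x D attention matrix and the re-mixed D x L series.
    """
    q = matmul(x, w_q)                  # D x d
    k = matmul(x, w_k)                  # D x d
    d = len(w_q[0])
    scores = matmul(q, transpose(k))    # D x D: channel-to-channel scores
    attn = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return attn, matmul(attn, x)        # each output channel mixes all channels

# Toy input: D = 3 channels, L = 4 time steps, d = 2 (made-up values).
x = [[0.1, 0.2, 0.3, 0.4],
     [1.0, 0.5, 0.0, -0.5],
     [0.3, 0.3, 0.3, 0.3]]
w_q = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.2], [0.2, 0.1]]
w_k = [[0.3, 0.1], [-0.1, 0.2], [0.4, -0.4], [0.0, 0.5]]
attn, out = channel_attention(x, w_q, w_k)
```

Under this assumption the attention matrix grows with the number of channels rather than with the sequence length, which is one plausible reason such a design stays lightweight for long input windows.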
Findings
SAMformer outperforms current state-of-the-art methods on real-world datasets.
The model achieves comparable results to large foundation models with fewer parameters.
Sharpness-aware optimization improves transformer convergence and generalization.
Abstract
Transformer-based architectures achieved breakthrough performance in natural language processing and computer vision, yet they remain inferior to simpler linear baselines in multivariate long-term forecasting. To better understand this phenomenon, we start by studying a toy linear forecasting problem for which we show that transformers are incapable of converging to their true solution despite their high expressive power. We further identify the attention of transformers as being responsible for this low generalization capacity. Building upon this insight, we propose a shallow lightweight transformer model that successfully escapes bad local minima when optimized with sharpness-aware optimization. We empirically demonstrate that this result extends to all commonly used real-world multivariate time series datasets. In particular, SAMformer surpasses current state-of-the-art methods and…
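To make the sharpness-aware optimization mentioned in the abstract concrete, here is a minimal sketch of one sharpness-aware minimization (SAM) update on a toy linear regression problem. It is not the authors' training code: the helper names (`mse_loss`, `mse_grad`, `sam_step`), the data, and the values of `lr` and `rho` are illustrative assumptions. The idea shown is the usual two-pass SAM step: perturb the weights along the normalized gradient by a radius `rho`, then descend using the gradient evaluated at the perturbed point.

```python
import math

def predict(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def mse_loss(w, xs, ys):
    """Mean squared error of the linear model y_hat = w . x."""
    return sum((predict(w, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def mse_grad(w, xs, ys):
    """Gradient of the mean squared error with respect to w."""
    n = len(xs)
    grad = [0.0] * len(w)
    for x, y in zip(xs, ys):
        err = predict(w, x) - y
        for j, xj in enumerate(x):
            grad[j] += 2.0 * err * xj / n
    return grad

def sam_step(w, xs, ys, lr=0.1, rho=0.05):
    """One SAM update (illustrative hyperparameters, not from the paper)."""
    g = mse_grad(w, xs, ys)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        return list(w)  # already at a stationary point
    # First pass: move to the approximate worst case inside a ball of radius rho.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Second pass: take the gradient there and apply it to the original weights.
    g_adv = mse_grad(w_adv, xs, ys)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]

# Toy data generated from assumed "true" weights.
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]
w_true = [1.5, -0.5]
ys = [predict(w_true, x) for x in xs]

w = [0.0, 0.0]
initial_loss = mse_loss(w, xs, ys)
for _ in range(200):
    w = sam_step(w, xs, ys)
final_loss = mse_loss(w, xs, ys)
```

With a fixed `rho`, the iterates settle in a small neighborhood of the minimizer instead of landing on it exactly. On this convex toy problem there are no bad local minima to escape, so the sketch only illustrates the mechanics of the update, not the generalization benefit the abstract reports for transformers.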
Taxonomy
Topics: Time Series Analysis and Forecasting · Stock Market Forecasting Methods · Forecasting Techniques and Applications
