Sequence Complementor: Complementing Transformers For Time Series Forecasting with Learnable Sequences

Xiwen Chen; Peijie Qiu; Wenhui Zhu; Huayu Li; Hao Wang; Aristeidis Sotiras; Yalin Wang; Abolfazl Razi

arXiv:2501.02735 · cs.LG · January 7, 2025

TL;DR

This paper introduces learnable Sequence Complementors to enhance transformer-based time series forecasting by improving sequence representation diversity, leading to better performance in both long-term and short-term predictions.

Contribution

It proposes a novel attention mechanism with learnable Sequence Complementors and a diversification loss, improving transformer performance through more diverse sequence representations.
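The idea of attending over extra learnable sequences, regularized by a diversity term, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the choice to append the learnable sequences to the keys and values only, the single-head layout, and the cosine-similarity penalty are all assumptions made for illustration, and the names `complemented_attention`, `diversification_penalty`, and `complementors` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def complemented_attention(tokens, complementors, w_q, w_k, w_v):
    """Single-head attention whose keys/values include extra learnable sequences.

    tokens:        (n, d) input sequence tokens
    complementors: (k, d) learnable sequences (assumed design, not from the paper)
    """
    extended = np.concatenate([tokens, complementors], axis=0)  # (n + k, d)
    q = tokens @ w_q      # queries come from the real tokens only
    k = extended @ w_k    # keys cover tokens plus complementors
    v = extended @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)  # (n, n + k), each row sums to 1
    return weights @ v, weights

def diversification_penalty(complementors, eps=1e-8):
    """Mean squared off-diagonal cosine similarity (assumed form of a diversity loss).

    Close to 0 when the learnable sequences are mutually orthogonal,
    close to 1 when they all point in the same direction.
    """
    norms = np.linalg.norm(complementors, axis=1, keepdims=True)
    unit = complementors / (norms + eps)
    gram = unit @ unit.T
    off_diag = gram - np.diag(np.diag(gram))
    k = complementors.shape[0]
    return float((off_diag ** 2).sum() / (k * (k - 1)))

rng = np.random.default_rng(0)
n, k, d = 6, 3, 4
tokens = rng.normal(size=(n, d))
complementors = rng.normal(size=(k, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
out, weights = complemented_attention(tokens, complementors, w_q, w_k, w_v)
```

In a trained model the `complementors` array would be a parameter updated by gradient descent, with the penalty added to the forecasting loss under some weight. Both the weighting and the exact loss form are left open here because this page does not state them.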

Findings

1. Sequence Complementors improve forecasting accuracy.
2. Enhanced diversity of sequence representations correlates with lower error.
3. Method outperforms recent state-of-the-art models in experiments.

Abstract

Since its introduction, the transformer has shifted the development trajectory away from traditional models (e.g., RNN, MLP) in time series forecasting, which is attributed to its ability to capture global dependencies within temporal tokens. Follow-up studies have largely involved altering the tokenization and self-attention modules to better adapt Transformers for addressing special challenges like non-stationarity, channel-wise dependency, and variable correlation in time series. However, after investigating several representative methods, we found that the expressive capability of the sequence representation is a key factor influencing Transformer performance in time series forecasting: there is an almost linear relationship between sequence representation entropy and mean square error, with more diverse representations performing better. In this paper, we propose a novel attention…
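To make the notion of "representation entropy" concrete, the sketch below computes one common proxy: the Shannon entropy of the normalized singular values of a representation matrix. This is an assumed stand-in, not necessarily the entropy definition used in the paper, which this page does not specify. The function name `spectral_entropy` and the toy matrices are hypothetical.

```python
import numpy as np

def spectral_entropy(representation, eps=1e-12):
    """Shannon entropy of the normalized singular value distribution.

    representation: (n_tokens, d_model) matrix of sequence representations.
    Higher values mean the energy is spread over more directions,
    i.e. the rows are more diverse. (Assumed proxy, not the paper's metric.)
    """
    s = np.linalg.svd(representation, compute_uv=False)
    p = s / (s.sum() + eps)
    p = p[p > eps]  # drop numerically zero singular values
    return float(-(p * np.log(p)).sum())

rng = np.random.default_rng(0)
# Rows drawn independently: energy spread across many directions.
diverse = rng.normal(size=(8, 16))
# Rank-1 matrix: every row is a multiple of the same vector.
collapsed = np.outer(rng.normal(size=8), rng.normal(size=16))

print(spectral_entropy(diverse), spectral_entropy(collapsed))
```

Under this proxy a collapsed (rank-1) representation scores near zero and a representation with equal energy in r directions scores log(r), so it gives a single number that could be plotted against mean square error across models.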

Taxonomy

Topics: Time Series Analysis and Forecasting · Stock Market Forecasting Methods · Neural Networks and Applications

Methods: Attention Is All You Need · Byte Pair Encoding · Dense Connections · Absolute Position Encodings · Dropout · Linear Layer · Softmax · Adam · Residual Connection · Multi-Head Attention