GBT: Two-stage transformer framework for non-stationary time series forecasting

Li Shen; Yuning Wei; Yangzhu Wang

arXiv:2307.08302·cs.LG·July 18, 2023

TL;DR

This paper introduces GBT, a two-stage Transformer framework for non-stationary time series forecasting that improves initialization and reduces overfitting, outperforming state-of-the-art models on multiple benchmarks.

Contribution

Proposes GBT, a novel two-stage Transformer framework with Good Beginning and Error Score Modification to enhance forecasting of non-stationary time series.

Findings

1. GBT outperforms SOTA TSFTs and other models on seven benchmarks.
2. GBT achieves better accuracy with less computational complexity.
3. The framework is compatible with existing models to improve their performance.

Abstract

This paper shows that time series forecasting Transformers (TSFTs) suffer from a severe over-fitting problem caused by improper initialization of unknown decoder inputs, especially when handling non-stationary time series. Based on this observation, we propose GBT, a novel two-stage Transformer framework with Good Beginning. It decouples the prediction process of TSFT into two stages, an Auto-Regression stage and a Self-Regression stage, to tackle the problem of different statistical properties between input and prediction sequences. The prediction results of the Auto-Regression stage serve as a Good Beginning, i.e., a better initialization for the inputs of the Self-Regression stage. We also propose an Error Score Modification module to further enhance the forecasting capability of the Self-Regression stage in GBT. Extensive experiments on seven benchmark datasets demonstrate that GBT outperforms SOTA…
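The two-stage data flow described in the abstract can be illustrated with a toy sketch. This is not the authors' model: `stage_one`, `stage_two`, `two_stage_forecast`, and the `smooth` parameter are hypothetical stand-ins (a linear trend extrapolation and an exponential smoother) chosen only to show the first stage's forecast being handed to the second stage as its initial input. The Transformer layers and the Error Score Modification module are omitted.

```python
from typing import List, Sequence


def stage_one(history: Sequence[float], horizon: int) -> List[float]:
    """Hypothetical stand-in for the Auto-Regression stage.

    Extrapolates the average slope of the observed window, so the
    output lives on the scale of the future rather than the past.
    """
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (k + 1) for k in range(horizon)]


def stage_two(initial: Sequence[float], smooth: float = 0.5) -> List[float]:
    """Hypothetical stand-in for the Self-Regression stage.

    Sees only the stage-one forecast (the "Good Beginning") and refines
    it; here the refinement is a simple exponential smoother.
    """
    refined = [initial[0]]
    for value in initial[1:]:
        refined.append(smooth * refined[-1] + (1.0 - smooth) * value)
    return refined


def two_stage_forecast(history: Sequence[float], horizon: int) -> List[float]:
    # Stage 1 output initializes stage 2, instead of a fixed placeholder.
    good_beginning = stage_one(history, horizon)
    return stage_two(good_beginning)


if __name__ == "__main__":
    print(two_stage_forecast([1.0, 2.0, 3.0, 4.0], horizon=3))
```

The point of the sketch is the wiring: the second stage never reads the raw history directly, only the first stage's prediction, which is how the abstract describes the better initialization being supplied.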

Code & Models

Repositories

origamisl/gbt (PyTorch, official)

Taxonomy

Topics: Time Series Analysis and Forecasting · Neural Networks and Applications · Stock Market Forecasting Methods

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Label Smoothing · Residual Connection · Absolute Position Encodings · Adam · Layer Normalization