Yformer: U-Net Inspired Transformer Architecture for Far Horizon Time Series Forecasting

Kiran Madhusudhanan (1), Johannes Burchert (1), Nghia Duong-Trung (2), Stefan Born (2), Lars Schmidt-Thieme (1) ((1) University of Hildesheim, (2) Technische Universität Berlin)

arXiv:2110.08255·cs.LG·October 24, 2022

TL;DR

Yformer introduces a U-Net inspired transformer architecture with a Y-shaped encoder-decoder design and sparse attention, significantly improving far horizon time series forecasting accuracy on benchmark datasets.

Contribution

The paper presents Yformer, a novel transformer model with a Y-shaped encoder-decoder structure that enhances long-range effect capture and output resolution in time series forecasting.

Findings

01 Reports an MSE improvement of approximately 19.82% over the prior state of the art.

02 Reports an MAE reduction of around 13.62% compared to existing methods.

03 Shows consistent performance gains across four benchmark datasets.
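The percentages above are relative error reductions. As a minimal sketch of how such a figure is typically computed, the Python below defines MSE, MAE, and a relative-improvement helper. The function names and the toy numbers are illustrative assumptions, not values or code from the paper, and how the paper aggregates results across datasets and horizons is not stated on this page.

```python
def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_improvement(baseline_error, model_error):
    """Percentage by which model_error is lower than baseline_error."""
    return 100.0 * (baseline_error - model_error) / baseline_error


# Toy data (made up for illustration; not from the paper's benchmarks).
y_true = [1.0, 2.0, 3.0, 4.0]
baseline_pred = [1.5, 2.5, 2.5, 4.5]      # every forecast is off by 0.5
model_pred = [1.25, 2.25, 2.75, 4.25]     # every forecast is off by 0.25

mse_gain = relative_improvement(mse(y_true, baseline_pred), mse(y_true, model_pred))
mae_gain = relative_improvement(mae(y_true, baseline_pred), mae(y_true, model_pred))
print(f"MSE improvement: {mse_gain:.2f}%  MAE improvement: {mae_gain:.2f}%")
```

Because MSE squares the errors, halving every error cuts MSE by 75% but MAE by only 50%, which is one reason the two metrics can report different gains for the same model.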

Abstract

Time series data is ubiquitous in research as well as in a wide variety of industrial applications. Effectively analyzing the available historical data and providing insights into the far future allows us to make effective decisions. Recent research has witnessed the superior performance of transformer-based architectures, especially in the regime of far horizon time series forecasting. However, the current state-of-the-art sparse Transformer architectures fail to couple down- and upsampling procedures to produce outputs at a resolution similar to that of the input. We propose the Yformer model, based on a novel Y-shaped encoder-decoder architecture that (1) uses direct connections from the downscaled encoder layer to the corresponding upsampled decoder layer in a U-Net inspired architecture, (2) combines the downscaling/upsampling with sparse attention to capture long-range effects, and (3)…
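Point (1) of the abstract can be illustrated with a generic U-Net-style pass over a 1-D sequence. The sketch below is plain Python under loud assumptions: pair-averaging stands in for the learned downscaling layers, value repetition stands in for the learned upsampling layers, and averaging stands in for however the model merges a skip connection. It contains no attention and is not the Yformer implementation; the names `downsample`, `upsample`, and `u_shaped_pass` are hypothetical.

```python
def downsample(x):
    """Halve the resolution by averaging adjacent pairs (stand-in for a learned encoder layer)."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]


def upsample(x):
    """Double the resolution by repeating each value (stand-in for a learned decoder layer)."""
    return [v for v in x for _ in range(2)]


def u_shaped_pass(x, depth):
    """Downscale `depth` times, then upsample back, merging each encoder level via a skip connection."""
    if len(x) % (2 ** depth) != 0:
        raise ValueError("sequence length must be divisible by 2**depth")

    # Encoder path: remember the features at every resolution.
    skips = []
    h = list(x)
    for _ in range(depth):
        skips.append(h)
        h = downsample(h)

    # Decoder path: upsample, then merge with the encoder features
    # of the same resolution (the direct connection).
    for skip in reversed(skips):
        h = upsample(h)
        h = [(a + b) / 2 for a, b in zip(h, skip)]
    return h


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
out = u_shaped_pass(x, depth=2)
print(len(x), len(out))  # the output has the same resolution as the input
print(out)
```

The point of the pattern is that coarse levels summarize long-range context while the skip connections restore fine detail, so the output returns to the input's resolution.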

Code & Models

Repositories

18kiran12/Yformer-Time-Series-Forecasting (PyTorch, official)

Taxonomy

Topics: Time Series Analysis and Forecasting · Neural Networks and Applications · Stock Market Forecasting Methods

Methods: Attention Is All You Need · Linear Layer · Cosine Annealing · Absolute Position Encodings · Weight Decay · Position-Wise Feed-Forward Layer · Softmax · Residual Connection · Linear Warmup With Cosine Annealing