Yformer: U-Net Inspired Transformer Architecture for Far Horizon Time Series Forecasting
Kiran Madhusudhanan (1), Johannes Burchert (1), Nghia Duong-Trung (2), Stefan Born (2), Lars Schmidt-Thieme (1) ((1) University of Hildesheim, (2) Technische Universität Berlin)

TL;DR
Yformer introduces a U-Net inspired transformer architecture with a Y-shaped encoder-decoder design and sparse attention, significantly improving far horizon time series forecasting accuracy on benchmark datasets.
Contribution
The paper presents Yformer, a transformer model with a Y-shaped encoder-decoder structure designed to capture long-range effects while producing forecasts at a resolution similar to that of the input.
Findings
Achieves an approximately 19.82% improvement in MSE over the state of the art.
Attains an approximately 13.62% reduction in MAE compared to existing methods.
Demonstrates consistent performance gains on four benchmark datasets.
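For readers unfamiliar with how such percentages are usually derived, the following is a minimal sketch that assumes the figures are relative error reductions against a baseline. This page does not state how the paper aggregates results across datasets and horizons, so the formula, the helper name `relative_improvement`, and the toy numbers are all illustrative assumptions rather than the paper's evaluation code.

```python
def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_improvement(baseline_error, model_error):
    """Percentage reduction in error relative to the baseline (assumed definition)."""
    return 100.0 * (baseline_error - model_error) / baseline_error


# Made-up values for illustration only; these are not results from the paper.
y_true = [1.0, 2.0, 3.0, 4.0]
baseline_pred = [1.5, 2.5, 2.5, 4.5]  # every forecast is off by 0.5
model_pred = [1.4, 2.4, 2.6, 4.4]     # every forecast is off by 0.4

mse_gain = relative_improvement(mse(y_true, baseline_pred), mse(y_true, model_pred))
mae_gain = relative_improvement(mae(y_true, baseline_pred), mae(y_true, model_pred))
print(f"MSE improvement: {mse_gain:.2f}%")
print(f"MAE improvement: {mae_gain:.2f}%")
```

Because MSE squares each error, the same change in forecasts typically produces a larger relative gain in MSE than in MAE, which is consistent with the MSE figure above being the larger of the two.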
Abstract
Time series data is ubiquitous in research as well as in a wide variety of industrial applications. Effectively analyzing the available historical data and providing insights into the far future allows us to make effective decisions. Recent research has witnessed the superior performance of transformer-based architectures, especially in the regime of far horizon time series forecasting. However, current state-of-the-art sparse Transformer architectures fail to couple down- and upsampling procedures to produce outputs at a resolution similar to that of the input. We propose the Yformer model, based on a novel Y-shaped encoder-decoder architecture that (1) uses direct connections from the downscaled encoder layer to the corresponding upsampled decoder layer in a U-Net inspired architecture, (2) combines the downscaling/upsampling with sparse attention to capture long-range effects, and (3)…
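The U-Net idea in point (1) can be illustrated with a toy sketch. It is not the Yformer implementation: pairwise averaging and value repetition stand in for the learned downscaling and upsampling layers, plain addition stands in for the skip-connection merge, and attention is omitted entirely. The function names `downsample`, `upsample`, and `unet_style_pass` are hypothetical. The sketch only shows how each encoder resolution is handed directly to the decoder step at the same resolution, so that the output matches the input length.

```python
def downsample(x):
    """Halve the resolution by averaging adjacent pairs (assumes even length)."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]


def upsample(x):
    """Double the resolution by repeating each value twice."""
    return [v for v in x for _ in range(2)]


def unet_style_pass(series, depth):
    """Toy encoder-decoder pass with skip connections at matching resolutions.

    Assumes len(series) is divisible by 2 ** depth.
    """
    skips = []
    h = list(series)
    # Encoder: remember each resolution before downscaling it.
    for _ in range(depth):
        skips.append(h)
        h = downsample(h)
    # Decoder: upsample, then merge with the encoder state of the same length.
    for skip in reversed(skips):
        h = upsample(h)
        h = [a + b for a, b in zip(h, skip)]
    return h


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
output = unet_style_pass(series, depth=2)
print(len(series), len(output))  # the output has the same length as the input
```

Without the skip connections, the decoder would only see the coarsest representation, and the fine-grained detail lost during downscaling could not be recovered.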
Taxonomy
Topics: Time Series Analysis and Forecasting · Neural Networks and Applications · Stock Market Forecasting Methods
Methods: Attention Is All You Need · Linear Layer · Cosine Annealing · Absolute Position Encodings · Weight Decay · Position-Wise Feed-Forward Layer · Softmax · Residual Connection · Linear Warmup With Cosine Annealing
