U-shaped Transformer: Retain High Frequency Context in Time Series Analysis

Qingkui Chen; Yiqin Zhang

arXiv:2307.09019 · cs.LG · October 10, 2023

TL;DR

This paper introduces a U-shaped Transformer model that preserves high-frequency information in time series prediction by combining transformer and MLP advantages, leading to improved performance across datasets.

Contribution

It proposes a novel U-shaped Transformer architecture with skip-layer connections and multi-scale feature extraction for better time series analysis.

Findings

1. Achieves state-of-the-art results on multiple datasets.
2. Effectively retains high-frequency information in predictions.
3. Operates with relatively low computational cost.

Abstract

Time series prediction plays a crucial role in various industrial fields. In recent years, neural networks with a transformer backbone have achieved remarkable success in many domains, including computer vision and NLP. In the time series analysis domain, some studies have suggested that even the simplest MLP networks outperform advanced transformer-based networks on time series forecasting tasks. However, we believe these findings indicate that time series sequences have low-rank properties. In this paper, we consider the low-pass characteristics of transformers and try to incorporate the advantages of MLPs. We adopt skip-layer connections inspired by Unet in a traditional transformer backbone, thus preserving high-frequency context from input to output; we call the resulting model the U-shaped Transformer. We introduce patch merge and split operations to extract features at different scales and use larger…
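The abstract's two structural ideas, patch merge/split and a skip-layer connection, can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names are hypothetical, merging is done by plain concatenation of adjacent patches (the authors' operations may use learned projections), and a 3-tap moving average stands in for a transformer block to mimic low-pass behavior.

```python
import numpy as np

def patch_merge(x):
    """Concatenate adjacent patch pairs: (n, d) -> (n // 2, 2 * d). Hypothetical helper."""
    n, d = x.shape
    assert n % 2 == 0, "number of patches must be even"
    return x.reshape(n // 2, 2 * d)

def patch_split(x):
    """Inverse of patch_merge: (n, d) -> (2 * n, d // 2). Hypothetical helper."""
    n, d = x.shape
    assert d % 2 == 0, "feature dimension must be even"
    return x.reshape(2 * n, d // 2)

def low_pass_block(x):
    """Stand-in for a transformer block (assumption): circular 3-tap
    moving average along the patch axis, which attenuates high frequencies."""
    return (np.roll(x, 1, axis=0) + x + np.roll(x, -1, axis=0)) / 3.0

def u_shaped_forward(x):
    """One down/up level of a U-shaped layout with a skip-layer connection."""
    skip = x                    # keep the fine-scale input for the skip path
    h = low_pass_block(x)       # encoder stage at the fine scale
    h = patch_merge(h)          # move to a coarser scale
    h = low_pass_block(h)       # bottleneck stage at the coarse scale
    h = patch_split(h)          # return to the fine scale
    return h + skip             # skip path re-injects the unfiltered input

# Usage: 8 patches with 4 features each.
x = np.random.default_rng(0).standard_normal((8, 4))
y = u_shaped_forward(x)
print(y.shape)
```

In this sketch the skip path adds the unfiltered input back to the smoothed branch, so detail that the averaging stages attenuate is still present in the output. That is the intuition behind preserving high-frequency context from input to output, shown here in a toy setting only.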

Taxonomy

Topics: Time Series Analysis and Forecasting · Neural Networks and Applications · Anomaly Detection Techniques and Applications

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Residual Connection · Absolute Position Encodings · Adam · Layer Normalization · Label Smoothing