PatchMixer: A Patch-Mixing Architecture for Long-Term Time Series Forecasting

Zeying Gong; Yujin Tang; Junwei Liang

arXiv:2310.00655 · cs.LG · October 15, 2024 · 25 cites

TL;DR

PatchMixer is a CNN-based architecture that preserves temporal information in long-term time series forecasting, outperforming Transformer-based models in accuracy and speed by using permutation-variant convolutions and dual forecasting heads.

Contribution

It introduces a novel permutation-variant CNN architecture with depthwise separable convolutions and dual heads for improved long-term time series forecasting.
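To make the term concrete, here is a minimal pure-Python sketch of a depthwise separable 1D convolution: a depthwise step that filters each channel with its own kernel, followed by a pointwise (1x1) step that mixes channels at every position. It illustrates the general operation only and is not the authors' implementation; the function names, the unpadded ("valid") output length, and the toy shapes are assumptions made for this example.

```python
def depthwise_conv1d(x, kernels):
    """Filter each channel with its own kernel (no padding, stride 1).

    x:       list of channels, each a list of floats
    kernels: one kernel per channel (hypothetical toy weights)
    Uses cross-correlation, as deep learning libraries conventionally do.
    """
    out = []
    for channel, kernel in zip(x, kernels):
        k = len(kernel)
        out.append([
            sum(channel[t + j] * kernel[j] for j in range(k))
            for t in range(len(channel) - k + 1)
        ])
    return out


def pointwise_conv1d(x, weights):
    """1x1 convolution: mix channels independently at every position.

    weights: one row per output channel, one entry per input channel
    """
    length = len(x[0])
    return [
        [sum(w * x[c][t] for c, w in enumerate(row)) for t in range(length)]
        for row in weights
    ]


def depthwise_separable_conv1d(x, kernels, weights):
    """Depthwise step followed by pointwise step."""
    return pointwise_conv1d(depthwise_conv1d(x, kernels), weights)


def param_counts(c_in, c_out, k):
    """Weight counts (biases ignored): standard conv vs. depthwise separable."""
    standard = c_in * c_out * k
    separable = c_in * k + c_in * c_out
    return standard, separable
```

For 8 input channels, 8 output channels, and a kernel of length 3, `param_counts(8, 8, 3)` gives 192 weights for a standard convolution versus 88 for the separable form, which is the usual motivation for the factorization.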

Findings

01

Achieves relative improvements of 3.9% over the state-of-the-art method and 21.2% over the best-performing CNN.

02

Runs 2-3 times faster than the most advanced models.

03

Effective on seven benchmark datasets.

Abstract

Although the Transformer has been the dominant architecture for time series forecasting tasks in recent years, a fundamental challenge remains: the permutation-invariant self-attention mechanism within Transformers leads to a loss of temporal information. To tackle this challenge, we propose PatchMixer, a novel CNN-based model. It introduces a permutation-variant convolutional structure to preserve temporal information. Diverging from conventional CNNs in this field, which often employ multiple scales or numerous branches, our method relies exclusively on depthwise separable convolutions. This allows us to extract both local features and global correlations using a single-scale architecture. Furthermore, we employ dual forecasting heads encompassing linear and nonlinear components to better model future curve trends and details. Our experimental results on seven time-series…
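The contrast between order-insensitive self-attention and a permutation-variant convolution can be illustrated with a toy sketch. It assumes a bare dot-product self-attention with identity projections and no positional encoding (real Transformers usually add positional encodings), and a single hand-picked kernel sliding along the patch axis. The helper names, patch length, stride, and kernel are all hypothetical choices for this example, not values from the paper.

```python
import math


def make_patches(series, patch_len, stride):
    """Split a series into (possibly overlapping) patches."""
    return [series[i:i + patch_len]
            for i in range(0, len(series) - patch_len + 1, stride)]


def self_attention(tokens):
    """Dot-product self-attention with identity Q/K/V projections.

    No positional encoding is used, so each output row depends only on
    the set of tokens, not on where they sit in the sequence.
    """
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) for k in tokens]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * v[d] for w, v in zip(weights, tokens)) / total
            for d in range(len(q))
        ])
    return out


def conv_over_patches(tokens, kernel):
    """Slide a kernel along the patch axis, separately for each feature."""
    k = len(kernel)
    return [
        [sum(kernel[j] * tokens[t + j][d] for j in range(k))
         for d in range(len(tokens[0]))]
        for t in range(len(tokens) - k + 1)
    ]


series = [0.0, 1.0, 0.5, -0.5, 1.0, 0.0, -1.0, 0.5]
patches = make_patches(series, patch_len=2, stride=2)
perm = [3, 2, 1, 0]                      # reverse the patch order
shuffled = [patches[i] for i in perm]

attn_original = self_attention(patches)
attn_shuffled = self_attention(shuffled)  # same rows, just reordered

conv_original = conv_over_patches(patches, [1.0, -1.0])
conv_shuffled = conv_over_patches(shuffled, [1.0, -1.0])  # different values
```

Under these assumptions, shuffling the patches only reorders the attention outputs, so the ordering itself carries no signal, whereas the convolution outputs change because each one depends on which patches are neighbours. The asymmetric kernel `[1.0, -1.0]` is deliberate: a symmetric kernel would hide the difference for a pure reversal.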

Taxonomy

Topics: Time Series Analysis and Forecasting · Stock Market Forecasting Methods · Forecasting Techniques and Applications

Methods: Multi-Head Attention · Attention Is All You Need · Dense Connections · Linear Layer · Label Smoothing · Absolute Position Encodings · Adam · Residual Connection · Layer Normalization · Softmax