DRFormer: Multi-Scale Transformer Utilizing Diverse Receptive Fields for Long Time-Series Forecasting

Ruixin Ding; Yuqi Chen; Yu-Ting Lan; Wei Zhang

arXiv:2408.02279 · cs.LG · August 6, 2024

TL;DR

DRFormer is a multi-scale Transformer model with dynamic tokenization and position encoding, designed to improve long-term time series forecasting by capturing diverse temporal features across multiple resolutions.

Contribution

The paper introduces a tokenizer based on dynamic sparse learning, a multi-scale Transformer architecture, and group-aware rotary position encoding for multi-resolution time series modeling.
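This page does not describe how the group-aware variant works, so the sketch below shows only standard rotary position encoding (RoPE), the technique it builds on. It is a minimal illustration and not the authors' implementation; the function names `rope` and `dot` and the frequency base of 10000 are assumptions chosen for the example.

```python
import math


def rope(vec, position, base=10000.0):
    """Apply standard rotary position encoding to one vector.

    Each consecutive pair (vec[2i], vec[2i+1]) is rotated by the angle
    position * base**(-2i/d). Hypothetical helper for illustration only.
    """
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("vector dimension must be even")
    out = []
    for i in range(d // 2):
        angle = position * base ** (-2.0 * i / d)
        c, s = math.cos(angle), math.sin(angle)
        x0, x1 = vec[2 * i], vec[2 * i + 1]
        # 2-D rotation of the pair by `angle`
        out.extend([x0 * c - x1 * s, x0 * s + x1 * c])
    return out


def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    q = [1.0, 2.0, 3.0, 4.0]
    k = [0.5, -1.0, 2.0, 0.0]
    # Both pairs of positions are 2 steps apart, so the scores should match
    # up to floating-point error.
    print(dot(rope(q, 3), rope(k, 1)))
    print(dot(rope(q, 7), rope(k, 5)))
```

The property that makes this encoding useful for attention is that the score between a rotated query and a rotated key depends only on the difference between their positions, not on the absolute positions themselves.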

Findings

01 Outperforms existing methods on real-world datasets
02 Effectively captures multi-scale temporal features
03 Demonstrates superior long-term forecasting accuracy

Abstract

Long-term time series forecasting (LTSF) has been widely applied in finance, traffic prediction, and other domains. Recently, patch-based transformers have emerged as a promising approach, segmenting data into sub-level patches that serve as input tokens. However, existing methods mostly rely on predetermined patch lengths, necessitating expert knowledge and posing challenges in capturing diverse characteristics across various scales. Moreover, time series data exhibit diverse variations and fluctuations across different temporal scales, which traditional approaches struggle to model effectively. In this paper, we propose a dynamic tokenizer with a dynamic sparse learning algorithm to capture diverse receptive fields and sparse patterns of time series data. In order to build hierarchical receptive fields, we develop a multi-scale Transformer model, coupled with multi-scale sequence…
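To make the fixed-patch baseline concrete, here is a minimal sketch of the patch tokenization that the abstract contrasts with: a series is cut into windows of a predetermined length, and each window becomes one input token. This is not DRFormer's dynamic tokenizer; the function name `make_patches` and the specific patch lengths and strides are assumptions for illustration.

```python
def make_patches(series, patch_len, stride):
    """Split a 1-D series into fixed-length patches (one patch = one token).

    Trailing values that do not fill a whole patch are dropped.
    Hypothetical helper for illustration only.
    """
    if patch_len <= 0 or stride <= 0:
        raise ValueError("patch_len and stride must be positive")
    last_start = len(series) - patch_len
    return [series[s:s + patch_len] for s in range(0, last_start + 1, stride)]


if __name__ == "__main__":
    series = list(range(16))  # toy univariate series of length 16
    # Each (patch_len, stride) pair fixes one receptive field in advance.
    for patch_len, stride in [(2, 2), (4, 4), (8, 4)]:
        patches = make_patches(series, patch_len, stride)
        print(patch_len, stride, len(patches), patches[0])
```

With a series of length 16, the three settings give 8, 4, and 3 tokens respectively. Each setting sees the data at a single resolution, which is the limitation the abstract attributes to predetermined patch lengths.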

Code & Models

Repositories

ruixindingecnu/drformer (PyTorch, official)

Taxonomy

Methods: Attention Is All You Need · Linear Layer · Residual Connection · Multi-Head Attention · Position-Wise Feed-Forward Layer · Adam · Byte Pair Encoding · Softmax · Absolute Position Encodings · Dense Connections