ChunkFormer: Learning Long Time Series with Multi-stage Chunked Transformer

Yue Ju; Alka Isac; Yimin Nie

arXiv:2112.15087 · cs.LG · January 3, 2022

Open Access

TL;DR

ChunkFormer is a novel Transformer-based architecture for long time series. It computes attention within chunks of the sequence and changes the chunk size from stage to stage, so it captures both local and global information while reducing the computational resources required.

Contribution

The paper introduces ChunkFormer, a multi-stage chunking approach that enhances Transformer models for long sequence analysis without increasing resource consumption.

Findings

01. Outperforms existing Transformer models on various long sequence tasks.

02. Effectively captures local seasonality and fluctuations in long time series.

03. Reduces computational resources needed for training long sequences.
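
To make the resource claim concrete, here is a rough back-of-the-envelope count of attention-score entries for one stage and one head. It is only a sketch: the helper name `attention_score_entries`, the sequence length 4096, and the chunk size 64 are illustrative assumptions, not values reported by the paper. Full attention over a sequence of length L stores L × L scores, while attention restricted to non-overlapping chunks of size c stores (L / c) × c × c = L × c scores.

```python
def attention_score_entries(seq_len, chunk_size=None):
    """Count attention-score entries for one head in one stage.

    chunk_size=None means full attention over the whole sequence.
    Hypothetical helper for illustration; not taken from the paper.
    """
    if chunk_size is None:
        return seq_len * seq_len
    if seq_len % chunk_size != 0:
        raise ValueError("seq_len must be divisible by chunk_size")
    num_chunks = seq_len // chunk_size
    # Each chunk has its own chunk_size x chunk_size score matrix.
    return num_chunks * chunk_size * chunk_size


full = attention_score_entries(4096)
chunked = attention_score_entries(4096, chunk_size=64)
print(full, chunked, full // chunked)  # 16777216 262144 64
```

Under these assumed sizes the chunked stage holds 64 times fewer scores than full attention. The saving shrinks as later stages use larger chunks.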

Abstract

The analysis of long sequence data remains challenging in many real-world applications. We propose a novel architecture, ChunkFormer, that improves the existing Transformer framework to handle the challenges while dealing with long time series. Original Transformer-based models adopt an attention mechanism to discover global information along a sequence to leverage the contextual data. Long sequential data traps local information such as seasonality and fluctuations in short data sequences. In addition, the original Transformer consumes more resources by carrying the entire attention matrix during the training course. To overcome these challenges, ChunkFormer splits the long sequences into smaller sequence chunks for the attention calculation, progressively applying different chunk sizes in each stage. In this way, the proposed model gradually learns both local and global information…
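
The chunking idea described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a single head with no learned query/key/value projections, non-overlapping chunks, a plain residual connection between stages, and an arbitrary schedule of chunk sizes. The function names `chunked_self_attention` and `multi_stage_chunked_attention` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def chunked_self_attention(x, chunk_size):
    """Scaled dot-product self-attention restricted to non-overlapping chunks.

    x has shape (seq_len, d_model). Assumption: queries, keys, and values
    are all x itself (no learned projections), to keep the sketch short.
    """
    seq_len, d_model = x.shape
    if seq_len % chunk_size != 0:
        raise ValueError("seq_len must be divisible by chunk_size")
    # (num_chunks, chunk_size, d_model): each chunk attends only to itself.
    chunks = x.reshape(seq_len // chunk_size, chunk_size, d_model)
    scores = chunks @ chunks.transpose(0, 2, 1) / np.sqrt(d_model)
    weights = softmax(scores, axis=-1)
    return (weights @ chunks).reshape(seq_len, d_model)


def multi_stage_chunked_attention(x, chunk_sizes):
    """Apply one chunked-attention stage per entry in chunk_sizes.

    Assumption: stages are joined by a simple residual connection.
    """
    for chunk_size in chunk_sizes:
        x = x + chunked_self_attention(x, chunk_size)
    return x


rng = np.random.default_rng(0)
series = rng.normal(size=(16, 4))  # toy series: 16 steps, 4 features
out = multi_stage_chunked_attention(series, chunk_sizes=[4, 8, 16])
print(out.shape)  # (16, 4)
```

With the assumed schedule `[4, 8, 16]`, the first stage mixes information only inside windows of 4 steps, and the last stage, whose chunk covers the whole toy sequence, is equivalent to full attention. That is one way to read "gradually learns both local and global information".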

Taxonomy

Topics: Time Series Analysis and Forecasting · Data Stream Mining Techniques · Anomaly Detection Techniques and Applications

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Label Smoothing · Absolute Position Encodings · Residual Connection · Softmax · Adam · Position-Wise Feed-Forward Layer · Dense Connections