ChunkFormer: Learning Long Time Series with Multi-stage Chunked Transformer
Yue Ju, Alka Isac, Yimin Nie

TL;DR
ChunkFormer is a Transformer-based architecture for analyzing long time series efficiently. It chunks sequences progressively across stages, capturing both local and global information while reducing computational cost.
Contribution
The paper introduces ChunkFormer, a multi-stage chunking approach that enhances Transformer models for long sequence analysis without increasing resource consumption.
Findings
Outperforms existing Transformer models on various long sequence tasks.
Effectively captures local seasonality and fluctuations in long time series.
Reduces the computational resources needed to train on long sequences.
Abstract
The analysis of long sequence data remains challenging in many real-world applications. We propose a novel architecture, ChunkFormer, that improves the existing Transformer framework to handle the challenges while dealing with long time series. Original Transformer-based models adopt an attention mechanism to discover global information along a sequence to leverage the contextual data. Long sequential data traps local information such as seasonality and fluctuations in short data sequences. In addition, the original Transformer consumes more resources by carrying the entire attention matrix during the training course. To overcome these challenges, ChunkFormer splits the long sequences into smaller sequence chunks for the attention calculation, progressively applying different chunk sizes in each stage. In this way, the proposed model gradually learns both local and global information…
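The chunking idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the chunk sizes (4, 8, 16), the small-to-large ordering, the use of non-overlapping chunks, the residual connection between stages, and the omission of learned query/key/value projections are all assumptions made to keep the example short.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def chunked_self_attention(x, chunk_size):
    """Self-attention restricted to non-overlapping chunks (hypothetical helper).

    x has shape (seq_len, d_model); seq_len must be divisible by chunk_size.
    Assumption: Q = K = V = x, i.e. no learned projections.
    """
    seq_len, d_model = x.shape
    if seq_len % chunk_size != 0:
        raise ValueError("seq_len must be divisible by chunk_size")
    # Shape: (num_chunks, chunk_size, d_model)
    chunks = x.reshape(seq_len // chunk_size, chunk_size, d_model)
    # One (chunk_size x chunk_size) score matrix per chunk,
    # instead of a single (seq_len x seq_len) matrix.
    scores = chunks @ chunks.transpose(0, 2, 1) / np.sqrt(d_model)
    weights = softmax(scores, axis=-1)
    out = weights @ chunks
    return out.reshape(seq_len, d_model)


def multi_stage_chunked_attention(x, chunk_sizes=(4, 8, 16)):
    """Apply chunked attention stage by stage with a different chunk size each time.

    Assumption: chunk sizes grow from stage to stage, so early stages see
    local patterns and later stages see wider context.
    """
    for size in chunk_sizes:
        x = x + chunked_self_attention(x, size)  # residual connection (assumed)
    return x


rng = np.random.default_rng(0)
series = rng.standard_normal((16, 8))  # toy series: 16 time steps, 8 features
out = multi_stage_chunked_attention(series)
print(out.shape)
```

Under these assumptions, a stage with chunk size c on a sequence of length T computes (T / c) score matrices of size c × c, so T · c attention scores in total rather than T². For the toy series above with c = 4, that is 64 scores instead of 256.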
Taxonomy
Topics: Time Series Analysis and Forecasting · Data Stream Mining Techniques · Anomaly Detection Techniques and Applications
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Label Smoothing · Absolute Position Encodings · Residual Connection · Softmax · Adam · Position-Wise Feed-Forward Layer · Dense Connections
