Timer: Generative Pre-trained Transformers Are Large Time Series Models
Yong Liu, Haoran Zhang, Chenyu Li, Xiangdong Huang, Jianmin Wang, Mingsheng Long

TL;DR
This paper introduces Timer, a large pre-trained Transformer for time series analysis. It handles forecasting, imputation, and anomaly detection through a single generative approach, building on large-scale datasets and a GPT-style architecture.
Contribution
It pioneers the development of large time series models (LTSMs) using pre-training on massive datasets and a unified generative framework, extending GPT-style models to time series tasks.
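To make the idea of a GPT-style generative objective on time series concrete, here is a minimal sketch of how a numeric series could be cut into fixed-length segments that act as tokens and then arranged into (context, next token) training pairs. This illustrates the general technique only and is not the paper's implementation: the function names `to_tokens` and `next_token_pairs`, the non-overlapping segmentation, and the patch length of 3 are all assumptions made for this example.

```python
from typing import List, Tuple

Token = List[float]


def to_tokens(series: List[float], patch_len: int) -> List[Token]:
    """Split a series into consecutive, non-overlapping segments ("tokens").

    Hypothetical helper: trailing points that do not fill a whole
    segment are dropped.
    """
    if patch_len <= 0:
        raise ValueError("patch_len must be positive")
    n_tokens = len(series) // patch_len
    return [series[i * patch_len:(i + 1) * patch_len] for i in range(n_tokens)]


def next_token_pairs(tokens: List[Token]) -> List[Tuple[List[Token], Token]]:
    """Pair every prefix of the token sequence with the token that follows it.

    This mirrors next-token prediction in language models: the model
    sees tokens[:i] and is trained to generate tokens[i].
    """
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


# Toy series of 10 points; patch_len=3 is an arbitrary choice for illustration.
series = [float(t) for t in range(10)]
tokens = to_tokens(series, patch_len=3)
pairs = next_token_pairs(tokens)

for context, target in pairs:
    print(len(context), "context token(s) ->", target)
```

Under this framing, forecasting amounts to generating the tokens that follow an observed context, which is why one generative model can plausibly serve several tasks.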
Findings
Timer demonstrates strong performance across multiple time series tasks.
Pre-training on large datasets improves model generalization.
A unified generative approach simplifies multi-task learning.
Abstract
Deep learning has contributed remarkably to the advancement of time series analysis. Still, deep models can encounter performance bottlenecks in real-world data-scarce scenarios, which can be concealed due to the performance saturation with small models on current benchmarks. Meanwhile, large models have demonstrated great powers in these scenarios through large-scale pre-training. Continuous progress has been achieved with the emergence of large language models, exhibiting unprecedented abilities such as few-shot generalization, scalability, and task generality, which are however absent in small deep models. To change the status quo of training scenario-specific small models from scratch, this paper aims at the early development of large time series models (LTSM). During pre-training, we curate large-scale datasets with up to 1 billion time points, unify heterogeneous time series into…
Taxonomy
Topics: Time Series Analysis and Forecasting
Methods: Attention Is All You Need · Absolute Position Encodings · Linear Layer · Byte Pair Encoding · Multi-Head Attention · Adam · Residual Connection · Layer Normalization · Dense Connections · Position-Wise Feed-Forward Layer
