TimeMixer++: A General Time Series Pattern Machine for Universal Predictive Analysis
Shiyu Wang, Jiawei Li, Xiaoming Shi, Zhou Ye, Baichuan Mo, Wenze Lin, Shengtong Ju, Zhixuan Chu, Ming Jin

TL;DR
TimeMixer++ introduces a versatile model that captures multi-scale temporal and frequency patterns in time series data, achieving state-of-the-art results across diverse predictive tasks.
Contribution
It presents a novel universal time series pattern machine with multi-resolution analysis and adaptive pattern extraction, outperforming existing models in various tasks.
Findings
Achieves state-of-the-art performance on 8 time series tasks.
Effectively captures seasonal and trend patterns across multiple scales.
Outperforms both general-purpose and task-specific models.
Abstract
Time series analysis plays a critical role in numerous applications, supporting tasks such as forecasting, classification, anomaly detection, and imputation. In this work, we present the time series pattern machine (TSPM), a model designed to excel in a broad range of time series tasks through powerful representation and pattern extraction capabilities. Traditional time series models often struggle to capture universal patterns, limiting their effectiveness across diverse tasks. To address this, we define multiple scales in the time domain and various resolutions in the frequency domain, employing various mixing strategies to extract intricate, task-adaptive time series patterns. Specifically, we introduce a general-purpose TSPM that processes multi-scale time series using (1) multi-resolution time imaging (MRTI), (2) time image decomposition (TID), (3) multi-scale mixing (MCM), and (4)…
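The abstract's idea of defining multiple scales in the time domain and then imaging the series by frequency can be illustrated with a small NumPy sketch. This is not the authors' implementation: the pair-averaging downsampler, the FFT-based period picker, the zero-padding, and the names `multi_scale`, `top_periods`, and `to_image` are all assumptions made for illustration only.

```python
import numpy as np

def multi_scale(x, num_scales=3):
    """Return a list of series; each scale halves the length by averaging adjacent pairs.
    (Assumed downsampling rule, chosen for simplicity.)"""
    scales = [x]
    for _ in range(num_scales - 1):
        cur = scales[-1]
        n = len(cur) - len(cur) % 2  # drop a trailing sample if the length is odd
        scales.append(cur[:n].reshape(-1, 2).mean(axis=1))
    return scales

def top_periods(x, k=2):
    """Pick the k dominant periods from the FFT amplitude spectrum (DC term excluded)."""
    amp = np.abs(np.fft.rfft(x))
    amp[0] = 0.0
    freqs = np.argsort(amp)[::-1][:k]
    return [int(len(x) // f) for f in freqs if f > 0]

def to_image(x, period):
    """Zero-pad x to a multiple of `period`, then fold it into a (cycles, period) array."""
    pad = (-len(x)) % period
    padded = np.concatenate([x, np.zeros(pad)])
    return padded.reshape(-1, period)

# Toy series with two seasonalities (periods 24 and 8); hypothetical data.
t = np.arange(96)
series = np.sin(2 * np.pi * t / 24) + 0.5 * np.sin(2 * np.pi * t / 8)

scales = multi_scale(series, num_scales=3)
# One 2D "time image" per scale, folded at that scale's dominant period.
images = [to_image(s, top_periods(s, k=1)[0]) for s in scales]
```

In each image, rows correspond to successive cycles and columns to positions within a cycle, so within-period and across-period variation fall on separate axes. How the paper actually selects frequencies, pads, and mixes these images is not specified in the truncated abstract above.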
Peer Reviews
Decision: ICLR 2025 Oral
S1. The proposed model captures both short- and long-term dependencies by transforming time series data into multi-resolution images, enabling the analysis of complex temporal and frequency-domain patterns that challenge traditional models. The authors validate this with experimental results showing the new architecture outperforms SOTA models on most standard benchmarks. The ablation study helps validate the importance of the individual parts of the architecture – the channel mixing, image deco
W1. There is little exploration of how performance scales with model size, which would be an interesting avenue for validating the model architecture in a zero-shot setting. The current zero-shot experiments are primarily in-domain and not cross-task.
1. The authors introduce a robust framework, TimeMixer++, that leverages multi-resolution time imaging, multi-scale mixing, and dual-axis attention to enhance general time series analysis. They present SOTA results on four different tasks. 2. The integration of both multi-scale and multi-resolution mixing strategies for adaptive pattern extraction demonstrates innovation. 3. The manuscript and appendix are well prepared, but the authors have not yet released the promised code.
1. The fonts in the figures should be enlarged for better readability. For example, in Figure 1 (right), the label "Benchmarking model performance across representation analysis in four tasks" appears blurred. Additionally, consider using a single set of legends for all four tasks to enhance clarity. 2. The source code repository has not been released for reproduction; I will consider raising the score once the repository is released and the results are shown to be consistent. 3. More detail on how it compares to
The methods used in the paper (e.g., time imaging and image decomposition) are very interesting. The evaluation is comprehensive: the authors discuss long- and short-term forecasting, zero-shot forecasting, classification, and anomaly detection.
In the forecasting results shown in Tables 3 and 4, the performance gain is negligible, and such a minor improvement can certainly be attributed to parameter tuning, e.g., well-tuned parameter settings for TimeMixer++ and weak parameter settings for the competing methods. The paper offers few insights, either theoretically or experimentally. The theoretical understanding of the improvement as well as its time imaging and multi-resolution mixing is lacking, mostly
Taxonomy
Topics: Time Series Analysis and Forecasting · Neural Networks and Applications
Methods: Softmax · Attention Is All You Need
