DiTS: Multimodal Diffusion Transformers Are Time Series Forecasters
Haoran Zhang, Haixuan Liu, Yong Liu, Yunzhong Qiu, Yuxuan Wang, Jianmin Wang, Mingsheng Long

TL;DR
DiTS introduces a multimodal diffusion transformer architecture for time series forecasting that effectively models inter- and intra-variate dependencies, achieving state-of-the-art results in probabilistic forecasting tasks.
Contribution
The paper proposes DiTS, a novel multimodal diffusion transformer with dual-stream attention modules for better dependency modeling in multivariate time series forecasting.
Findings
DiTS outperforms existing models on multiple benchmarks.
It effectively models dependencies even without future exogenous data.
It achieves state-of-the-art probabilistic forecasting performance.
Abstract
While generative modeling on time series facilitates more capable and flexible probabilistic forecasting, existing generative time series models do not address the multi-dimensional properties of time series data well. The prevalent architecture of Diffusion Transformers (DiT), which relies on simplistic conditioning controls and a single-stream Transformer backbone, tends to underutilize cross-variate dependencies in covariate-aware forecasting. Inspired by Multimodal Diffusion Transformers that integrate textual guidance into video generation, we propose Diffusion Transformers for Time Series (DiTS), a general-purpose architecture that frames endogenous and exogenous variates as distinct modalities. To better capture both inter-variate and intra-variate dependencies, we design a dual-stream Transformer block tailored for time-series data, comprising a Time Attention module for…
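The abstract excerpt is cut off before the dual-stream block is fully described, so the following is only a minimal sketch of the general idea it borrows from Multimodal Diffusion Transformers: each modality keeps its own projection weights, and attention is computed jointly over the concatenated tokens. It is not the DiTS block itself. The function names (`joint_attention`, `random_matrix`), the single-head layout, the token counts, and the random weights are all assumptions made for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Hypothetical helper: small random weights or tokens for the demo."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def joint_attention(endo, exo, w_endo, w_exo):
    """Single-head joint attention over two token streams (illustrative only).

    endo, exo: token matrices of shape (n_endo, d) and (n_exo, d).
    w_endo, w_exo: per-stream dicts holding 'q', 'k', 'v' weights of shape (d, d).
    """
    d = len(endo[0])
    # Each stream is projected with its own weights; list "+" then
    # concatenates the two streams along the token axis.
    q = matmul(endo, w_endo["q"]) + matmul(exo, w_exo["q"])
    k = matmul(endo, w_endo["k"]) + matmul(exo, w_exo["k"])
    v = matmul(endo, w_endo["v"]) + matmul(exo, w_exo["v"])
    # Scaled dot-product attention over all tokens of both streams.
    scores = [[sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k] for qi in q]
    weights = [softmax(row) for row in scores]
    out = matmul(weights, v)
    # Split the result back into the two streams.
    n_endo = len(endo)
    return out[:n_endo], out[n_endo:], weights


rng = random.Random(0)
d = 4  # assumed token width for the demo
endo = random_matrix(3, d, rng)  # 3 endogenous tokens (assumed)
exo = random_matrix(2, d, rng)   # 2 exogenous tokens (assumed)
w_endo = {name: random_matrix(d, d, rng) for name in ("q", "k", "v")}
w_exo = {name: random_matrix(d, d, rng) for name in ("q", "k", "v")}
endo_out, exo_out, weights = joint_attention(endo, exo, w_endo, w_exo)
```

Because every row of `weights` spans the tokens of both streams, an endogenous token can attend to exogenous tokens and vice versa, while the separate projection weights let each modality keep its own representation. A single-stream design would instead share one set of projections across all tokens.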
Taxonomy
Topics: Time Series Analysis and Forecasting · Machine Learning in Healthcare · Generative Adversarial Networks and Image Synthesis
