TL;DR
STM3 introduces a novel multiscale mixture model with disentangled experts and adaptive graph networks to improve long-term spatio-temporal prediction accuracy.
Contribution
The paper proposes a new architecture that combines multiscale Mamba, a disentangled mixture-of-experts, and an adaptive graph causal network to model long-term dependencies efficiently.
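As a rough illustration of the general idea of mixing experts that each see the input at a different temporal scale, here is a minimal sketch. It is not the STM3 architecture: each "expert" is a trivial average over pooled values standing in for a Mamba block, the gate scores are fixed rather than learned, and there is no graph component. The names `avg_pool`, `softmax`, and `mixture_forecast` are hypothetical and chosen for this example only.

```python
import math


def avg_pool(series, scale):
    """Downsample a 1-D series by averaging non-overlapping windows of length `scale`."""
    return [
        sum(series[i:i + scale]) / scale
        for i in range(0, len(series) - scale + 1, scale)
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_forecast(series, scales, gate_scores):
    """Blend one toy expert per scale using softmax gate weights.

    Assumption: each expert simply averages the last two pooled values.
    A real model would use a learned sequence model per scale and a
    learned, input-dependent gate.
    """
    weights = softmax(gate_scores)
    expert_outputs = []
    for scale in scales:
        pooled = avg_pool(series, scale)
        tail = pooled[-2:]
        expert_outputs.append(sum(tail) / len(tail))
    prediction = sum(w * y for w, y in zip(weights, expert_outputs))
    return prediction, weights


# Toy usage: three scales, uniform gate scores.
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
prediction, weights = mixture_forecast(series, scales=[1, 2, 4], gate_scores=[0.0, 0.0, 0.0])
```

With uniform gate scores the three experts are weighted equally; raising one score shifts the blend toward that scale, which is the routing behavior a learned gate would control.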
Findings
Achieves state-of-the-art results on 10 benchmarks.
Surpasses previous models by 7.1% in MAE on PEMSD8.
Demonstrates superior pattern disentanglement and routing smoothness.
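For readers unfamiliar with the metric, the sketch below shows how mean absolute error (MAE) is computed and how a percentage improvement between two models could be derived. The summary does not state how the 7.1% figure was calculated; relative reduction against the baseline is assumed here, and all numbers in the example are toy values, not results from the paper.

```python
def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_improvement(baseline_mae, new_mae):
    """Percentage reduction in MAE relative to the baseline (assumed definition)."""
    return (baseline_mae - new_mae) / baseline_mae * 100.0


# Toy values for illustration only.
toy_error = mae([3.0, 5.0, 8.0], [2.0, 5.0, 10.0])
toy_gain = relative_improvement(baseline_mae=10.0, new_mae=9.0)
```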
Abstract
Spatio-temporal time-series prediction has developed rapidly in recent years, yet existing deep learning methods struggle to learn complex long-term spatio-temporal dependencies efficiently. Learning long-term spatio-temporal dependencies raises two challenges: 1) a long temporal sequence naturally contains multiscale information, which is hard to extract efficiently; 2) the multiscale temporal information from different nodes is highly correlated and hard to model. To address these challenges, we propose Spatio-Temporal Mixture of Multiscale Mamba (STM3). STM3 integrates a Multiscale Mamba architecture within a novel Disentangled Mixture-of-Experts (DMoE) framework to capture diverse multiscale information efficiently, and uses an adaptive graph causal network to model complex spatial dependencies. To ensure robust representation learning, we introduce a stable…
Taxonomy
Topics: Time Series Analysis and Forecasting
