STM3: Mixture of Multiscale Mamba for Long-Term Spatio-Temporal Time-Series Prediction

Haolong Chen; Liang Zhang; Zhengyuan Xin; Guangxu Zhu

arXiv:2508.12247·cs.LG·May 21, 2026

TL;DR

STM3 introduces a novel multiscale mixture model with disentangled experts and adaptive graph networks to improve long-term spatio-temporal prediction accuracy.

Contribution

It proposes a new architecture combining multiscale Mamba, disentangled mixture-of-experts, and adaptive graph causal networks for efficient long-term dependency modeling.
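To make the idea of mixing multiscale views concrete, here is a minimal generic sketch in plain Python. It is not the STM3 architecture: the "experts" are simple trailing-window averages standing in for the paper's Mamba-based experts, the gate logits are supplied by hand rather than learned, and the graph component is omitted entirely. The names `last_window_mean`, `softmax`, and `mixture_forecast` are hypothetical and chosen only for illustration.

```python
import math


def last_window_mean(series, scale):
    """Mean of the last `scale` points: a stand-in 'expert' for one temporal scale."""
    if scale < 1 or scale > len(series):
        raise ValueError("scale must be between 1 and len(series)")
    window = series[-scale:]
    return sum(window) / scale


def softmax(logits):
    """Numerically stable softmax over a list of gate logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_forecast(series, scales, gate_logits):
    """Blend one-step forecasts from per-scale experts with softmax gate weights.

    Assumption: one expert per scale, and the gate logits are given
    (a real model would compute them from the input).
    """
    if len(scales) != len(gate_logits):
        raise ValueError("need exactly one gate logit per scale")
    weights = softmax(gate_logits)
    expert_outputs = [last_window_mean(series, s) for s in scales]
    return sum(w * o for w, o in zip(weights, expert_outputs))


# Toy usage: three scales with equal gate logits give an unweighted average.
toy_series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
toy_forecast = mixture_forecast(toy_series, scales=[1, 2, 4], gate_logits=[0.0, 0.0, 0.0])
```

Raising the logit for a coarse scale shifts the forecast toward the longer-window average, which is the basic routing behaviour a mixture-of-experts gate provides.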

Findings

1. Achieves state-of-the-art results on 10 benchmarks.

2. Surpasses previous models by 7.1% in MAE on PEMSD8.

3. Demonstrates superior pattern disentanglement and routing smoothness.
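For readers unfamiliar with the metric, the following sketch shows how mean absolute error (MAE) and a relative improvement over a baseline are typically computed. It assumes the reported percentage is a relative reduction in MAE, which this page does not confirm. The function names and the numbers in the usage lines are made up for illustration and are not results from the paper.

```python
def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_improvement(baseline_mae, model_mae):
    """Percentage reduction in MAE relative to a baseline (positive means better)."""
    if baseline_mae <= 0:
        raise ValueError("baseline MAE must be positive")
    return 100.0 * (baseline_mae - model_mae) / baseline_mae


# Illustrative values only (not taken from the paper).
example_mae = mae([10.0, 12.0, 14.0], [11.0, 12.0, 12.0])
example_gain = relative_improvement(baseline_mae=10.0, model_mae=9.0)
```

Because the improvement is relative, the same percentage can correspond to very different absolute error reductions on datasets with different scales.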

Abstract

Spatio-temporal time-series prediction has developed rapidly in recent years, yet existing deep learning methods struggle to learn complex long-term spatio-temporal dependencies efficiently. Learning these long-term dependencies raises two challenges: 1) long temporal sequences naturally contain multiscale information, which is hard to extract efficiently; 2) the multiscale temporal information from different nodes is highly correlated and hard to model. To address these challenges, we propose Spatio-Temporal Mixture of Multiscale Mamba (STM3). STM3 integrates a Multiscale Mamba architecture within a novel Disentangled Mixture-of-Experts (DMoE) framework to capture diverse multiscale information efficiently, while using an adaptive graph causal network to model complex spatial dependencies. To ensure robust representation learning, we introduce a stable…

Code & Models

Repositories

IfReasonable/STM3_KDD26 (GitHub)

Taxonomy

Topics: Time Series Analysis and Forecasting