Causal Spatio-Temporal Prediction: An Effective and Efficient Multi-Modal Approach

Yuting Huang; Ziquan Fang; Zhihao Zeng; Lu Chen; Yunjun Gao

arXiv:2505.17637 · cs.LG · October 29, 2025

TL;DR

This paper introduces E^2-CSTP, a novel multi-modal causal spatio-temporal prediction framework that effectively integrates data, uncovers true causal relations, and reduces computational costs, significantly outperforming existing methods.

Contribution

The paper presents a new causal multi-modal framework with cross-modal attention, dual-branch causal inference, and GCN-based encoding for efficient and accurate spatio-temporal prediction.

Findings

01. Achieves up to 9.66% accuracy improvement over state-of-the-art methods.

02. Reduces computational overhead by 17.37%-56.11%.

03. Effectively uncovers true causal dependencies in multi-modal data.

Abstract

Spatio-temporal prediction plays a crucial role in intelligent transportation, weather forecasting, and urban planning. While integrating multi-modal data has shown potential for enhancing prediction accuracy, key challenges persist: (i) inadequate fusion of multi-modal information, (ii) confounding factors that obscure causal relations, and (iii) high computational complexity of prediction models. To address these challenges, we propose E^2-CSTP, an Effective and Efficient Causal multi-modal Spatio-Temporal Prediction framework. E^2-CSTP leverages cross-modal attention and gating mechanisms to effectively integrate multi-modal data. Building on this, we design a dual-branch causal inference approach: the primary branch focuses on spatio-temporal prediction, while the auxiliary branch mitigates bias by modeling additional modalities and applying causal interventions to uncover true…
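
The abstract names cross-modal attention and gating as the fusion mechanism but, as excerpted here, does not give the formulation. The following is a minimal NumPy sketch of one common way to combine the two ideas, not the authors' implementation: the function names, weight shapes, single-head attention, and the convex-combination gate are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_params(d, seed=0):
    # Hypothetical parameter set; names and shapes are illustrative only.
    rng = np.random.default_rng(seed)
    scale = 1.0 / np.sqrt(d)
    return {
        "W_q": rng.normal(0.0, scale, (d, d)),
        "W_k": rng.normal(0.0, scale, (d, d)),
        "W_v": rng.normal(0.0, scale, (d, d)),
        "W_g": rng.normal(0.0, scale, (2 * d, d)),
        "b_g": np.zeros(d),
    }

def cross_modal_gated_fusion(primary, auxiliary, params):
    """Fuse two modalities (assumed formulation, not taken from the paper).

    primary:   (N, d) features of the main spatio-temporal modality
    auxiliary: (M, d) features of an additional modality
    Returns the fused (N, d) features, the (N, M) attention weights,
    and the (N, d) gate values.
    """
    q = primary @ params["W_q"]      # queries come from the primary modality
    k = auxiliary @ params["W_k"]    # keys and values come from the auxiliary one
    v = auxiliary @ params["W_v"]

    # Scaled dot-product attention across modalities.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)
    context = attn @ v               # (N, d) auxiliary information per primary row

    # Element-wise gate deciding how much auxiliary context to admit.
    gate_in = np.concatenate([primary, context], axis=-1)
    gate = sigmoid(gate_in @ params["W_g"] + params["b_g"])
    fused = gate * primary + (1.0 - gate) * context
    return fused, attn, gate

# Toy usage with random features: 4 primary rows, 6 auxiliary rows, width 8.
rng = np.random.default_rng(1)
primary = rng.normal(size=(4, 8))
auxiliary = rng.normal(size=(6, 8))
params = init_params(8)
fused, attn, gate = cross_modal_gated_fusion(primary, auxiliary, params)
```

In this sketch the gate keeps the primary features wherever its value is near 1 and substitutes attended auxiliary context wherever it is near 0, which is one simple way to limit the influence of an uninformative modality.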
