Beyond All-to-All: Causal-Aligned Transformer with Dynamic Structure Learning for Multivariate Time Series Forecasting
Xingyu Zhang, Hanyun Du, Zeen Song, Siyu Zhao, Changwen Zheng, Wenwen Qiang

TL;DR
This paper introduces a novel causal-aligned transformer model for multivariate time series forecasting that explicitly models variable-specific causal influences, improving interpretability and robustness over traditional all-to-all methods.
Contribution
It proposes a new all-to-one forecasting paradigm with a causal decomposition transformer that dynamically learns and corrects causal structures during training.
Findings
Outperforms existing methods on benchmark datasets.
Effectively identifies causal relationships among variables.
Improves robustness by mitigating collider bias.
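Collider bias, named in the last finding, can be illustrated with a small simulation. The sketch below is a generic textbook-style demonstration, not an experiment from the paper: the variables `x`, `y`, `z`, the noise scale, and the selection threshold are all assumptions chosen for illustration. Two independent variables both drive a third (the collider); conditioning on the collider makes the two independent causes appear correlated.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / (vx * vy) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000

# x and y are generated independently; z is a collider (x -> z <- y).
x = [rng.gauss(0, 1) for _ in range(n)]
y = [rng.gauss(0, 1) for _ in range(n)]
z = [a + b + rng.gauss(0, 0.5) for a, b in zip(x, y)]

# Correlation over the full sample: close to zero, as expected.
r_all = pearson(x, y)

# Conditioning on the collider (keeping only samples with large z)
# induces a negative association between x and y.
selected = [(a, b) for a, b, c in zip(x, y, z) if c > 1.0]
r_sel = pearson([a for a, _ in selected], [b for _, b in selected])

print(f"corr(x, y) on all samples:      {r_all:+.3f}")
print(f"corr(x, y) given z > 1.0:       {r_sel:+.3f}")
```

A forecaster that attends to a collider of the target without accounting for its role can pick up this kind of induced association, which is the failure mode the finding refers to.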
Abstract
Most existing multivariate time series forecasting methods adopt an all-to-all paradigm that feeds all variable histories into a unified model to predict their future values without distinguishing their individual roles. However, this undifferentiated paradigm makes it difficult to identify variable-specific causal influences and often entangles causally relevant information with spurious correlations. To address this limitation, we propose an all-to-one forecasting paradigm that predicts each target variable separately. Specifically, we first construct a Structural Causal Model from observational data and then, for each target variable, we partition the historical sequence into four subsegments according to the inferred causal structure: endogenous, direct causal, collider causal, and spurious correlation. Furthermore, we propose the Causal Decomposition Transformer (CDT), which…
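The partition step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the excerpt does not define the four groups precisely, so the rules used here are assumptions (direct causal = parents of the target in the causal graph; collider causal = the target's children together with their other parents; spurious = every remaining variable). The function name `partition_variables` and the toy variable names are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def partition_variables(
    edges: Set[Tuple[str, str]],
    variables: List[str],
    target: str,
) -> Dict[str, List[str]]:
    """Split variables into four groups relative to `target`.

    `edges` holds directed (cause, effect) pairs of an already-inferred graph.
    ASSUMPTION: the group definitions below are illustrative guesses,
    not the definitions used in the paper.
    """
    parents = {src for src, dst in edges if dst == target}
    children = {dst for src, dst in edges if src == target}
    # Other causes of the target's children (they meet the target at a collider).
    co_parents = {src for src, dst in edges if dst in children and src != target}
    collider_group = (children | co_parents) - parents - {target}
    assigned = {target} | parents | collider_group

    return {
        "endogenous": [target],
        "direct_causal": [v for v in variables if v in parents],
        "collider_causal": [v for v in variables if v in collider_group],
        "spurious": [v for v in variables if v not in assigned],
    }


# Toy graph: temp -> load -> price <- wind, plus an unconnected variable.
variables = ["temp", "load", "price", "wind", "noise"]
edges = {("temp", "load"), ("load", "price"), ("wind", "price")}

# All-to-one: each target gets its own partition of the same history.
groups = partition_variables(edges, variables, target="load")
for name, members in groups.items():
    print(f"{name:16s} {members}")
```

In an all-to-one setup the function would be called once per target variable, so the same variable can be a direct cause for one target and fall into the spurious group for another.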
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Forecasting Techniques and Applications · Stock Market Forecasting Methods
Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Multi-Head Attention · Dense Connections · ADaptive gradient method with the OPTimal convergence rate · Softmax · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Residual Connection
