Mask2Cause: Causal Discovery via Adjacency Constrained Causal Attention

Omar Muhammad; Pasupuleti Dhruv Shivkant; Deepak N. Subramani

arXiv:2605.07280·cs.LG·May 11, 2026

TL;DR

Mask2Cause is a deep learning framework that recovers the causal graph of a time series directly during forecasting, improving causal discovery accuracy while reducing model complexity.

Contribution

It introduces an end-to-end causal discovery method built on adjacency-constrained attention, outperforming existing neural approaches on diverse benchmarks.
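To illustrate what constraining attention by an adjacency matrix can look like, here is a minimal sketch in plain Python. It is not the authors' implementation: the function name `masked_attention`, the fixed binary mask, and the convention that row i is the target variable and column j a candidate cause are all assumptions made for illustration. The paper's mask may well be learned or continuous-valued.

```python
import math


def masked_attention(scores, adjacency):
    """Row-wise softmax over scores, restricted to entries the mask allows.

    Assumed convention (hypothetical): row i is the target variable,
    column j is a candidate cause, and adjacency[i][j] == 1 permits j -> i.
    Masked entries receive exactly zero attention weight.
    """
    weights = []
    for score_row, mask_row in zip(scores, adjacency):
        allowed = [s for s, m in zip(score_row, mask_row) if m]
        if not allowed:
            # No permitted parents: return zeros instead of dividing by zero.
            weights.append([0.0] * len(score_row))
            continue
        peak = max(allowed)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) if m else 0.0
                for s, m in zip(score_row, mask_row)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Three variables with a hypothetical adjacency; variable 2 depends only on itself.
scores = [[0.2, 1.0, -0.5],
          [0.0, 0.3, 0.7],
          [1.5, -1.0, 0.1]]
adjacency = [[1, 1, 0],
             [0, 1, 1],
             [0, 0, 1]]
attn = masked_attention(scores, adjacency)
```

Because masked entries get zero weight, a variable's forecast in this sketch can only draw on the variables its row of the adjacency matrix permits, which is what lets the mask itself be read as a candidate causal graph.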

Findings

01 Achieves state-of-the-art causal discovery accuracy.

02 Reduces forecasting model parameters by over 70%.

03 Performs well on synthetic and biological data.

Abstract

Leveraging deep learning for causal discovery in time series remains challenging because existing neural methods predominantly rely on component-wise architectures that fail to capture shared system dynamics or employ decoupled post-hoc graph extraction that risks overfitting to spurious correlations. We propose Mask2Cause, an end-to-end framework that recovers the underlying causal graph directly during the forecasting forward pass. Our approach introduces an Inverted Variable Embedding and an Adjacency-Constrained Masked Attention mechanism, trained with homoscedastic or heteroscedastic objectives to capture causal influences in both mean and variance. Empirical results on diverse benchmarks, from synthetic chaotic dynamics to realistic biological simulations, demonstrate state-of-the-art causal discovery with significantly reduced parameter complexity compared to standard…
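The distinction between homoscedastic and heteroscedastic objectives can be made concrete with the standard Gaussian formulation. The sketch below assumes that formulation; the abstract does not state the exact losses used, and the function names are hypothetical. With a fixed variance, the Gaussian negative log-likelihood reduces to mean squared error up to constants. With a predicted per-point log-variance, the loss can also reflect influences that act on the variance.

```python
import math


def homoscedastic_loss(targets, means):
    """Mean squared error: Gaussian NLL up to constants when variance is fixed."""
    return sum((y - m) ** 2 for y, m in zip(targets, means)) / len(targets)


def heteroscedastic_nll(targets, means, log_vars):
    """Mean Gaussian NLL with a predicted log-variance per point.

    The constant 0.5 * log(2 * pi) term is dropped. Predicting the
    log-variance rather than the variance keeps the variance positive.
    """
    total = 0.0
    for y, m, lv in zip(targets, means, log_vars):
        total += 0.5 * (lv + (y - m) ** 2 / math.exp(lv))
    return total / len(targets)


targets = [1.0, 2.0]
means = [0.0, 2.0]
mse = homoscedastic_loss(targets, means)
nll_unit_var = heteroscedastic_nll(targets, means, [0.0, 0.0])
```

When every log-variance is zero (unit variance), the heteroscedastic loss is exactly half the mean squared error, so the two objectives agree up to scale in that special case.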
