Causal Structure Learning in Hawkes Processes with Complex Latent Confounder Networks
Songyao Jin, Biwei Huang

TL;DR
This paper introduces a novel method for uncovering causal structures in multivariate Hawkes processes with hidden subprocesses, using a discrete-time approximation and an iterative algorithm to identify both observed and latent causal influences.
Contribution
It provides the first necessary and sufficient conditions for identifying latent subprocesses and causal influences in Hawkes processes, along with an effective two-phase algorithm for structure learning.
Findings
Successfully recovers causal structures in synthetic data
Effectively uncovers latent subprocesses in real-world datasets
Outperforms existing methods in causal inference accuracy
Abstract
The multivariate Hawkes process provides a powerful framework for modeling temporal dependencies and event-driven interactions in complex systems. While existing methods primarily focus on uncovering causal structures among observed subprocesses, real-world systems are often only partially observed, with latent subprocesses posing significant challenges. In this paper, we show that continuous-time event sequences can be represented by a discrete-time causal model as the time interval shrinks, and we leverage this insight to establish necessary and sufficient conditions for identifying latent subprocesses and their causal influences. Accordingly, we propose a two-phase iterative algorithm that alternates between inferring causal relationships among discovered subprocesses and uncovering new latent subprocesses, guided by path-based conditions that guarantee identifiability. Experiments on both…
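The discretization step mentioned in the abstract can be pictured with a minimal sketch: event timestamps from each subprocess are binned into counts over intervals of width delta, giving a discrete-time series that a linear model could then be fit to. The function name `bin_events`, its parameters, and the toy timestamps are hypothetical and chosen only for illustration; this is not the authors' code and does not reproduce their theorem.

```python
import numpy as np

def bin_events(event_times, horizon, delta):
    """Turn per-subprocess event timestamps into a (subprocess x bin) count matrix.

    event_times: list of 1-D arrays, one array of timestamps per subprocess
                 (hypothetical input format, assumed to lie in [0, horizon]).
    horizon:     total observation length.
    delta:       bin width; smaller values approach the continuous-time limit.
    """
    num_bins = int(np.ceil(horizon / delta))
    counts = np.zeros((len(event_times), num_bins), dtype=int)
    for i, times in enumerate(event_times):
        # Map each timestamp to its bin; clamp events at t == horizon into the last bin.
        idx = np.minimum((np.asarray(times, dtype=float) / delta).astype(int), num_bins - 1)
        # np.add.at accumulates correctly when several events share a bin.
        np.add.at(counts[i], idx, 1)
    return counts

# Toy example: two subprocesses observed on [0, 1] with bin width 0.25.
toy_events = [np.array([0.1, 0.3, 0.9]), np.array([0.5])]
toy_counts = bin_events(toy_events, horizon=1.0, delta=0.25)
print(toy_counts)
```

Shrinking `delta` makes each bin hold at most one event with high probability, but it also lengthens the series and makes it sparser, which is one practical side of the finite-interval trade-off.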
Peer Reviews
Decision: ICLR 2026 Oral
The paper addresses an important and relevant problem, i.e., causal discovery in multivariate Hawkes processes (MHPs) with hidden confounders. The theoretical contributions provide valuable insights into causal discovery without assuming causal sufficiency in MHPs and represent an important step toward advancing research in this area.
The main result builds on representing an MHP as a linear autoregressive model through discretization. However, according to Theorem 4.1, this result holds only when the discretization parameter \(\Delta\) tends to zero. In practice, for small but finite \(\Delta\), this leads to model mismatch, which can also be observed in the sensitivity analysis with respect to \(\Delta\) in Table 1. Moreover, all identifiability results are derived under the assumption that the linear representation holds, i.e.,
1. One of the paper's main strengths is its strong theoretical foundation. Theorem 4.1 is a powerful result that establishes a connection between continuous-time Hawkes processes and a discrete-time linear autoregressive representation. Moreover, the chain of Definition 4.4, Proposition 4.5, and Theorems 4.7/4.8, which links symmetric path structures to observable rank deficiencies, is also original and enables finding latent confounders without prior information about the existence or number of
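The general idea behind rank-based latent discovery, which the review excerpt above refers to, can be illustrated with a generic linear toy model: when two groups of observed variables are related only through one shared latent variable, their cross-covariance matrix is approximately rank one. The sketch below uses simulated Gaussian data and hypothetical names (`cross_covariance`, `numerical_rank`, the loadings); it is not the paper's Hawkes-specific condition or test.

```python
import numpy as np

def cross_covariance(a, b):
    """Sample cross-covariance between the columns of a and b (rows are samples)."""
    a_centered = a - a.mean(axis=0)
    b_centered = b - b.mean(axis=0)
    return a_centered.T @ b_centered / (a.shape[0] - 1)

def numerical_rank(matrix, rel_tol=1e-8):
    """Count singular values larger than rel_tol times the largest one."""
    s = np.linalg.svd(matrix, compute_uv=False)
    if s.size == 0 or s[0] == 0:
        return 0
    return int(np.sum(s > rel_tol * s[0]))

# Toy data (assumption): one latent variable drives four observed variables.
rng = np.random.default_rng(0)
n = 5000
latent = rng.normal(size=(n, 1))
group_a = latent @ np.array([[1.0, 0.5]]) + 0.1 * rng.normal(size=(n, 2))
group_b = latent @ np.array([[0.8, -0.6]]) + 0.1 * rng.normal(size=(n, 2))

cov_ab = cross_covariance(group_a, group_b)
singular_values = np.linalg.svd(cov_ab, compute_uv=False)
# The second singular value should be far smaller than the first,
# reflecting the single shared latent cause.
print(singular_values)
```

With finite samples the smaller singular value is never exactly zero, so a practical procedure would need a statistical rank test rather than the fixed tolerance used in `numerical_rank`.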
1. Strong structural assumptions. Definition 4.4 formalizes the Symmetric Acyclic Path Situation (the observed effects being connected to the latent via paths of equal length and acyclic intermediate latents), which is a somewhat special topology. However, in complex systems intermediate latents can have varying path lengths or additional cross-links, which would break the condition and make that latent unidentifiable by the method.
2. While the paper is theoretically rigorous, it is also extre
Compared to the previous NeurIPS 2025 submission that I had reviewed, the manuscript has substantially improved in clarity of the claims and structure of the paper. Most importantly, the iterative algorithm is now explicitly defined, assumptions and identifiability conditions are more carefully motivated, and the connection to prior work—including an additional LPCMCI baseline in experiments, rank-based latent structure discovery methods, and INAR processes—has been expanded.
1. The novelty of Theorem 4.1 can still be debated. While the authors’ expanded discussion distinguishes their formulation from prior binning-based estimation approaches, the contribution could still benefit from a more explicit formal comparison (e.g., showing in what sense their linear representation differs from INAR-based or EM-based formulations beyond the absence of likelihood modeling).
2. Motivation benefits from more discussion. In much of the classical literature on the broader causal disco
Taxonomy
Topics: Point processes and geometric inequalities
