Causal Representation Meets Stochastic Modeling under Generic Geometry

Jiaxu Ren; Yixin Wang; Biwei Huang

arXiv:2602.05033·cs.LG·February 6, 2026

Open Access · 3 Reviews

TL;DR

This paper introduces a method for learning causal representations from continuous-time stochastic processes, enabling scientific insights in fields like genomics and neuroscience.

Contribution

It develops an identifiable variational autoencoder framework for continuous-time stochastic processes, addressing a gap in existing causal representation learning methods.

Findings

01. MUTATE effectively infers stochastic dynamics in simulated data.

02. It uncovers causal mechanisms in genomics and neuroscience.

03. The approach demonstrates practical utility in real-world scientific questions.

Abstract

Learning meaningful causal representations from observations has emerged as a crucial task for facilitating machine learning applications and driving scientific discoveries in fields such as climate science, biology, and physics. This process involves disentangling high-level latent variables and their causal relationships from low-level observations. Previous work in this area that achieves identifiability typically focuses on cases where the observations are either i.i.d. or follow a latent discrete-time process. Nevertheless, many real-world settings require identifying latent variables that are continuous-time stochastic processes (e.g., multivariate point processes). To this end, we develop identifiable causal representation learning for continuous-time latent stochastic point processes. We study its identifiability by analyzing the geometry of the parameter space. Furthermore, we…
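To make the phrase "multivariate point processes" concrete, the following is a minimal sketch of simulating one standard example, a multivariate Hawkes process with exponential kernels, using Ogata's thinning algorithm. It is an illustration only and not the paper's model or code: the function name `simulate_hawkes`, the parameters `mu`, `alpha`, `beta`, and the choice of an exponential kernel are all assumptions made for this example.

```python
import math
import random


def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Simulate a multivariate Hawkes process by Ogata thinning (illustrative).

    Assumed intensity of dimension i (exponential kernel, hypothetical setup):
        lambda_i(t) = mu[i] + sum_j sum_{t_k^j < t} alpha[i][j] * exp(-beta * (t - t_k^j))
    All entries of mu and alpha are assumed non-negative, with sum(mu) > 0.
    Returns one sorted list of event times per dimension.
    """
    rng = random.Random(seed)
    dim = len(mu)
    events = [[] for _ in range(dim)]
    # excitation[i] holds the current value of the history-dependent part of lambda_i.
    excitation = [0.0] * dim
    t = 0.0
    while True:
        # Between events the intensity only decays, so its current total
        # is a valid upper bound until the next accepted event.
        lam_bar = sum(mu) + sum(excitation)
        wait = rng.expovariate(lam_bar)
        if t + wait > horizon:
            break
        t += wait
        # Decay the excitation to the candidate time.
        decay = math.exp(-beta * wait)
        excitation = [e * decay for e in excitation]
        # Accept the candidate with probability sum(lambda) / lam_bar and
        # assign it to dimension i with probability proportional to lambda_i.
        u = rng.random() * lam_bar
        cumulative = 0.0
        for i in range(dim):
            cumulative += mu[i] + excitation[i]
            if u < cumulative:
                events[i].append(t)
                # An event in dimension i excites every dimension k by alpha[k][i].
                for k in range(dim):
                    excitation[k] += alpha[k][i]
                break
    return events


if __name__ == "__main__":
    ev = simulate_hawkes([0.5, 0.3], [[0.2, 0.1], [0.0, 0.3]], 1.0, 50.0, seed=1)
    print([len(e) for e in ev])
```

The event times are produced in continuous time; observing such a process only at discrete sampling times is the kind of setting in which the gap between discrete-time and continuous-time latent models becomes relevant.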

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01 · Rating 6 · Confidence 2

Strengths

- Novelty: CRL for continuous-time latent Hawkes processes under generic (non-invertible) mixing is timely and underexplored; the weakly-convergent class neatly addresses discrete sampling vs. continuous dynamics.
- The time-adaptive transition, together with the PSD decomposition that enforces latent whiteness, connects theory to practice; the ELBO is explicitly provided.
- Empirical results show higher MCC across multiple kernel regimes than temporal and non-temporal baselines.
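The MCC referred to in the last point is, in the causal representation learning literature, usually the mean correlation coefficient: the average absolute correlation between true and estimated latents after matching them by the best permutation. The sketch below illustrates that common definition under stated assumptions. It uses Pearson correlation and brute-force matching; whether the paper uses this exact variant is not stated here, and the names `pearson` and `mean_correlation_coefficient` are hypothetical.

```python
import itertools
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean_correlation_coefficient(true_latents, est_latents):
    """MCC between two lists of latent trajectories (one sequence per dimension).

    Absolute correlations make the score invariant to sign flips and rescaling;
    maximizing over permutations makes it invariant to the ordering of latents.
    Brute force over permutations is only practical for a handful of dimensions;
    an assignment solver (Hungarian algorithm) would replace it at larger sizes.
    """
    d = len(true_latents)
    corr = [[abs(pearson(t, e)) for e in est_latents] for t in true_latents]
    best = max(
        sum(corr[i][perm[i]] for i in range(d))
        for perm in itertools.permutations(range(d))
    )
    return best / d


if __name__ == "__main__":
    true = [[1.0, 2.0, 3.0, 4.0], [1.0, 0.0, 1.0, 0.0]]
    # Estimated latents: swapped order, rescaled, and one sign-flipped.
    est = [[-2.0, 0.0, -2.0, 0.0], [2.0, 4.0, 6.0, 8.0]]
    print(mean_correlation_coefficient(true, est))
```

A score near 1 indicates that each true latent is recovered up to permutation and an element-wise linear rescaling, which is the sense in which a higher MCC signals better recovery of the latents.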

Weaknesses

- Assumption 2 (“zero-dimensional ideal”) seems hard to verify.
- Lemmas 1–2 show convergence to a latent class, but no explicit error rates are provided.

Reviewer 02 · Rating 6 · Confidence 3

Strengths

1. Most identifiability results in CRL assume discrete‑time latents and invertible mixing. Treating continuous‑time point processes with generic mixing is novel and practically relevant.
2. Although I did not look into the proof details, the theorems are intuitive and clear from a geometric viewpoint, which treats identifiability as zero‑dimensionality of the solution set of the system.
3. The model components are well related to the theorem.

Weaknesses

Major Concerns
1. Only simulation results are presented. Considering the broad motivation of the work, including a real dataset with event sequences from a continuous process would greatly strengthen the paper.
2. Adding a symbol summary box would enhance readability, as the paper currently contains a large number of symbols such as \bigstar and \otimes.
3. Assumptions A2.2 and A3.2 are not clear to me. Under what circumstances do these assumptions hold? It will be better to have some il

Reviewer 03 · Rating 6 · Confidence 3

Strengths

1. The paper expands the scope of research on causal representation learning, extending it from traditional i.i.d. and discrete-time scenarios to continuous-time latent stochastic point processes.
2. The authors provide a theoretical analysis showing that the underlying causal structure and variables can still be uniquely identified under nonlinear mixing functions.

Weaknesses

1. Although the theoretical framework relies on high-order cumulant tensor decomposition to achieve identifiability of causal representations, the practical solution shifts to a VAE-based generative modeling approach. This choice may lead to a disconnect between theory and practice, especially when the performance of the VAE depends on specific model assumptions or data distributions, potentially failing to fully reflect the advantages of the theoretical framework.
2. Theorem 2 relies on first-o

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Bayesian Modeling and Causal Inference · Gaussian Processes and Bayesian Inference