Test-Time Learning of Causal Structure from Interventional Data

Wei Chen; Rui Ding; Bojun Huang; Yang Zhang; Qiang Fu; Yuxuan Liang; Han Shi; Dongmei Zhang

arXiv:2602.19131·cs.LG·February 24, 2026

Open Access · 3 Reviews

TL;DR

TICL is a new method that improves causal discovery from interventional data by generating instance-specific training data at test time and ensuring theoretical identifiability, leading to better generalization.

Contribution

We introduce TICL, combining test-time training with joint causal inference, to enhance causal discovery and intervention target detection in interventional data.

Findings

01

TICL outperforms existing methods on bnlearn benchmarks.

02

It effectively detects intervention targets.

03

It ensures theoretical identifiability of causal structures.

Abstract

Supervised causal learning has shown promise in causal discovery, yet it often struggles with generalization across diverse interventional settings, particularly when intervention targets are unknown. To address this, we propose TICL (Test-time Interventional Causal Learning), a novel method that synergizes Test-Time Training with Joint Causal Inference. Specifically, we design a self-augmentation strategy to generate instance-specific training data at test time, effectively avoiding distribution shifts. Furthermore, by integrating joint causal inference, we develop a PC-inspired two-phase supervised learning scheme, which effectively leverages self-augmented training data while ensuring theoretical identifiability. Extensive experiments on bnlearn benchmarks demonstrate TICL's superiority in multiple aspects of causal discovery and intervention target detection.
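As a rough illustration of the Joint Causal Inference idea named in the abstract, the sketch below pools samples from several regimes into one table and appends binary context indicators, so that a discovery method can look for unknown intervention targets as dependencies between context columns and system variables. It is a minimal sketch and not TICL's implementation: the function name `pool_with_context`, the toy arrays, and the convention that the first dataset is the observational regime are all assumptions made here for illustration.

```python
import numpy as np


def pool_with_context(datasets):
    """Pool per-regime samples and append one binary context column per
    non-observational regime (hypothetical helper, not from the paper).

    datasets: list of integer arrays of shape (n_k, d); datasets[0] is
    assumed to be the observational regime.
    """
    n_regimes = len(datasets)
    pooled_rows = []
    for k, data in enumerate(datasets):
        # Context block: all zeros for the observational regime,
        # a single column of ones for interventional regime k.
        context = np.zeros((data.shape[0], n_regimes - 1), dtype=data.dtype)
        if k > 0:
            context[:, k - 1] = 1
        pooled_rows.append(np.hstack([data, context]))
    return np.vstack(pooled_rows)


# Toy discrete data: 3 observational samples and 2 samples from one
# interventional regime, each over 2 system variables.
observational = np.array([[0, 1], [1, 1], [0, 0]])
interventional = np.array([[1, 0], [1, 1]])
pooled = pool_with_context([observational, interventional])
print(pooled.shape)
```

In the pooled table the last column marks which rows came from the interventional regime; the system-variable columns are left unchanged.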

Peer Reviews

Decision·Submitted to ICLR 2025

Reviewer 01 · Rating 8 · Confidence 3

Strengths

I think this is a good paper. The authors take two existing techniques (JCI and SCL), describe both of them well, and provide an interesting combination that both feels intuitive and clever. The writing is, for the most part, clear throughout, with consistent terminology and notation. The experimental results, while only semi-synthetic (due to the lack of real-world interventions), are thorough and compelling, comparing against a solid range of competitive methods in multiple categories.

Weaknesses

I appreciate the attempt at a "When", "What", "How" framing in the introduction, but the framing of "We create training data after accessing the test data" is strange and over-complicates what you're actually doing. If I understand correctly, your approach is a two-stage process: the first stage uses the input data $D$ to generate augmented semi-synthetic data, and the second stage uses that data to train a model to classify $D$. I think part of the confusion is the use of the term "test data" here, sinc

Reviewer 02 · Rating 3 · Confidence 3

Strengths

The authors address a very interesting and important problem. The experiments are extensive, and the necessary background to understand the paper is provided.

Weaknesses

## Minor Weaknesses:

* The introduction discusses many contributions, which might be a little distracting.
* Figure 2's caption should mention what the numbers in the curly brackets refer to.
* $X_2 \to X_1 \leftarrow \{1\}$. It is unclear why $\{1\}$ is considered a node here.
* Assumptions such as Markovian, faithfulness, sufficiency, etc. should be defined and explained.
* I-CPDAG is not discussed in detail, although it is used many times.
* JCI considers data samples under interventions

Reviewer 03 · Rating 6 · Confidence 3

Strengths

The main contribution of the authors is their proposal of a novel JCI+SCL paradigm for causal discovery under unknown interventions from discrete data. Their TICL approach significantly advances the state of the art in discovering causal relations and interventional targets, achieving an average F1-score improvement across diverse real-world datasets. The authors introduce self-augmentation within SCL, enhancing training instances from the test data, and implement the IS-MCMC algorithm, which ef

Weaknesses

Although the paper is a well-contributed effort on empirical and experimental sides, showing significant advancements over the state of the art, I still believe the paper can improve with some theoretical justification on key steps involved in the proposed approach. For instance, the claim that the algorithm IS-MCMC samples $G$ from the posterior estimation $P(G|D)$ and that the data $D_i$ is compatible with $G_i$ is made. However, the authors do not justify how the stationary distribution of th

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Child and Animal Learning Development · Explainable Artificial Intelligence (XAI)