Stochastic Neural Networks for Causal Inference with Missing Confounders

Yaxin Fang; Faming Liang

arXiv:2603.01230·stat.CO·March 3, 2026

Open Access 3 Reviews

TL;DR

This paper introduces CI-StoNet, a stochastic neural network approach for causal inference with unmeasured confounders, providing model identification guarantees and effective estimation in complex observational data scenarios.

Contribution

It proposes a novel neural network-based method with theoretical guarantees for identifying and estimating causal effects despite unmeasured confounding.

Findings

01 · Accurate causal effect estimation on simulated data

02 · Effective handling of multiple causes and proxy variables

03 · Framework extends to complex causal structures

Abstract

Unmeasured confounding is a fundamental obstacle to causal inference from observational data. Latent-variable methods address this challenge by imputing unobserved confounders, yet many lack explicit model-based identification guarantees and are difficult to extend to richer causal structures. We propose Confounder Imputation with Stochastic Neural Networks (CI-StoNet), which parameterizes the conditional structure of a causal directed acyclic graph using a stochastic neural network and imputes latent confounders via adaptive stochastic-gradient Hamiltonian Monte Carlo. Under SUTVA and overlap, and assuming that the structural components of the data-generating process are well approximated by a capacity-controlled sparse deep neural network class, we establish model identification and consistent estimation of the mean potential outcome under a fixed intervention within this class.…
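The imputation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps the stochastic neural network for a hypothetical linear-Gaussian toy model ($Z \sim N(0,1)$, $A \mid Z \sim N(\alpha Z, 1)$, $Y \mid A, Z \sim N(\beta A + \gamma Z, 1)$), holds the parameters fixed instead of adapting them, and uses full gradients, so the sampler reduces to a simple underdamped Langevin update in the SGHMC form. The function names, step size `eta`, and `friction` value are all assumptions chosen for illustration. Pooling the imputed draws across units approximates the marginal of $Z$, which gives a Monte Carlo estimate of the mean potential outcome $E[Y(a)] = E_Z\{E[Y \mid A = a, Z]\}$.

```python
import numpy as np


def grad_neg_log_post(z, a, y, alpha, beta, gamma):
    """Gradient in z of U(z) = -log p(z | a, y) for the toy model:
    Z ~ N(0, 1), A | Z ~ N(alpha * Z, 1), Y | A, Z ~ N(beta * A + gamma * Z, 1)."""
    return z - alpha * (a - alpha * z) - gamma * (y - beta * a - gamma * z)


def sghmc_impute(a, y, params, n_steps=2000, burn_in=500, thin=10,
                 eta=0.01, friction=0.1, seed=0):
    """Impute one latent confounder per unit with an SGHMC-style update.

    Returns an array of shape (n_draws, n_units) of thinned post-burn-in draws.
    Parameters are held fixed here (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    z = np.zeros_like(a, dtype=float)   # latent confounders, one per unit
    v = np.zeros_like(a, dtype=float)   # momentum variables
    noise_sd = np.sqrt(2.0 * friction * eta)
    draws = []
    for t in range(n_steps):
        g = grad_neg_log_post(z, a, y, *params)
        # Momentum update: friction, gradient step, and injected noise.
        v = (1.0 - friction) * v - eta * g + noise_sd * rng.standard_normal(a.shape)
        z = z + v
        if t >= burn_in and (t - burn_in) % thin == 0:
            draws.append(z.copy())
    return np.asarray(draws)


def mean_potential_outcome(z_draws, a_fixed, beta, gamma):
    """Monte Carlo estimate of E[Y(a)]: average the outcome regression
    E[Y | A = a, Z] = beta * a + gamma * Z over all imputed draws and units."""
    return float(np.mean(beta * a_fixed + gamma * z_draws))


if __name__ == "__main__":
    # Simulate data from the toy model with hypothetical parameter values.
    rng = np.random.default_rng(1)
    n, alpha, beta, gamma = 1000, 0.8, 2.0, 1.0
    z_true = rng.standard_normal(n)
    a = alpha * z_true + rng.standard_normal(n)
    y = beta * a + gamma * z_true + rng.standard_normal(n)

    z_draws = sghmc_impute(a, y, (alpha, beta, gamma))
    print("estimated E[Y(1)]:", mean_potential_outcome(z_draws, 1.0, beta, gamma))
```

In this toy model the true value of $E[Y(1)]$ is $\beta$, because $E[Z] = 0$, so the printed estimate can be checked against the simulation parameters. The method described in the abstract would replace the fixed linear regressions with network modules and alternate imputation with parameter updates.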

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 4 · Confidence 3

Strengths

- The paper offers a strong conceptual framing for missing confounders by treating them as latent states within a StoNet and jointly learning their distribution with treatment and outcome modules.
- The proposed method reaches state-of-the-art ATE accuracy on ACIC-2019 and performs strongly on the Twins dataset; performance remains comparatively stable even when a key confounder is removed, indicating robustness of the latent-imputation mechanism.

Weaknesses

1/ Your inference of $Z$ conditions on each datum’s $(A_i,Y_i)$. When both variables are binary while $Z$ is multi-dimensional, recovering $Z$ from $(A,Y)$ alone is under-determined; the latent confounders are generally non-identifiable without extra structure or additional observables. Could you formalize when $(A,Y)$ contain enough information about $Z$?
2/ You state that causal effects remain identifiable “under mild conditions” even if $Z$ is only identifiable up to loss-invariant tr

Reviewer 02 · Rating 4 · Confidence 4

Strengths

Overall the authors present the idea in a reasonable manner, with reasonable improvement on the performance.

Weaknesses

Assuming the existence of an underlying stochastic neural network as an inductive bias can create a chicken-and-egg problem for identifiability. Meanwhile, the theoretical justifications mainly concern consistency rather than convergence rates under standard assumptions.

Reviewer 03 · Rating 4 · Confidence 3

Strengths

1. The paper tackles an important problem (i.e., missing confounders) in causal inference and proposes a new neural network-based method (CI-StoNet) that overcomes some limitations of prior latent variable approaches, such as limited applicability to nonlinear models and consistency issues.
2. The authors provide theoretical support for the convergence and consistency of the proposed method.
3. Experiments on both simulated and benchmark datasets are conducted to evaluate the performance of the

Weaknesses

1. As noted in the limitations, the method assumes that the underlying causal structure (DAG) is correctly specified, which may be difficult in real applications. Is it possible to provide some experimental results when the DAG is misspecified?
2. The approach involves training deep neural networks with adaptive MCMC, which can be computationally intensive. Complexity analysis or comparison with baselines should be provided.
3. The two types of baselines are inconsistently used in Table 1, S1 an

Taxonomy

Topics: Advanced Causal Inference Techniques · Bayesian Modeling and Causal Inference · Explainable Artificial Intelligence (XAI)