A Fast Kernel-based Conditional Independence test with Application to Causal Discovery

Oliver Schacht; Biwei Huang

arXiv:2505.11085 · cs.LG · December 5, 2025

Open Access · 3 Reviews

TL;DR

FastKCI is a scalable, parallel kernel-based conditional independence test that significantly reduces computation time while maintaining statistical power, enabling causal discovery in large datasets.

Contribution

We introduce FastKCI, a novel parallelizable method that accelerates kernel-based conditional independence testing using dataset partitioning and importance sampling.

Findings

01

FastKCI achieves substantial speedups over traditional KCI tests.

02

It maintains comparable statistical power to the original KCI.

03

Validated on synthetic and real-world datasets.

Abstract

Kernel-based conditional independence (KCI) testing is a powerful nonparametric method commonly employed in causal discovery tasks. Despite its flexibility and statistical reliability, cubic computational complexity limits its application to large datasets. To address this computational bottleneck, we propose FastKCI, a scalable and parallelizable kernel-based conditional independence test that utilizes a mixture-of-experts approach inspired by embarrassingly parallel inference techniques for Gaussian processes. By partitioning the dataset based on a Gaussian mixture model over the conditioning variables, FastKCI conducts local KCI tests in parallel, aggregating the results using an importance-weighted sampling scheme. Experiments on synthetic datasets and benchmarks on real-world production data validate that FastKCI maintains the statistical power of the original KCI test…
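The partition, local test, and aggregation structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and every function name in it is hypothetical. It relies on three simplifying stand-ins: sorting on a scalar conditioning variable replaces the Gaussian mixture model, a linear partial-correlation test replaces the local KCI test, and a weighted Stouffer combination replaces the importance-weighted sampling scheme, whose details are not given in this excerpt.

```python
import math
import random


def partition_by_z(z, num_parts):
    """Stand-in for the GMM partition: sort on scalar z, cut into up to num_parts blocks."""
    order = sorted(range(len(z)), key=lambda i: z[i])
    size = math.ceil(len(z) / num_parts)
    return [order[k:k + size] for k in range(0, len(order), size)]


def residuals(y, z):
    """Residuals of the least-squares line y ~ a + b*z."""
    n = len(y)
    my, mz = sum(y) / n, sum(z) / n
    szz = sum((v - mz) ** 2 for v in z)
    b = sum((u - my) * (v - mz) for u, v in zip(y, z)) / szz
    return [(u - my) - b * (v - mz) for u, v in zip(y, z)]


def local_z_score(x, y, z):
    """Stand-in for a local KCI test: Fisher z-score of the partial correlation.

    Assumes more than 4 samples in the block (one conditioning variable).
    """
    rx, ry = residuals(x, z), residuals(y, z)
    sxy = sum(a * b for a, b in zip(rx, ry))
    sxx = sum(a * a for a in rx)
    syy = sum(b * b for b in ry)
    r = max(min(sxy / math.sqrt(sxx * syy), 0.999999), -0.999999)
    return math.atanh(r) * math.sqrt(len(x) - 4)


def partitioned_ci_test(x, y, z, num_parts=4):
    """Run the local test on each block and combine the scores into one p-value."""
    scores, weights = [], []
    for idx in partition_by_z(z, num_parts):
        # Blocks are independent of each other, so this loop could run in parallel.
        scores.append(local_z_score([x[i] for i in idx],
                                    [y[i] for i in idx],
                                    [z[i] for i in idx]))
        weights.append(math.sqrt(len(idx)))
    # Weighted Stouffer combination (placeholder for the paper's aggregation step).
    combined = sum(w * s for w, s in zip(weights, scores))
    combined /= math.sqrt(sum(w * w for w in weights))
    return math.erfc(abs(combined) / math.sqrt(2))  # two-sided normal p-value


rng = random.Random(0)
n = 400
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + 0.5 * rng.gauss(0, 1) for zi in z]
y_dep = [xi + 0.5 * rng.gauss(0, 1) for xi in x]   # depends on x even given z
y_ind = [zi + 0.5 * rng.gauss(0, 1) for zi in z]   # depends on x only through z

p_dep = partitioned_ci_test(x, y_dep, z)
p_ind = partitioned_ci_test(x, y_ind, z)
print(p_dep, p_ind)
```

The speedup comes from the per-block cost. If a local test on m samples costs on the order of m^3 and the n samples are split into V blocks of equal size (an assumption made here for the arithmetic), the total cost is V(n/V)^3 = n^3/V^2, which is the 1/V^2 factor mentioned by Reviewer 02. Running the blocks in parallel reduces wall-clock time further.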

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01 · Rating 6 · Confidence 3

Strengths

- The paper presents a strong motivation and a practical approach for improving the efficiency of conditional independence tests, and consequently, constraint-based causal discovery.
- The paper is well-written and clearly presented.

Weaknesses

- Missing relevant references. Please see questions 1 and 2 in the Questions section.
- The paper provides a limited evaluation of generality: real-world applications are missing, and the synthetic-data settings are limited. Please see questions 5, 6, and 7 in the Questions section.

Reviewer 02 · Rating 6 · Confidence 3

Strengths

S1 The considered problem is interesting and relevant.
S2 The paper is technically sound and overall well-written.
S3 Reducing complexity by a factor of 1/V^2 is impressive.
S4 The evaluation and comparison against baselines are convincing.

Weaknesses

W1 The scalability regarding the number of involved variables remains somewhat unclear.
W2 Limitations of the paper’s methods could be better discussed.

Reviewer 03 · Rating 4 · Confidence 4

Strengths

FastKCI substantially reduces computational time in most cases without significantly compromising the accuracy of the CITs.

Weaknesses

1. The paper merely combines embarrassingly parallel inference with KCI in a straightforward manner, leading to limited novelty.
2. Although FastKCI achieves faster computation, the experimental results show that the accuracy of the CITs decreases in certain cases. Moreover, the paper only provides a coarse theoretical analysis for the asymptotic behavior of FastKCI as $n \to \infty$, which fails to explain the observed loss in accuracy.
3. The experiments compare FastKCI only with other KCI

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Gaussian Processes and Bayesian Inference · Bayesian Methods and Mixture Models

Methods: Causal inference