A Distance Covariance-based Kernel for Nonlinear Causal Clustering in Heterogeneous Populations
Alex Markham, Richeek Das, Moritz Grosse-Wentrup

TL;DR
This paper introduces a distance covariance-based kernel that measures the similarity between nonlinear causal structures, enabling clustering of heterogeneous populations and causal inference within the resulting subpopulations.
Contribution
The paper presents a new kernel whose feature map is a statistically consistent estimator of nonlinear independence structure and whose kernel space is isometric to the space of causal ancestral graphs, advancing causal clustering methods.
Findings
The kernel's feature map is a consistent estimator of nonlinear independence structure
The kernel space is isometric to the space of causal ancestral graphs
Demonstrated effectiveness on synthetic and real gene expression data
Abstract
We consider the problem of causal structure learning in the setting of heterogeneous populations, i.e., populations in which a single causal structure does not adequately represent all population members, as is common in biological and social sciences. To this end, we introduce a distance covariance-based kernel designed specifically to measure the similarity between the underlying nonlinear causal structures of different samples. Indeed, we prove that the corresponding feature map is a statistically consistent estimator of nonlinear independence structure, rendering the kernel itself a statistical test for the hypothesis that sets of samples come from different generating causal structures. Even stronger, we prove that the kernel space is isometric to the space of causal ancestral graphs, so that distance between samples in the kernel space is guaranteed to correspond to distance…
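To make the central ingredient concrete, here is a minimal sketch of sample distance covariance and distance correlation, which detect nonlinear dependence that Pearson correlation misses. The functions `dependence_features` and `toy_kernel` are hypothetical illustrations, assuming a feature vector of pairwise distance correlations and a plain inner product between such vectors; they are not the paper's exact feature map or kernel.

```python
import numpy as np


def _double_centered_distances(x):
    """Pairwise Euclidean distance matrix of x, double-centered."""
    x = np.asarray(x, dtype=float)
    x = x.reshape(len(x), -1)  # treat 1-D input as n observations of 1 variable
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    return (d - d.mean(axis=0, keepdims=True)
              - d.mean(axis=1, keepdims=True) + d.mean())


def distance_covariance_sq(x, y):
    """Squared sample distance covariance (biased V-statistic)."""
    a = _double_centered_distances(x)
    b = _double_centered_distances(y)
    return (a * b).mean()


def distance_correlation(x, y):
    """Sample distance correlation in [0, 1]; near 0 suggests independence."""
    dcov2 = distance_covariance_sq(x, y)
    denom = np.sqrt(distance_covariance_sq(x, x) * distance_covariance_sq(y, y))
    if denom == 0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def dependence_features(data):
    """Hypothetical feature vector: distance correlation of every column pair.

    `data` has shape (n_observations, n_variables). This is an illustrative
    stand-in, not the feature map defined in the paper.
    """
    data = np.asarray(data, dtype=float)
    p = data.shape[1]
    return np.array([distance_correlation(data[:, i], data[:, j])
                     for i in range(p) for j in range(i + 1, p)])


def toy_kernel(data_a, data_b):
    """Hypothetical kernel: inner product of the two dependence feature vectors."""
    return float(np.dot(dependence_features(data_a), dependence_features(data_b)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=300)
    y = x ** 2                 # nonlinear dependence, near-zero Pearson correlation
    z = rng.normal(size=300)   # independent of x
    print("dCor(x, x^2):", distance_correlation(x, y))
    print("dCor(x, z):  ", distance_correlation(x, z))
```

Under these assumptions, two data sets that share the same pattern of dependent variable pairs produce similar feature vectors and hence a larger kernel value, which is the intuition behind clustering samples by their underlying structure.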
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Gene expression and cancer classification · Bioinformatics and Genomic Networks
