Measuring Differences between Conditional Distributions using Kernel Embeddings
Peter Moskvichev, Siu Lun Chau, Dino Sejdinovic

TL;DR
This paper develops a unified kernel-based framework, conditional maximum mean discrepancy (CMMD), for measuring divergence between conditional distributions. It pairs theoretical results with a new estimator and applies both to statistical testing of complex dependencies.
Contribution
The paper introduces a coherent framework for kernel-based comparison of conditional distributions, including a novel doubly robust estimator and theoretical connections between the different CMMD levels.
Findings
CMMD effectively captures complex conditional dependencies.
The new estimator maintains consistency under model misspecification.
Numerical experiments validate the effectiveness of CMMD in statistical testing.
Abstract
Comparing conditional distributions is a fundamental challenge in statistics and machine learning, with applications across a wide range of domains. While proposed methods for measuring discrepancies using kernel embeddings of distributions in a reproducing kernel Hilbert space (RKHS) provide powerful non-parametric techniques, the existing literature remains fragmented and lacks a unified theoretical treatment. This paper addresses this gap by establishing a coherent framework for studying kernel-based methods to measure divergence between conditional distributions through what we refer to as conditional maximum mean discrepancy (CMMD). The CMMD consists of a family of metrics which we call levels, with three special cases each using a different type of RKHS embedding: CMMD (conditional mean operators), CMMD (conditional mean embeddings), and CMMD (joint mean embeddings).…
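To make the idea of comparing conditional distributions through conditional mean embeddings concrete, here is a minimal sketch. It is not the paper's estimator (the doubly robust estimator is not described in this summary). It assumes Gaussian kernels on both X and Y, a standard kernel ridge regression estimate of each conditional mean embedding, and hypothetical helper names (`rbf_kernel`, `cme_weights`, `cme_sq_distance`) and hyperparameters (`reg`, `x_lengthscale`, `y_lengthscale`) chosen purely for illustration.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0):
    """Gaussian kernel matrix between the rows of a (n, d) and b (m, d)."""
    sq = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * lengthscale**2))


def cme_weights(x_train, x_query, reg=1e-3, lengthscale=1.0):
    """Kernel ridge weights beta(x) = (K + n*reg*I)^{-1} k(x_train, x).

    The estimated conditional mean embedding at x is
    sum_i beta_i(x) * l(y_i, .), where l is the kernel on Y.
    Returns an array of shape (n_train, n_query).
    """
    n = x_train.shape[0]
    K = rbf_kernel(x_train, x_train, lengthscale)
    k_q = rbf_kernel(x_train, x_query, lengthscale)
    return np.linalg.solve(K + n * reg * np.eye(n), k_q)


def cme_sq_distance(x1, y1, x2, y2, x_query, reg=1e-3,
                    x_lengthscale=1.0, y_lengthscale=1.0):
    """Squared RKHS distance between two estimated conditional mean
    embeddings, evaluated at each row of x_query.

    Illustrative assumption: both samples use the same kernels and the
    same ridge regularisation.
    """
    b1 = cme_weights(x1, x_query, reg, x_lengthscale)
    b2 = cme_weights(x2, x_query, reg, x_lengthscale)
    L11 = rbf_kernel(y1, y1, y_lengthscale)
    L12 = rbf_kernel(y1, y2, y_lengthscale)
    L22 = rbf_kernel(y2, y2, y_lengthscale)
    # ||mu1(x) - mu2(x)||^2 = b1' L11 b1 - 2 b1' L12 b2 + b2' L22 b2
    return (np.sum(b1 * (L11 @ b1), axis=0)
            - 2.0 * np.sum(b1 * (L12 @ b2), axis=0)
            + np.sum(b2 * (L22 @ b2), axis=0))
```

Calling `cme_sq_distance(x1, y1, x2, y2, x_query)` with arrays of shape `(n, d)` returns one value per query point. Averaging these values over the query points gives one possible aggregate discrepancy, though that aggregation is an assumption here rather than a definition taken from the paper.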
