CCMI : Classifier based Conditional Mutual Information Estimation
Sudipto Mukherjee, Himanshu Asnani, Sreeram Kannan

TL;DR
This paper introduces classifier-based methods for estimating Conditional Mutual Information (CMI) that outperform traditional estimators, especially in high-dimensional settings, and demonstrates their effectiveness in conditional independence testing.
Contribution
The paper proposes novel CMI estimators leveraging classifiers and generative models, overcoming the curse of dimensionality and improving accuracy over existing methods.
Findings
Estimates remain accurate with increasing dimension.
Significant improvement over the kNN-based KSG estimator.
Superior performance in conditional independence testing.
Abstract
Conditional Mutual Information (CMI) is a measure of conditional dependence between random variables X and Y, given another random variable Z. It can be used to quantify conditional dependence among variables in many data-driven inference problems such as graphical models, causal learning, feature selection and time-series analysis. While k-nearest neighbor (kNN) based estimators as well as kernel-based methods have been widely used for CMI estimation, they suffer severely from the curse of dimensionality. In this paper, we leverage advances in classifiers and generative models to design methods for CMI estimation. Specifically, we introduce an estimator for KL-Divergence based on the likelihood ratio by training a classifier to distinguish the observed joint distribution from the product distribution. We then show how to construct several CMI estimators using this basic divergence…
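The classifier-based divergence idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it handles only the unconditional case (mutual information as the KL divergence between the joint and the product of marginals), it uses a small logistic regression on hand-picked quadratic features rather than the classifiers used in the paper, and it averages the classifier's log-likelihood ratio over the joint samples. The names `classifier_mi`, `fit_logistic`, and `_features` are hypothetical and exist only for this example.

```python
import numpy as np

def _features(x, y):
    # Assumption: quadratic features, chosen because the log-density ratio
    # of a bivariate Gaussian is quadratic in (x, y).
    return np.column_stack([np.ones_like(x), x, y, x * y, x**2, y**2])

def fit_logistic(F, labels, iters=25, ridge=1e-3):
    # Logistic regression fitted with Newton's method (small ridge for stability).
    w = np.zeros(F.shape[1])
    for _ in range(iters):
        logits = np.clip(F @ w, -30.0, 30.0)
        p = 1.0 / (1.0 + np.exp(-logits))
        grad = F.T @ (p - labels) + ridge * w
        hess = (F * (p * (1.0 - p))[:, None]).T @ F + ridge * np.eye(F.shape[1])
        w -= np.linalg.solve(hess, grad)
    return w

def classifier_mi(x, y, rng):
    # Class 1: observed pairs (x, y), i.e. samples from the joint distribution.
    joint = _features(x, y)
    # Class 0: x paired with shuffled y, approximating the product distribution.
    product = _features(x, rng.permutation(y))
    F = np.vstack([joint, product])
    labels = np.concatenate([np.ones(len(x)), np.zeros(len(x))])
    w = fit_logistic(F, labels)
    # With balanced classes, the classifier logit estimates
    # log p(x, y) / (p(x) p(y)); its mean over joint samples estimates the KL.
    return float(np.mean(joint @ w))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(5000)
    y = 0.6 * x + 0.8 * rng.standard_normal(5000)  # correlation 0.6
    print("estimate:", classifier_mi(x, y, rng))
    print("ground truth:", -0.5 * np.log(1 - 0.6**2))
```

A conditional estimate could then be assembled from two such quantities through the chain rule I(X;Y|Z) = I(X;Y,Z) - I(X;Z), which would require extending the sketch to vector-valued inputs and a more flexible classifier.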
Taxonomy
Topics: Neural Networks and Applications · Gaussian Processes and Bayesian Inference · Machine Learning and Data Classification
Methods: Feature Selection
