Conditional Mutual Information Estimation for Mixed Discrete and Continuous Variables with Nearest Neighbors
Octavio César Mesner, Cosma Rohilla Shalizi

TL;DR
This paper introduces a nearest-neighbor estimator of conditional mutual information that handles data mixing discrete and continuous variables, a combination common in public health, public policy, and social science.
Contribution
The paper presents a new estimator of mutual and conditional mutual information for mixed discrete and continuous data. The estimator is proven consistent and, in simulation, is more accurate than similar estimators.
Findings
The estimator is proven to be consistent.
In simulations, it is more accurate than similar estimators.
It applies to data samples that mix discrete and continuous variables.
Abstract
Fields like public health, public policy, and social science often want to quantify the degree of dependence between variables whose relationships take on unknown functional forms. Typically, in fact, researchers in these fields are attempting to evaluate causal theories, and so want to quantify dependence after conditioning on other variables that might explain, mediate or confound causal relations. One reason conditional mutual information is not more widely used for these tasks is the lack of estimators which can handle combinations of continuous and discrete random variables, common in applications. This paper develops a new method for estimating mutual and conditional mutual information for data samples containing a mix of discrete and continuous variables. We prove that this estimator is consistent and show, via simulation, that it is more accurate than similar estimators.
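The abstract does not spell out the algorithm, so the following is only a rough sketch of what a nearest-neighbor estimate of I(X; Y | Z) on mixed data can look like, not the authors' implementation. It assumes a max-norm distance, numerically coded discrete values, a digamma-based count correction in the style of earlier kNN estimators, and a special case for ties at distance zero (which repeated discrete values produce). The names `knn_cmi`, `digamma_int`, `max_norm`, and `simulate` are hypothetical and chosen for illustration.

```python
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer n: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def max_norm(a, b):
    """Chebyshev (max-norm) distance between two equal-length tuples."""
    return max(abs(u - v) for u, v in zip(a, b))


def knn_cmi(x, y, z, k=5):
    """Sketch of a kNN estimate of I(X; Y | Z) in nats (assumed form, see lead-in).

    x, y, z are equal-length lists of non-empty numeric tuples, with more
    than k samples; discrete values are assumed to be numerically coded.
    """
    n = len(x)
    total = 0.0
    for i in range(n):
        d_xyz, d_xz, d_yz, d_z = [], [], [], []
        for j in range(n):
            if j == i:
                continue
            a = max_norm(x[i], x[j])
            b = max_norm(y[i], y[j])
            c = max_norm(z[i], z[j])
            d_xyz.append(max(a, b, c))  # distance in the joint (X, Y, Z) space
            d_xz.append(max(a, c))      # distance in the (X, Z) subspace
            d_yz.append(max(b, c))      # distance in the (Y, Z) subspace
            d_z.append(c)               # distance in the Z subspace
        rho = sorted(d_xyz)[k - 1]      # distance to the k-th nearest neighbor
        # Ties at distance zero come from repeated discrete values: count them all.
        k_i = sum(d == 0 for d in d_xyz) if rho == 0 else k
        n_xz = sum(d <= rho for d in d_xz)
        n_yz = sum(d <= rho for d in d_yz)
        n_z = sum(d <= rho for d in d_z)
        total += (digamma_int(k_i) - digamma_int(n_xz)
                  - digamma_int(n_yz) + digamma_int(n_z))
    return max(0.0, total / n)  # the true quantity is non-negative


def simulate(n, dependent, seed=0):
    """Hypothetical toy data: Z is binary, X and Y are continuous."""
    rng = random.Random(seed)
    x, y, z = [], [], []
    for _ in range(n):
        zi = rng.randint(0, 1)
        xi = rng.gauss(zi, 1.0)
        # Either Y tracks X closely, or Y depends on Z only.
        yi = xi + rng.gauss(0.0, 0.1) if dependent else rng.gauss(zi, 1.0)
        x.append((xi,))
        y.append((yi,))
        z.append((float(zi),))
    return x, y, z


if __name__ == "__main__":
    for dependent in (True, False):
        x, y, z = simulate(300, dependent, seed=1)
        print(dependent, round(knn_cmi(x, y, z, k=5), 3))
```

In the toy data, X and Y both depend on the discrete Z, so they are marginally dependent in both settings; only the first setting has dependence that remains after conditioning on Z, which is the quantity the estimate targets. The brute-force neighbor search is quadratic in the sample size and is meant for readability only.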
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Gene Regulatory Network Analysis · Statistical Methods and Inference
