Conditional Mutual Information Estimation for Mixed Discrete and Continuous Variables with Nearest Neighbors

Octavio César Mesner; Cosma Rohilla Shalizi

arXiv:1912.03387 · math.ST · December 10, 2019

TL;DR

This paper introduces a new estimator for conditional mutual information that effectively handles mixed discrete and continuous variables, addressing a key gap in fields like public health and social sciences.

Contribution

The paper presents a consistent estimator for mutual and conditional mutual information on data that mix discrete and continuous variables. In simulations it is more accurate than similar estimators.

Findings

01. The estimator is proven to be consistent.

02. In simulations, it is more accurate than similar existing estimators.

03. It applies to real-world data with mixed discrete and continuous variable types.

Abstract

Fields like public health, public policy, and social science often want to quantify the degree of dependence between variables whose relationships take on unknown functional forms. Typically, in fact, researchers in these fields are attempting to evaluate causal theories, and so want to quantify dependence after conditioning on other variables that might explain, mediate or confound causal relations. One reason conditional mutual information is not more widely used for these tasks is the lack of estimators which can handle combinations of continuous and discrete random variables, common in applications. This paper develops a new method for estimating mutual and conditional mutual information for data samples containing a mix of discrete and continuous variables. We prove that this estimator is consistent and show, via simulation, that it is more accurate than similar estimators.
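To make the target quantity concrete: conditional mutual information is I(X;Y|Z) = E[log( p(x,y|z) / (p(x|z) p(y|z)) )], and it is zero exactly when X and Y are independent given Z. The sketch below is not the paper's nearest-neighbor estimator. It is a minimal plug-in estimate that assumes all three variables are discrete, replacing each probability with an empirical frequency. The function name `plugin_cmi` is hypothetical and used only for illustration. This simple approach breaks down once any variable is continuous, since every observed value is then unique, which is the gap the abstract describes.

```python
from collections import Counter
from math import log


def plugin_cmi(xs, ys, zs):
    """Plug-in estimate of I(X;Y|Z) in nats for purely discrete samples.

    Illustrative only: this is NOT the nearest-neighbor estimator from the
    paper, and it assumes every variable takes finitely many values.
    """
    n = len(xs)
    if n == 0 or not (n == len(ys) == len(zs)):
        raise ValueError("xs, ys, zs must be non-empty and of equal length")

    # Empirical counts of the joint and of the needed marginals.
    n_xyz = Counter(zip(xs, ys, zs))
    n_xz = Counter(zip(xs, zs))
    n_yz = Counter(zip(ys, zs))
    n_z = Counter(zs)

    # sum over observed cells of p(x,y,z) * log[ p(x,y,z) p(z) / (p(x,z) p(y,z)) ];
    # the factors of n cancel inside the logarithm.
    total = 0.0
    for (x, y, z), c in n_xyz.items():
        total += (c / n) * log(c * n_z[z] / (n_xz[(x, z)] * n_yz[(y, z)]))
    return total


# Y copies X and Z is constant: all dependence remains after conditioning.
dependent = plugin_cmi([0, 1, 0, 1], [0, 1, 0, 1], [0, 0, 0, 0])

# X and Y are both copies of Z: conditioning on Z removes the dependence.
explained = plugin_cmi([0, 1, 0, 1], [0, 1, 0, 1], [0, 1, 0, 1])
```

In the first call the estimate equals log 2, the entropy of a balanced binary variable. In the second it is 0, because Z fully accounts for the association between X and Y.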

Code & Models

Repositories

omesner/knncmi

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Gene Regulatory Network Analysis · Statistical Methods and Inference