Model-Augmented Estimation of Conditional Mutual Information for Feature Selection

Alan Yang; AmirEmad Ghassami; Maxim Raginsky; Negar Kiyavash; and Elyse Rosenbaum

arXiv:1911.04628·cs.LG·June 23, 2020

TL;DR

This paper introduces a two-step neural network-based method for efficient Markov blanket feature selection by improving conditional independence testing in high-dimensional data.

Contribution

It proposes a novel approach combining neural network mappings with $k$-NN CI testing to enhance feature selection in high-dimensional settings.

Findings

1. Improved CI testing performance on synthetic data.

2. Effective feature selection demonstrated on real datasets.

Abstract

Markov blanket feature selection, while theoretically optimal, is generally challenging to implement. This is due to the shortcomings of existing approaches to conditional independence (CI) testing, which tend to struggle either with the curse of dimensionality or computational complexity. We propose a novel two-step approach which facilitates Markov blanket feature selection in high dimensions. First, neural networks are used to map features to low-dimensional representations. In the second step, CI testing is performed by applying the $k$-NN conditional mutual information estimator to the learned feature maps. The mappings are designed to ensure that mapped samples both preserve information and share similar information about the target variable if and only if they are close in Euclidean distance. We show that these properties boost the performance of the $k$-NN estimator in the…
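
To make the second step concrete, the following is a minimal sketch of a $k$-NN conditional mutual information estimate $\hat{I}(X;Y\mid Z)$ in a commonly used KSG-style form: $\psi(k) + \frac{1}{n}\sum_i \left[\psi(n_z(i)+1) - \psi(n_{xz}(i)+1) - \psi(n_{yz}(i)+1)\right]$, where the counts are taken within the max-norm distance to the $k$-th nearest neighbor in the joint space. Whether this matches the exact estimator used in the paper is an assumption, and the neural feature maps from the first step are not shown: the inputs here stand in for representations that are already low-dimensional. All function names (`knn_cmi`, `count_within`, `digamma_int`, `chebyshev`) are hypothetical, and the brute-force neighbor search is chosen for readability, not speed.

```python
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer n: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def chebyshev(a, b):
    """Max-norm distance between two equal-length tuples of floats."""
    return max(abs(u - v) for u, v in zip(a, b))


def count_within(points, i, eps):
    """Number of points other than i that lie strictly closer than eps to points[i]."""
    return sum(
        1 for j, p in enumerate(points)
        if j != i and chebyshev(points[i], p) < eps
    )


def knn_cmi(xs, ys, zs, k=5):
    """KSG-style k-NN estimate of I(X; Y | Z) in nats (illustrative sketch).

    xs, ys, zs are equal-length lists; each sample is a tuple of floats.
    """
    n = len(xs)
    joint = [x + y + z for x, y, z in zip(xs, ys, zs)]  # tuple concatenation
    xz = [x + z for x, z in zip(xs, zs)]
    yz = [y + z for y, z in zip(ys, zs)]
    total = 0.0
    for i in range(n):
        # Distance from sample i to its k-th nearest neighbor in the joint space.
        dists = sorted(chebyshev(joint[i], q) for j, q in enumerate(joint) if j != i)
        eps = dists[k - 1]
        # Neighbor counts within eps in each marginal subspace.
        total += (
            digamma_int(count_within(zs, i, eps) + 1)
            - digamma_int(count_within(xz, i, eps) + 1)
            - digamma_int(count_within(yz, i, eps) + 1)
        )
    return digamma_int(k) + total / n


# Toy data (an assumption for illustration, not from the paper).
rng = random.Random(0)
n = 300
zs = [(rng.gauss(0, 1),) for _ in range(n)]
xs = [(rng.gauss(0, 1),) for _ in range(n)]
ys_dep = [(x[0] + 0.05 * rng.gauss(0, 1),) for x in xs]    # Y depends on X, not on Z
xs_ci = [(z[0] + 0.5 * rng.gauss(0, 1),) for z in zs]      # X and Y are both noisy
ys_ci = [(z[0] + 0.5 * rng.gauss(0, 1),) for z in zs]      # copies of Z only

print("dependent given Z:  ", knn_cmi(xs, ys_dep, zs))
print("independent given Z:", knn_cmi(xs_ci, ys_ci, zs))
```

In a CI test built on such an estimate, a value close to zero is read as evidence that $X$ and $Y$ are conditionally independent given $Z$; the threshold or permutation scheme used to make that decision is not specified here.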

Code & Models

Repositories

syanga/model-augmented-mutual-information
pytorch

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Statistical Methods and Inference · Domain Adaptation and Few-Shot Learning

Methods: Feature Selection