fastHDMI: Fast Mutual Information Estimation for High-Dimensional Data
Kai Yang, Masoud Asgharian, Nikhil Bhagwat, Jean-Baptiste Poline, Celia M.T. Greenwood

TL;DR
fastHDMI is a Python package that enables efficient mutual information estimation for variable selection in high-dimensional neuroimaging data, improving analysis of complex data structures.
Contribution
This work introduces fastHDMI, a novel Python toolkit applying three mutual information estimation methods specifically for neuroimaging variable selection.
Findings
The FFTKDE-based method outperforms the others for nonlinear continuous outcomes
Binning-based methods excel for binary outcomes with nonlinear probability preimages
fastHDMI demonstrates computational efficiency and practical utility in neuroimaging analysis
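To make the idea of binning-based mutual information screening concrete, here is a minimal sketch in plain NumPy. It is an illustration only and does not use or reproduce the fastHDMI API: the function names `binned_mutual_information` and `screen_features`, the choice of 10 equal-width bins, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def binned_mutual_information(x, y, bins=10):
    """Plug-in mutual information estimate (in nats) from a 2-D histogram.

    Hypothetical helper for illustration; not part of fastHDMI.
    """
    joint_counts, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint_counts / joint_counts.sum()   # empirical joint distribution
    p_x = p_xy.sum(axis=1, keepdims=True)      # marginal of x, shape (bins, 1)
    p_y = p_xy.sum(axis=0, keepdims=True)      # marginal of y, shape (1, bins)
    product = p_x @ p_y                        # joint under independence
    nonzero = p_xy > 0                         # skip empty cells (0 * log 0 = 0)
    return float(np.sum(p_xy[nonzero] * np.log(p_xy[nonzero] / product[nonzero])))


def screen_features(X, y, top_k=5, bins=10):
    """Rank the columns of X by estimated mutual information with y."""
    scores = np.array(
        [binned_mutual_information(X[:, j], y, bins) for j in range(X.shape[1])]
    )
    ranking = np.argsort(scores)[::-1][:top_k]  # indices of the highest scores
    return ranking, scores


# Synthetic data (an assumption for the example): only feature 3 matters,
# and its effect is purely nonlinear, so a linear correlation would miss it.
rng = np.random.default_rng(0)
n, p = 2000, 50
X = rng.normal(size=(n, p))
y = X[:, 3] ** 2 + 0.1 * rng.normal(size=n)

top, scores = screen_features(X, y, top_k=3)
print("Top-ranked features:", top)
```

In this toy setting the outcome depends on feature 3 only through its square, which is the kind of nonlinear association that motivates mutual information over correlation-based screening. A kernel-density-based estimator, such as the FFTKDE approach named above, would replace the histogram step with a smoothed density estimate.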
Abstract
In this paper, we introduce fastHDMI, a Python package designed for efficient variable screening in high-dimensional datasets, particularly neuroimaging data. This work pioneers the application of three mutual information estimation methods for neuroimaging variable selection, a novel approach implemented via fastHDMI. These advancements enhance our ability to analyze the complex structures of neuroimaging datasets, providing improved tools for variable selection in high-dimensional spaces. Using the preprocessed ABIDE dataset, we evaluate the performance of these methods through extensive simulations. The tests cover a range of conditions, including linear and nonlinear associations, as well as continuous and binary outcomes. Our results highlight the superiority of the FFTKDE-based mutual information estimation for feature screening in continuous nonlinear outcomes, while…
Taxonomy
Topics: CCD and CMOS Imaging Sensors · Advanced Data Compression Techniques · Parallel Computing and Optimization Techniques
