Causality-based Feature Selection: Methods and Evaluations
Kui Yu, Xianjie Guo, Lin Liu, Jiuyong Li, Hao Wang, Zhaolong Ling, Xindong Wu

TL;DR
This paper reviews recent causality-based feature selection methods, introduces an open-source package for their implementation, and conducts extensive experiments to compare their performance on synthetic and real data.
Contribution
It provides the first comprehensive review of causality-based feature selection and releases an open-source toolkit for researchers to develop and evaluate new algorithms.
Findings
Causal relationships can improve model interpretability and robustness.
Extensive experiments compare various causality-based feature selection algorithms.
Future challenges in causality-based feature selection are discussed.
Abstract
Feature selection is a crucial preprocessing step in data analytics and machine learning. Classical feature selection algorithms select features based on the correlations between predictive features and the class variable and do not attempt to capture causal relationships between them. It has been shown that knowledge of the causal relationships between features and the class variable has potential benefits for building interpretable and robust prediction models, since causal relationships reflect the underlying mechanism of a system. Consequently, causality-based feature selection has attracted increasing attention and many algorithms have been proposed. In this paper, we present a comprehensive review of recent advances in causality-based feature selection. To facilitate the development of new algorithms in the research area and make it easy for the comparisons between…
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Multi-Criteria Decision Making · Machine Learning and Data Classification
Methods: Feature Selection
