Using Markov Boundary Approach for Interpretable and Generalizable Feature Selection
Anwesha Bhattacharyya, Yaqun Wang, Joel Vaughan, and Vijayan N. Nair

TL;DR
This paper introduces a multi-group forward-backward selection method for identifying Markov boundaries in complex data, enhancing feature selection for more interpretable and generalizable machine learning models.
Contribution
It proposes a novel strategy to accurately identify Markov boundaries in non-linear and mixed data types, addressing limitations of existing methods.
Findings
Effective in simulated datasets
Demonstrates improved feature selection accuracy
Applicable to real-world datasets
Abstract
The perceived advantage of machine learning (ML) models is that they are flexible and can incorporate a large number of features. However, many of these features are typically correlated or dependent, and incorporating all of them can hinder model stability and generalizability. It is therefore desirable to do some form of feature screening and incorporate only the relevant features. The best approaches should involve subject-matter knowledge and information on causal relationships. This paper deals with the Markov boundary (MB) approach, which is related to causal discovery: it uses directed acyclic graphs to represent potential relationships and statistical tests to determine the connections. An MB is the minimal set of features such that, given the boundary, no other potential predictor affects the target, while predictive accuracy remains maximal. Identifying the Markov…
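To make the forward-backward idea concrete, here is a minimal sketch of a generic forward-backward search for a Markov boundary. It is not the paper's multi-group method: it assumes linear-Gaussian data and swaps in a Fisher z-test on partial correlations as the conditional independence test, whereas the paper targets non-linear and mixed data types. The names `partial_corr_pvalue`, `forward_backward_mb`, and `make_toy_data`, the threshold `alpha`, and the toy data are all hypothetical and chosen only for illustration.

```python
import math
import numpy as np


def partial_corr_pvalue(x, y, Z):
    """Two-sided p-value for H0: x is independent of y given the columns of Z.

    Assumption: linear-Gaussian relationships (Fisher z-test on the
    partial correlation). This is a stand-in test, not the paper's.
    """
    n, k = len(x), Z.shape[1]
    if k > 0:
        # Regress out Z (plus an intercept) from both x and y.
        A = np.column_stack([np.ones(n), Z])
        x = x - A @ np.linalg.lstsq(A, x, rcond=None)[0]
        y = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
    r = float(np.clip(np.corrcoef(x, y)[0, 1], -0.999999, 0.999999))
    z = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - k - 3)
    return math.erfc(abs(z) / math.sqrt(2))  # 2 * (1 - Phi(|z|))


def forward_backward_mb(X, y, alpha=1e-3):
    """Greedy forward-backward estimate of the Markov boundary of y."""
    p = X.shape[1]
    selected = []

    def cols(idx):
        return X[:, np.asarray(idx, dtype=int)]

    # Forward phase: add the most significant remaining feature, if any.
    while len(selected) < p:
        candidates = [j for j in range(p) if j not in selected]
        pvals = {j: partial_corr_pvalue(X[:, j], y, cols(selected))
                 for j in candidates}
        best = min(pvals, key=pvals.get)
        if pvals[best] >= alpha:
            break
        selected.append(best)

    # Backward phase: drop features that are independent of y
    # given the other selected features.
    for j in list(selected):
        rest = [m for m in selected if m != j]
        if partial_corr_pvalue(X[:, j], y, cols(rest)) >= alpha:
            selected.remove(j)
    return sorted(selected)


def make_toy_data(seed=0, n=500):
    """Hypothetical toy data: y depends on x0 and x1 only."""
    rng = np.random.default_rng(seed)
    x0 = rng.normal(size=n)
    x1 = rng.normal(size=n)
    x2 = x0 + 0.5 * rng.normal(size=n)  # correlated with x0, not a cause of y
    x3 = rng.normal(size=n)             # pure noise
    y = 2.0 * x0 - 1.5 * x1 + rng.normal(size=n)
    return np.column_stack([x0, x1, x2, x3]), y


if __name__ == "__main__":
    X, y = make_toy_data()
    print(forward_backward_mb(X, y))
```

In the toy data, feature 2 is marginally correlated with the target but carries no extra information once feature 0 is known, so a correct search should keep features 0 and 1 and is expected to discard feature 2. Because the procedure relies on statistical tests, an occasional false inclusion at level `alpha` is possible.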
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Neural Networks and Applications · Fault Detection and Control Systems
