A Rigorous Information-Theoretic Definition of Redundancy and Relevancy in Feature Selection Based on (Partial) Information Decomposition
Patricia Wollstadt, Sebastian Schmitt, Michael Wibral

TL;DR
This paper introduces a rigorous, information-theoretic framework for feature selection that uses partial information decomposition (PID) to account for feature interactions such as redundancy and synergy, and proposes a practical algorithm based on conditional mutual information (CMI).
Contribution
It provides the first rigorous, PID-based definition of feature relevancy and redundancy, and develops an iterative CMI algorithm for effective feature selection.
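The iterative CMI idea can be sketched as a greedy forward search: at each step, add the candidate feature with the largest CMI about the target given the features already selected. The following is a minimal sketch, not the authors' implementation. It assumes discrete data and a plug-in (frequency-count) estimator, and the names `entropy`, `cmi`, `greedy_cmi_selection`, and the fixed `threshold` stopping rule are illustrative assumptions.

```python
from collections import Counter
from math import log2


def entropy(rows):
    """Plug-in Shannon entropy (in bits) of a sequence of hashable outcomes."""
    n = len(rows)
    return -sum(c / n * log2(c / n) for c in Counter(rows).values())


def cmi(target, feature, cond):
    """Plug-in estimate of I(target; feature | cond) in bits.

    `cond` is a list of columns (possibly empty). Uses the identity
    I(T;F|S) = H(T,S) + H(F,S) - H(T,F,S) - H(S).
    """
    s = list(zip(*cond)) if cond else [()] * len(target)
    ts = list(zip(target, s))
    fs = list(zip(feature, s))
    tfs = list(zip(target, feature, s))
    return entropy(ts) + entropy(fs) - entropy(tfs) - entropy(s)


def greedy_cmi_selection(features, target, threshold=1e-9):
    """Greedy forward selection driven by CMI.

    `features` maps a feature name to its column of values. The fixed
    `threshold` is a hypothetical stand-in for a proper stopping criterion.
    """
    selected = []
    remaining = list(features)
    while remaining:
        cond = [features[name] for name in selected]
        scores = {name: cmi(target, features[name], cond) for name in remaining}
        best = max(scores, key=scores.get)
        if scores[best] <= threshold:
            break  # no remaining feature adds information beyond `selected`
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: the target is determined by (a, b); a_copy duplicates a.
a = [0, 0, 1, 1]
b = [0, 1, 0, 1]
target = [2 * x + y for x, y in zip(a, b)]
chosen = greedy_cmi_selection({"a": a, "a_copy": list(a), "b": b}, target)
```

In this toy example, `a_copy` carries no information about the target once `a` is known, so its CMI drops to zero and it is never selected. Note that a purely greedy search of this kind can miss features that are informative only jointly (for example, the two inputs of an XOR), because each has zero mutual information with the target when considered first and alone.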
Findings
Using CMI as the selection criterion maximizes relevancy while minimizing redundancy.
The proposed algorithm outperforms unconditional mutual information methods on benchmarks.
PID estimates quantify the information individual features contribute and how features interact.
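Why conditioning helps can be seen in the two-feature case. The identities below are the standard two-source PID bookkeeping in the sense of Williams and Beer; the symbols $\mathrm{Unq}$, $\mathrm{Red}$, and $\mathrm{Syn}$ are notation chosen here for illustration and may differ from the paper's.

```latex
\begin{align}
I(T; F_1, F_2) &= \mathrm{Unq}(T; F_1) + \mathrm{Unq}(T; F_2)
                + \mathrm{Red}(T; F_1, F_2) + \mathrm{Syn}(T; F_1, F_2) \\
I(T; F_1)      &= \mathrm{Unq}(T; F_1) + \mathrm{Red}(T; F_1, F_2) \\
I(T; F_1 \mid F_2) &= \mathrm{Unq}(T; F_1) + \mathrm{Syn}(T; F_1, F_2)
\end{align}
```

The unconditional mutual information $I(T; F_1)$ counts redundant information and misses synergy, whereas the conditional mutual information $I(T; F_1 \mid F_2)$ drops the redundant term and includes the synergistic one.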
Abstract
Selecting a minimal feature set that is maximally informative about a target variable is a central task in machine learning and statistics. Information theory provides a powerful framework for formulating feature selection algorithms -- yet a rigorous, information-theoretic definition of feature relevancy, which accounts for feature interactions such as redundant and synergistic contributions, is still missing. We argue that this lack is inherent to classical information theory, which does not provide measures to decompose the information a set of variables provides about a target into unique, redundant, and synergistic contributions. Such a decomposition has been introduced only recently by the partial information decomposition (PID) framework. Using PID, we clarify why feature selection is a conceptually difficult problem when approached using information theory and provide a novel…
Taxonomy
Topics: Neural Networks and Applications · Fault Detection and Control Systems · Evolutionary Algorithms and Applications
Methods: Feature Selection
