Theory of Optimal Bayesian Feature Filtering
Ali Foroughi pour, Lori A. Dalton

TL;DR
This paper proves two theoretical properties of Optimal Bayesian Feature Filtering (OBF): it is optimal if and only if the underlying Bayesian model assumes mutually independent features, and under independent Gaussian models it is consistent under very mild conditions. Both results support its use in biomarker discovery.
Contribution
It proves that, within a general family of Bayesian models, OBF is the only filter method that is optimal under at least one model, and that OBF under independent Gaussian models remains consistent even when the data is non-Gaussian with correlated features.
Findings
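OBF is optimal if and only if the Bayesian model assumes all features are mutually independent.
OBF under independent Gaussian models is consistent under very mild conditions, including non-Gaussian data with correlated features.
The consistency result supports using OBF on non-ideal, real-world data.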
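To make the term "filtering" concrete, here is a minimal Python sketch of a generic filter: each feature is scored using only its own column and the labels, and features are kept when the score exceeds a threshold. The Welch t-statistic used as the score is a stand-in chosen for brevity; it is not the OBF statistic from the paper. The function name `filter_select`, the threshold value, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def filter_select(X, y, threshold):
    """Score every feature on its own and keep those above `threshold`.

    The score is the absolute Welch t-statistic between the two classes.
    This is a stand-in score for illustration, NOT the OBF posterior.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    a, b = X[y == 0], X[y == 1]
    # Per-column standard error of the difference in class means.
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    scores = np.abs(a.mean(axis=0) - b.mean(axis=0)) / se
    return scores, np.flatnonzero(scores > threshold)


# Synthetic data (assumed for this sketch): 5 features, 200 samples per class.
rng = np.random.default_rng(0)
n = 200
y = np.repeat([0, 1], n)
X = rng.standard_normal((2 * n, 5))
X[:, 0] += 1.5 * y  # features 0 and 1 differ in mean between the classes
X[:, 1] += 1.5 * y

scores, selected = filter_select(X, y, threshold=6.0)
print(selected)
```

The defining property of a filter is visible in the code: the score of a feature never reads any other column, so adding or removing other features cannot change it. The first finding above says this kind of per-feature decision is optimal in the Bayesian framework exactly when the model assumes mutually independent features.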
Abstract
Optimal Bayesian feature filtering (OBF) is a supervised screening method designed for biomarker discovery. In this article, we prove two major theoretical properties of OBF. First, optimal Bayesian feature selection under a general family of Bayesian models reduces to filtering if and only if the underlying Bayesian model assumes all features are mutually independent. Therefore, OBF is optimal if and only if one assumes all features are mutually independent, and OBF is the only filter method that is optimal under at least one model in the general Bayesian framework. Second, OBF under independent Gaussian models is consistent under very mild conditions, including cases where the data is non-Gaussian with correlated features. This result provides conditions where OBF is guaranteed to identify the correct feature set given enough data, and it justifies the use of OBF in non-design…
Taxonomy
Methods: Feature Selection
