Causally-Guided Diffusion for Stable Feature Selection
Arun Vignesh Malarkkan, Xinyuan Wang, Kunpeng Liu, Denghui Zhang, and Yanjie Fu

TL;DR
This paper introduces CGDFS, a novel method that leverages causal invariance and diffusion models to select stable, transferable features under distribution shifts, improving out-of-distribution robustness.
Contribution
We propose a stability-aware posterior inference framework using diffusion models for scalable, uncertainty-aware feature selection that accounts for causal invariance.
Findings
CGDFS outperforms baselines in stability and transferability.
It improves out-of-distribution prediction accuracy.
The method demonstrates robustness across classification and regression tasks.
Abstract
Feature selection is fundamental to robust data-centric AI, but most existing methods optimize predictive performance under a single data distribution. This often selects spurious features that fail under distribution shifts. Motivated by principles from causal invariance, we study feature selection from a stability perspective and introduce Causally-Guided Diffusion for Stable Feature Selection (CGDFS). In CGDFS, we formalize feature selection as approximate posterior inference over feature subsets, where the posterior mass favors subsets with low prediction error and low cross-environment variance. Our framework combines three key insights. First, we formulate feature selection as stability-aware posterior sampling. Here, causal invariance serves as a soft inductive bias rather than explicit causal discovery. Second, we train a diffusion model as a learned prior over plausible continuous selection…
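To make the idea of posterior mass that favors low prediction error and low cross-environment variance concrete, here is a minimal sketch of one possible scoring function for a feature subset. It is not the paper's objective, which the abstract does not spell out. The names `stability_energy` and `unnormalized_log_posterior`, the pooled ridge predictor, the weight `lam`, and the `temperature` are all assumptions made for illustration, and the diffusion prior is omitted entirely.

```python
import numpy as np

def stability_energy(X_envs, y_envs, mask, lam=1.0, ridge=1e-3):
    """Hypothetical energy for a binary feature mask.

    Fits one ridge regressor on the pooled environments using only the
    selected features, then returns
        mean(per-environment MSE) + lam * variance(per-environment MSE).
    Lower is better: accurate and stable across environments.
    """
    idx = np.flatnonzero(mask)
    if idx.size == 0:
        return float("inf")  # an empty subset cannot predict anything
    X = np.vstack([Xe[:, idx] for Xe in X_envs])
    y = np.concatenate(y_envs)
    # Closed-form ridge solution on the pooled data (assumed predictor).
    w = np.linalg.solve(X.T @ X + ridge * np.eye(idx.size), X.T @ y)
    errs = np.array([
        np.mean((Xe[:, idx] @ w - ye) ** 2)
        for Xe, ye in zip(X_envs, y_envs)
    ])
    return float(errs.mean() + lam * errs.var())

def unnormalized_log_posterior(X_envs, y_envs, mask, lam=1.0, temperature=1.0):
    """Assumed Gibbs-style form: log p(mask | data) = -energy / temperature + const."""
    return -stability_energy(X_envs, y_envs, mask, lam=lam) / temperature
```

Under this sketch, a subset built from a feature whose relationship to the target changes between environments receives a higher energy, and therefore less posterior mass, than a subset whose predictive relationship is the same everywhere.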
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis · Gaussian Processes and Bayesian Inference
