High-dimensional Variable Screening via Conditional Martingale Difference Divergence
Lei Fang, Qingcong Yuan, Xiangrong Yin, Chenglong Ye

TL;DR
This paper introduces a new kernel-based variable screening method using the conditional martingale difference divergence, which effectively handles ultrahigh-dimensional data with correlated predictors and heterogeneous responses, ensuring stability and efficiency.
Contribution
It proposes a novel independence measure, CMDH, and a model-free screening method that improves stability and computational efficiency in high-dimensional variable selection.
Findings
The proposed method outperforms existing screening techniques in simulations.
It maintains stability under high correlation among predictors.
It demonstrates superior performance in real-data applications.
Abstract
Variable screening has been a useful research area that deals with ultrahigh-dimensional data. When there exist both marginally and jointly dependent predictors to the response, existing methods such as conditional screening or iterative screening often suffer from instability against the selection of the conditional set or the computational burden, respectively. In this article, we propose a new independence measure, named conditional martingale difference divergence (CMDH), that can be treated as either a conditional or a marginal independence measure. Under regularity conditions, we show that the sure screening property of CMDH holds for both marginally and jointly active variables. Based on this measure, we propose a kernel-based model-free variable screening method, which is efficient, flexible, and stable against high correlation among predictors and heterogeneity of the response.…
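The abstract does not give the CMDH formula, so the sketch below does not implement it. As a rough illustration of the screening idea only, it ranks predictors by the ordinary marginal martingale difference divergence (MDD), a related conditional-mean dependence measure from the literature, and keeps the top d. The function names `mdd_sq` and `screen`, the simulated data, and the cutoff d = 5 are all assumptions made for this example, not the authors' method.

```python
import numpy as np


def mdd_sq(x, y):
    """Sample squared marginal MDD of y given a scalar predictor x.

    Uses the V-statistic form
        -(1/n^2) * sum_{k,l} (y_k - ybar)(y_l - ybar) |x_k - x_l|.
    NOTE: this is the plain marginal MDD, used as a stand-in; it is NOT
    the CMDH measure proposed in the paper.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    yc = y - y.mean()                          # centered response
    dist = np.abs(x[:, None] - x[None, :])     # pairwise |x_k - x_l|
    return -(yc @ dist @ yc) / n**2


def screen(X, y, d):
    """Return the indices of the d predictors with the largest MDD scores."""
    scores = np.array([mdd_sq(X[:, j], y) for j in range(X.shape[1])])
    order = np.argsort(scores)[::-1]           # descending by score
    return order[:d], scores


# Toy data (hypothetical): only column 0 drives the mean of y.
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))
y = 3.0 * X[:, 0] + 0.5 * rng.standard_normal(n)

selected, scores = screen(X, y, d=5)
print("selected columns:", selected)
```

A marginal ranking like this can miss predictors that matter only jointly with others, which is the gap the abstract says CMDH is designed to address.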
Taxonomy
Topics: Statistical Methods and Inference · Bayesian Methods and Mixture Models · Gene expression and cancer classification
