Statistical Inference for Sequential Feature Selection after Domain Adaptation
Duong Tan Loc, Nguyen Thang Loi, Vo Nguyen Le Duy

TL;DR
This paper introduces a new statistical testing method for sequential feature selection after domain adaptation in high-dimensional regression, ensuring controlled false positive rates and improved power, validated through extensive experiments.
Contribution
The paper proposes a testing approach for sequential feature selection after domain adaptation (SeqFS-DA) that guarantees FPR control and enhances statistical power, and extends it to SeqFS with model selection criteria such as AIC, BIC, and adjusted R-squared.
Findings
The method keeps the false positive rate below the chosen significance level (e.g., 0.05).
The paper reports better performance than existing approaches on synthetic and real datasets.
Extensions to model selection criteria improve practical applicability.
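For readers unfamiliar with the criteria named above, the following is a minimal sketch of how AIC, BIC, and adjusted R-squared can be computed for a candidate feature subset in an ordinary least-squares fit. It uses common textbook forms for Gaussian linear models with unknown variance; the exact forms used in the paper may differ, and the function name `selection_criteria` and the synthetic data are assumptions made for illustration only.

```python
import numpy as np

def selection_criteria(X, y, cols):
    """Return AIC, BIC and adjusted R^2 for an OLS fit on the given columns.

    Textbook forms (an assumption, not taken from the paper):
      AIC = n*log(RSS/n) + 2*d,  BIC = n*log(RSS/n) + d*log(n),
    where d counts the fitted coefficients including the intercept.
    """
    n = len(y)
    k = len(cols)
    design = np.column_stack([np.ones(n), X[:, cols]])  # intercept + features
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    rss = float(resid @ resid)
    tss = float(((y - y.mean()) ** 2).sum())
    d = k + 1
    aic = n * np.log(rss / n) + 2 * d
    bic = n * np.log(rss / n) + d * np.log(n)
    adj_r2 = 1.0 - (rss / (n - k - 1)) / (tss / (n - 1))
    return {"aic": aic, "bic": bic, "adj_r2": adj_r2}

# Hypothetical synthetic data: only features 2 and 7 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = 3.0 * X[:, 2] - 2.0 * X[:, 7] + 0.1 * rng.normal(size=100)

full = selection_criteria(X, y, [2, 7])     # both relevant features
partial = selection_criteria(X, y, [2])     # one relevant feature missing
```

Lower AIC and BIC and higher adjusted R-squared indicate the preferred subset. Note that this sketch only scores subsets; it does not perform the post-selection inference that the paper is about.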
Abstract
In high-dimensional regression, feature selection methods, such as sequential feature selection (SeqFS), are commonly used to identify relevant features. When data is limited, domain adaptation (DA) becomes crucial for transferring knowledge from a related source domain to a target domain, improving generalization performance. Although SeqFS after DA is an important task in machine learning, none of the existing methods can guarantee the reliability of its results. In this paper, we propose a novel method for testing the features selected by SeqFS-DA. The main advantage of the proposed method is its capability to control the false positive rate (FPR) below a significance level (e.g., 0.05). Additionally, a strategic approach is introduced to enhance the statistical power of the test. Furthermore, we provide extensions of the proposed method to SeqFS with model selection…
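As background for the abstract, here is a minimal sketch of plain forward sequential feature selection on a single dataset. It is not the authors' method: it omits the domain adaptation step and the statistical test entirely, and the function name `forward_seqfs`, the residual-sum-of-squares selection rule, and the synthetic data are illustrative assumptions.

```python
import numpy as np

def forward_seqfs(X, y, k):
    """Greedy forward selection: at each step, add the feature whose
    inclusion gives the smallest residual sum of squares (RSS)."""
    n, p = X.shape
    selected = []
    for _ in range(k):
        best_j, best_rss = None, np.inf
        for j in range(p):
            if j in selected:
                continue
            cols = selected + [j]
            # Least-squares fit restricted to the candidate feature set.
            beta, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
            resid = y - X[:, cols] @ beta
            rss = float(resid @ resid)
            if rss < best_rss:
                best_j, best_rss = j, rss
        selected.append(best_j)
    return selected

# Hypothetical synthetic data: only features 2 and 7 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = 3.0 * X[:, 2] - 2.0 * X[:, 7] + 0.1 * rng.normal(size=100)

selected = forward_seqfs(X, y, k=2)
```

Because the selected features are chosen by looking at the same data that would later be used to test them, naive p-values computed after this step are generally not valid. That selection effect is the reliability gap the abstract refers to.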
Taxonomy
Topics: Machine Learning and Data Classification
Methods: Feature Selection
