Test and Measure for Partial Mean Dependence Based on Machine Learning Methods
Leheng Cai, Xu Guo, Wei Zhong

TL;DR
This paper introduces a machine learning-based significance test for partial mean dependence in regression, along with a new measure (pGMC) to quantify this dependence, supported by theoretical properties and empirical validation.
Contribution
It develops a novel significance test for partial mean independence using data splitting and machine learning, and proposes the pGMC measure with proven asymptotic properties.
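For reference, the hypothesis being tested can be written out as follows. The symbols are assumptions introduced here for illustration ($Y$ for the response, $X$ for the subset of covariates under test, $Z$ for the covariates being controlled for); the summary itself does not show the paper's notation.

```latex
% Partial mean independence of Y from X given Z (notation assumed, not taken from the paper)
H_0:\ \mathbb{E}(Y \mid X, Z) = \mathbb{E}(Y \mid Z) \quad \text{almost surely}
\qquad \text{versus} \qquad
H_1:\ \Pr\{\mathbb{E}(Y \mid X, Z) = \mathbb{E}(Y \mid Z)\} < 1 .
```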
Findings
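The test statistic converges to the standard chi-squared distribution under the null hypothesis and to a normal distribution under a fixed alternative.
Power enhancement and algorithm stability are also discussed.
The pGMC estimator is asymptotically normal with the optimal root-n convergence rate.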
Abstract
It is important to investigate the significance of a subset of covariates for the response given the other covariates in regression modeling. To this end, we propose a significance test for the partial mean independence problem based on machine learning methods and data splitting. The test statistic converges to the standard chi-squared distribution under the null hypothesis, while it converges to a normal distribution under the fixed alternative hypothesis. Power enhancement and algorithm stability are also discussed. If the null hypothesis is rejected, we propose a partial Generalized Measure of Correlation (pGMC) to measure the partial mean dependence of the response on the covariates of interest after controlling for the nonlinear effect of the other covariates. We present the appealing theoretical properties of the pGMC and establish the asymptotic normality of its estimator with the optimal root-n convergence rate.…
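The general idea of combining data splitting with a learned regression can be sketched as below. This is a simplified illustration under stated assumptions, not the statistic defined in the paper: the function name `split_sample_test`, the variable names `x`, `z`, `y`, and the polynomial ridge learner (a stand-in for any machine learning regressor) are all hypothetical. The sketch fits E[Y | Z] and E[Y | X, Z] on one half of the data, then on the other half checks whether the residual from the Z-only fit correlates with the extra signal the full fit claims to find, and compares a studentized squared sum against a chi-squared distribution with one degree of freedom.

```python
import math
import numpy as np


def poly_features(a, degree=3):
    """Intercept plus element-wise powers of each column (no cross terms)."""
    a = np.asarray(a, dtype=float)
    if a.ndim == 1:
        a = a[:, None]
    cols = [np.ones((a.shape[0], 1))] + [a ** d for d in range(1, degree + 1)]
    return np.hstack(cols)


def fit_ridge(features, y, penalty=1e-3):
    """Closed-form ridge fit; a stand-in for any ML regressor (assumption)."""
    gram = features.T @ features + penalty * np.eye(features.shape[1])
    return np.linalg.solve(gram, features.T @ y)


def split_sample_test(x, z, y, seed=0):
    """Illustrative split-sample check of E[Y | X, Z] = E[Y | Z]."""
    rng = np.random.default_rng(seed)
    n = len(y)
    idx = rng.permutation(n)
    train, test = idx[: n // 2], idx[n // 2:]

    f_z = poly_features(z)
    f_xz = poly_features(np.column_stack([x, z]))

    # Learn both regression functions on the training half only.
    beta_z = fit_ridge(f_z[train], y[train])
    beta_xz = fit_ridge(f_xz[train], y[train])

    # Evaluate on the held-out half.
    resid = y[test] - f_z[test] @ beta_z                # Y minus fitted E[Y | Z]
    gain = f_xz[test] @ beta_xz - f_z[test] @ beta_z    # extra signal attributed to X
    products = resid * gain

    # Studentized squared sum, compared with chi-squared(1) in this sketch.
    stat = products.sum() ** 2 / (products ** 2).sum()
    p_value = math.erfc(math.sqrt(stat / 2.0))          # chi-squared(1) upper tail
    return stat, p_value


# Simulated data: x is irrelevant in y_null and relevant in y_alt.
rng = np.random.default_rng(1)
n = 1000
z = rng.normal(size=n)
x = rng.normal(size=n)
y_null = np.sin(z) + 0.5 * rng.normal(size=n)
y_alt = np.sin(z) + 2.0 * x + 0.5 * rng.normal(size=n)

stat_null, p_null = split_sample_test(x, z, y_null)
stat_alt, p_alt = split_sample_test(x, z, y_alt)
print(f"null: stat={stat_null:.3f}, p={p_null:.3f}")
print(f"alt:  stat={stat_alt:.3f}, p={p_alt:.3g}")
```

Fitting on one half and evaluating on the other keeps the learned functions independent of the evaluation sample, which is what makes a simple limiting distribution plausible. A single split discards part of the data and depends on the random partition, which is presumably why the abstract also discusses power enhancement and algorithm stability.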
Taxonomy
Topics: Advanced Statistical Methods and Models · Statistical Methods and Inference · Spectroscopy and Chemometric Analyses
