Refitted cross-validation estimation for high-dimensional subsamples from low-dimension full data
Haixiang Zhang, HaiYing Wang

TL;DR
This paper introduces a novel subsampling approach combining penalty-based dimension reduction and refitted cross-validation to effectively analyze high-dimensional subsamples from low-dimensional full data, addressing computational and statistical challenges.
Contribution
It proposes a new method for high-dimensional subsamples from low-dimensional data, with theoretical guarantees and practical effectiveness demonstrated through simulations and real data.
Findings
Asymptotic normality of the estimator established
Method outperforms existing approaches in simulations
Effective in real data application
Abstract
The technique of subsampling has been extensively employed to address the challenges posed by limited computing resources and to meet the need for expedited data analysis. Various subsampling methods have been developed for settings characterized by a large sample size with a small number of parameters. However, direct application of these subsampling methods may not be suitable when the dimension is also high and the available computing facilities can only analyze a subsample of size similar to or even smaller than the dimension. In this case, although there is no high-dimensional problem in the full data, the subsample may have a sample size similar to or smaller than the number of parameters, making it a high-dimensional problem. We call this scenario the high-dimensional subsample from low-dimension full data problem. In this paper, we tackle this problem by proposing…
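To make the setting concrete, the following is a minimal sketch of generic refitted cross-validation applied to a uniform subsample. It is not the authors' algorithm, whose details are not given in the truncated abstract. Everything in it is an assumption chosen for illustration: the linear model, the uniform subsampling, the hand-written lasso solver, the penalty level `lam=0.4`, and the names `lasso_cd` and `rcv_estimate`.

```python
import numpy as np


def lasso_cd(X, y, lam, n_iter=100):
    """Lasso by cyclic coordinate descent (illustrative helper).

    Minimizes (1 / 2n) * ||y - X b||^2 + lam * ||b||_1, with no intercept.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.copy()  # equals y - X @ beta while beta is all zeros
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual.
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            # Soft-thresholding update for coordinate j.
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid -= X[:, j] * (new - beta[j])
            beta[j] = new
    return beta


def rcv_estimate(X, y, lam, rng):
    """Select variables on one half, refit by least squares on the other,
    swap the roles of the halves, and average the two estimates."""
    n, p = X.shape
    perm = rng.permutation(n)
    halves = (perm[: n // 2], perm[n // 2:])
    estimates = []
    for sel, fit in (halves, halves[::-1]):
        support = np.flatnonzero(lasso_cd(X[sel], y[sel], lam))
        beta = np.zeros(p)
        if support.size:
            coef, *_ = np.linalg.lstsq(X[fit][:, support], y[fit], rcond=None)
            beta[support] = coef
        estimates.append(beta)
    return np.mean(estimates, axis=0)


# Simulated full data: N is much larger than p, so the full data are low-dimensional.
rng = np.random.default_rng(0)
N, p, n = 5000, 200, 150  # subsample size n is smaller than p
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -2.0, 1.5]
X_full = rng.standard_normal((N, p))
y_full = X_full @ beta_true + rng.standard_normal(N)

# Uniform subsample without replacement (an assumption; other schemes exist).
idx = rng.choice(N, size=n, replace=False)
beta_hat = rcv_estimate(X_full[idx], y_full[idx], lam=0.4, rng=rng)
print(np.round(beta_hat[:5], 2))
```

Selecting variables on one half and refitting on the other keeps the least-squares step independent of the selection step, which is the usual motivation for refitted cross-validation.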
Taxonomy
Topics: Nuclear reactor physics and engineering · Statistical Methods and Inference
