U-learning for Prediction Inference via Combinatory Multi-Subsampling: With Applications to LASSO and Neural Networks
Zhe Fei, Yi Li

TL;DR
This paper introduces a U-learning method that uses combinatory multi-subsampling to form ensemble predictions and construct confidence intervals in high-dimensional settings. The method works with LASSO and neural networks and is applied to epigenetic aging clocks.
Contribution
It presents a novel U-learning framework that leverages combinatory multi-subsampling and generalized U-statistics for valid inference in high-dimensional prediction models.
Findings
Confidence intervals for predictions are constructed with valid conditional coverage probabilities.
The method is successfully applied to DNA methylation age prediction.
Numerical studies confirm the approach's effectiveness.
Abstract
Epigenetic aging clocks play a pivotal role in estimating an individual's biological age through the examination of DNA methylation patterns at numerous CpG (Cytosine-phosphate-Guanine) sites within their genome. However, making valid inferences on predicted epigenetic ages, or more broadly, on predictions derived from high-dimensional inputs, presents challenges. We introduce a novel U-learning approach via combinatory multi-subsampling for making ensemble predictions and constructing confidence intervals for predictions of continuous outcomes when traditional asymptotic methods are not applicable. More specifically, our approach conceptualizes the ensemble estimators within the framework of generalized U-statistics and invokes the Hájek projection for deriving the variances of predictions and constructing confidence intervals with valid conditional coverage probabilities. We apply…
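To make the idea of a subsampling ensemble with a variance estimate concrete, here is a minimal Python sketch. It is not the paper's algorithm: it assumes plain random subsamples drawn without replacement (rather than the paper's combinatory multi-subsampling scheme), uses ordinary least squares as a stand-in base learner where the paper uses LASSO or neural networks, and borrows an infinitesimal-jackknife-style variance formula from the general subsampling literature. The function name `subsample_ensemble_predict` and all parameters are hypothetical.

```python
import numpy as np


def subsample_ensemble_predict(X, y, x_new, r, B, rng):
    """Average base-learner predictions over B random subsamples of size r.

    Assumptions (not taken from the paper): least squares is the base
    learner, and the standard error uses an infinitesimal-jackknife-style
    formula with a correction for sampling without replacement.
    """
    n = X.shape[0]
    preds = np.empty(B)
    # inclusion[b, i] = 1 if observation i belongs to subsample b
    inclusion = np.zeros((B, n))
    for b in range(B):
        idx = rng.choice(n, size=r, replace=False)
        inclusion[b, idx] = 1.0
        coef, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        preds[b] = x_new @ coef

    # Ensemble prediction: the average over subsamples
    estimate = preds.mean()

    # Covariance, across subsamples, between each observation's inclusion
    # indicator and the subsample prediction
    centered_inclusion = inclusion - inclusion.mean(axis=0)
    cov = centered_inclusion.T @ (preds - estimate) / B

    # Sum of squared covariances, scaled for subsampling without replacement
    variance = (n - 1) / n * (n / (n - r)) ** 2 * np.sum(cov ** 2)
    return estimate, np.sqrt(variance)


# Simulated data, purely for illustration
rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))
beta = np.arange(1, p + 1, dtype=float)
y = X @ beta + rng.normal(size=n)
x_new = np.ones(p)  # the true mean response at x_new is 1+2+3+4+5 = 15

est, se = subsample_ensemble_predict(X, y, x_new, r=100, B=500, rng=rng)
lower, upper = est - 1.96 * se, est + 1.96 * se  # normal-approximation 95% interval
```

The interval here relies on a normal approximation for the ensemble prediction. With a finite number of subsamples `B`, the squared-covariance sum carries some Monte Carlo noise, so larger `B` gives a more stable standard error.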
Taxonomy
Topics: Machine Learning and Algorithms
