A generalization of moderated statistics to data adaptive semiparametric estimation in high-dimensional biology
Nima S. Hejazi, Philippe Boileau, Mark J. van der Laan, and Alan E. Hubbard

TL;DR
This paper introduces a generalized empirical Bayes shrinkage method for variance estimation in high-dimensional biological data, improving the stability and reliability of causal inference and biomarker discovery when sample sizes are small.
Contribution
It generalizes moderated statistics, which stabilize variance estimates through empirical Bayes shrinkage, to data adaptive semiparametric estimators, enhancing inferential stability in high-dimensional, low-sample biological studies.
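As a rough illustration of what variance moderation does, the sketch below shrinks per-biomarker sample variances toward a common prior value and forms a moderated t-type statistic. It is a minimal sketch of the classical shrinkage formula, not the authors' implementation: the function names, the simulated data, and the prior values `prior_var` and `prior_df` are all hypothetical, and in practice the prior would be estimated from the data rather than fixed by hand.

```python
import numpy as np


def moderated_variances(sample_var, df, prior_var, prior_df):
    """Shrink each sample variance toward prior_var.

    Weighted average of the prior variance and the observed variance,
    with weights given by their degrees of freedom.
    """
    sample_var = np.asarray(sample_var, dtype=float)
    return (prior_df * prior_var + df * sample_var) / (prior_df + df)


def moderated_t(estimates, sample_var, n, prior_var, prior_df):
    """Moderated t-type statistic for a per-biomarker mean estimate."""
    shrunk = moderated_variances(sample_var, n - 1, prior_var, prior_df)
    return np.asarray(estimates, dtype=float) / np.sqrt(shrunk / n)


# Simulated example: many biomarkers (p), few samples (n).
rng = np.random.default_rng(0)
n, p = 10, 1000
data = rng.normal(size=(p, n))

estimates = data.mean(axis=1)
sample_var = data.var(axis=1, ddof=1)

# Hypothetical prior: in a real analysis these would be estimated
# from the full set of variances, not chosen like this.
prior_var = float(sample_var.mean())
prior_df = 4.0

shrunk_var = moderated_variances(sample_var, n - 1, prior_var, prior_df)
t_mod = moderated_t(estimates, sample_var, n, prior_var, prior_df)
```

Because each shrunken variance is a weighted average of the observed variance and the prior, unusually small variances are pulled upward, which keeps a biomarker from appearing significant only because its variance happened to be underestimated in a small sample.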
Findings
Improved variance estimation stability in high-dimensional settings.
Enhanced detection of causal biomarkers in biological data.
Validated approach on DNA methylation data related to smoking.
Abstract
The widespread availability of high-dimensional biological data has made the simultaneous screening of many biological characteristics a central problem in computational biology and allied sciences. While the dimensionality of such datasets continues to grow, so too does the complexity of biomarker identification from exposure patterns in health studies measuring baseline confounders; moreover, doing so while avoiding model misspecification remains an issue only partially addressed. Efficient estimators capable of incorporating flexible, data adaptive regression techniques in estimating relevant components of the data-generating distribution provide an avenue for avoiding model misspecification; however, in the context of high-dimensional problems that require the simultaneous estimation of numerous parameters, standard variance estimators have proven unstable, resulting in unreliable…
Taxonomy
Topics: Statistical Methods and Inference · Gene expression and cancer classification · Advanced Causal Inference Techniques
