On block-wise and reference panel-based estimators for genetic data prediction in high dimensions
Bingxin Zhao, Shurong Zheng, and Hongtu Zhu

TL;DR
This paper analyzes block-wise and reference panel-based estimators for genetic prediction in high dimensions. It finds that adjusting only for local dependence within LD blocks can be less accurate than using the whole covariance matrix, and that performance depends on whether LD is estimated from the training data or from an external reference panel.
Contribution
It provides a unified theoretical framework for evaluating block-wise and reference panel estimators in high-dimensional genetic prediction without sparsity assumptions.
Findings
Block-wise estimators can be less accurate than whole covariance estimators.
Estimator performance differs depending on whether LD is estimated from the original training data or from an external reference panel.
Theoretical results are supported by simulations and UK Biobank data analysis.
Abstract
Genetic prediction of complex traits and diseases has attracted enormous attention in precision medicine, mainly because it has the potential to translate discoveries from genome-wide association studies (GWAS) into medical advances. As the high dimensional covariance matrix (or the linkage disequilibrium (LD) pattern) of genetic variants has a block-diagonal structure, many existing methods attempt to account for the dependence among variants in predetermined local LD blocks/regions. Moreover, due to privacy restrictions and data protection concerns, genetic variant dependence in each LD block is typically estimated from external reference panels rather than the original training dataset. This paper presents a unified analysis of block-wise and reference panel-based estimators in a high-dimensional prediction framework without sparsity restrictions. We find that, surprisingly, even…
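The distinction the abstract draws between whole-covariance, block-wise, and reference panel-based adjustment can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the estimators analyzed in the paper: it assumes a ridge-type adjustment of marginal summary statistics, AR(1) correlation within each LD block, and arbitrary sample sizes and penalty `lam`. The helper names (`ar1_cov`, `simulate_genotypes`, `blockwise_ridge`) are hypothetical.

```python
import numpy as np

def ar1_cov(size, rho):
    """AR(1) correlation matrix, used here as an assumed within-block LD pattern."""
    idx = np.arange(size)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def simulate_genotypes(n, block_sizes, rho, rng):
    """Draw n rows whose covariance is block-diagonal with AR(1) blocks."""
    cols = []
    for size in block_sizes:
        chol = np.linalg.cholesky(ar1_cov(size, rho))
        cols.append(rng.standard_normal((n, size)) @ chol.T)
    return np.hstack(cols)

def blockwise_ridge(marginal, cov, block_sizes, lam):
    """Solve (cov_block + lam * I) beta_block = marginal_block within each block."""
    beta = np.empty_like(marginal)
    start = 0
    for size in block_sizes:
        sl = slice(start, start + size)
        beta[sl] = np.linalg.solve(cov[sl, sl] + lam * np.eye(size), marginal[sl])
        start += size
    return beta

rng = np.random.default_rng(0)
block_sizes = [20] * 10          # assumed: 10 LD blocks of 20 variants each
p = sum(block_sizes)
n_train, n_ref, n_test = 300, 300, 1000
rho, lam = 0.6, 0.5              # assumed within-block correlation and ridge penalty

X = simulate_genotypes(n_train, block_sizes, rho, rng)  # training genotypes
W = simulate_genotypes(n_ref, block_sizes, rho, rng)    # external reference panel
Z = simulate_genotypes(n_test, block_sizes, rho, rng)   # held-out test genotypes

beta_true = rng.standard_normal(p) / np.sqrt(p)         # dense effects, no sparsity
y = X @ beta_true + rng.standard_normal(n_train)
y_test = Z @ beta_true + rng.standard_normal(n_test)

marginal = X.T @ y / n_train     # GWAS-style marginal summary statistics
cov_train = X.T @ X / n_train    # LD estimated from the training data
cov_ref = W.T @ W / n_ref        # LD estimated from the reference panel

estimates = {
    # A single block covering all variants is the whole-covariance adjustment.
    "whole, training LD": blockwise_ridge(marginal, cov_train, [p], lam),
    "block-wise, training LD": blockwise_ridge(marginal, cov_train, block_sizes, lam),
    "block-wise, reference LD": blockwise_ridge(marginal, cov_ref, block_sizes, lam),
}

for name, beta_hat in estimates.items():
    r = np.corrcoef(Z @ beta_hat, y_test)[0, 1]
    print(f"{name}: out-of-sample correlation = {r:.3f}")
```

The ranking of the three variants in such a toy run depends on the assumed sample sizes, block structure, and penalty, so it should not be read as a reproduction of the paper's results.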
Taxonomy
Topics: Genetic Associations and Epidemiology · Spatial and Panel Data Analysis · Advanced Causal Inference Techniques
