plmmr: an R package to fit penalized linear mixed models for genome-wide association data with complex correlation structure
Tabitha K. Peter, Anna C. Reisetter, Yujing Lu, Oscar A. Rysavy, Patrick J. Breheny

TL;DR
The paper introduces plmmr, an open-source R package that fits penalized linear mixed models to high-dimensional genome-wide association data. It estimates the correlation among observations, uses those estimates to improve prediction, and relies on memory-mapping so that datasets larger than RAM can be analyzed on ordinary machines.
Contribution
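It provides a new open-source R package that estimates correlation among observations in high-dimensional data, uses those estimates to improve prediction through the best linear unbiased predictor (BLUP), and uses memory-mapping so that genome-scale data can be analyzed even when it exceeds RAM.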
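The memory-mapping idea mentioned above can be illustrated with a minimal, generic sketch. It does not show plmmr's actual file-backing implementation, which this summary does not describe. It assumes a toy matrix stored on disk as raw little-endian float64 values, one column after another, and uses only Python's standard `mmap` and `struct` modules. The file name `genotypes.bin` and the helper names `write_matrix` and `column_means` are hypothetical.

```python
import mmap
import os
import struct
import tempfile


def write_matrix(path, columns):
    """Write equal-length columns to disk as raw float64, one column after another."""
    with open(path, "wb") as f:
        for col in columns:
            f.write(struct.pack(f"<{len(col)}d", *col))


def column_means(path, n_rows, n_cols):
    """Compute each column's mean while decoding only one column at a time.

    The file is mapped into the address space rather than read in full,
    so the operating system pages in only the bytes that are touched.
    """
    col_bytes = 8 * n_rows  # 8 bytes per float64
    means = []
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        for j in range(n_cols):
            chunk = mm[j * col_bytes:(j + 1) * col_bytes]
            values = struct.unpack(f"<{n_rows}d", chunk)
            means.append(sum(values) / n_rows)
    return means


# Toy genotype-like matrix (assumed data): 4 samples x 3 variants.
columns = [
    [0.0, 1.0, 2.0, 1.0],
    [2.0, 2.0, 2.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
path = os.path.join(tempfile.mkdtemp(), "genotypes.bin")
write_matrix(path, columns)
means = column_means(path, n_rows=4, n_cols=3)
print(means)
```

Because each column is decoded separately, peak memory use in this sketch depends on the number of rows rather than on the total size of the file, which is the property that lets a file-backed analysis scale past RAM.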
Findings
Analyzed genome-scale data exceeding RAM capacity by using memory-mapping
Used estimated correlation among observations to improve prediction via the best linear unbiased predictor
Demonstrated computational capabilities with two examples from real GWAS data
Abstract
Correlation among the observations in high-dimensional regression modeling can be a major source of confounding. We present a new open-source package, plmmr, to implement penalized linear mixed models in R. This R package estimates correlation among observations in high-dimensional data and uses those estimates to improve prediction with the best linear unbiased predictor. The package uses memory-mapping so that genome-scale data can be analyzed on ordinary machines even if the size of data exceeds RAM. We present here the methods, workflow, and file-backing approach upon which plmmr is built, and we demonstrate its computational capabilities with two examples from real GWAS data.
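For readers unfamiliar with the approach, one common way to write a penalized linear mixed model and its best linear unbiased predictor is sketched below. The notation is an assumption made for illustration and does not come from the abstract: $y$ is the outcome vector, $X$ the standardized design matrix, $K$ an estimated relatedness (correlation) matrix among observations, $\eta \in [0, 1)$ the proportion of variance attributed to that correlation structure, and $p_\lambda$ a sparsity-inducing penalty such as the lasso.

```latex
\begin{align}
  y &= X\beta + u + \varepsilon, \qquad
  u \sim N(0,\, \eta\sigma^2 K), \qquad
  \varepsilon \sim N(0,\, (1-\eta)\sigma^2 I), \\
  \Sigma &= \eta K + (1-\eta) I, \\
  \hat\beta &= \arg\min_{\beta}\;
    \frac{1}{2n}\,\bigl\lVert \Sigma^{-1/2}(y - X\beta) \bigr\rVert_2^2
    + \sum_{j=1}^{p} p_\lambda\bigl(\lvert \beta_j \rvert\bigr), \\
  \hat y_{\text{new}} &= X_{\text{new}}\hat\beta
    + \Sigma_{21}\,\Sigma_{11}^{-1}\,(y - X\hat\beta).
\end{align}
```

Here $\Sigma_{11}$ denotes $\Sigma$ evaluated on the training observations and $\Sigma_{21}$ the corresponding block between new and training observations. The second term of the last line is what distinguishes the BLUP from the plain prediction $X_{\text{new}}\hat\beta$: it adjusts the prediction by the correlated part of the training residuals.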
