Efficient Penalized Generalized Linear Mixed Models for Variable Selection and Genetic Risk Prediction in High-Dimensional Data
Julien St-Pierre, Karim Oualkacha, Sahir Rai Bhatnagar

TL;DR
The paper introduces pglmm, a scalable penalized generalized linear mixed model for variable selection and risk prediction in high-dimensional binary GWAS data, outperforming existing methods in accuracy and efficiency.
Contribution
It develops a novel penalized GLMM approach with an efficient PQL-based algorithm, enabling high-dimensional binary trait analysis with improved predictor selection and prediction accuracy.
Findings
pglmm outperforms penalized LMM and PC-adjusted logistic regression in simulations
pglmm achieves higher predictive accuracy in UK Biobank binary traits
pglmm selects fewer predictors while maintaining performance
Abstract
Sparse regularized regression methods are now widely used in genome-wide association studies (GWAS) to address the multiple testing burden that limits discovery of potentially important predictors. Linear mixed models (LMMs) have become an attractive alternative to principal components (PC) adjustment to account for population structure and relatedness in high-dimensional penalized models. However, their use in binary trait GWAS relies on the invalid assumption that the residual variance does not depend on the estimated regression coefficients. Moreover, LMMs use a single spectral decomposition of the covariance matrix of the responses, which is no longer possible in generalized linear mixed models (GLMMs). We introduce a new method called pglmm, a penalized GLMM that simultaneously selects genetic markers and estimates their effects, accounting for between-individual correlations…
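The abstract's point that, for a binary trait, the residual variance depends on the regression coefficients can be illustrated with a small sketch. This is not the pglmm algorithm and it ignores random effects entirely. It assumes a plain logistic model for a single marker, and the intercept, effect sizes, and helper names (`bernoulli_variance`, `variance_by_genotype`) are hypothetical, chosen only for illustration.

```python
import math


def bernoulli_variance(eta):
    """Variance p * (1 - p) of a binary response whose logit is eta."""
    p = 1.0 / (1.0 + math.exp(-eta))
    return p * (1.0 - p)


def variance_by_genotype(intercept, beta, genotypes=(0, 1, 2)):
    """Response variance at each genotype dosage for one marker.

    Assumes logit(p) = intercept + beta * genotype, with no random effects.
    """
    return [bernoulli_variance(intercept + beta * g) for g in genotypes]


# Illustrative values only: with beta = 0 the variance is the same for every
# genotype; with beta != 0 it changes with genotype, so it cannot be treated
# as a constant that is independent of the coefficient.
for beta in (0.0, 0.5, 1.5):
    rounded = [round(v, 3) for v in variance_by_genotype(-1.0, beta)]
    print(f"beta={beta}: variance by genotype = {rounded}")
```

Under these assumptions the variance is flat across genotypes only when the effect size is zero, which is why a constant residual variance, as in an LMM, does not hold for binary traits.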
Taxonomy
Topics: Genetic Associations and Epidemiology · Genetic Mapping and Diversity in Plants and Animals · Cognitive Abilities and Testing
