# Estimation of variance components, heritability and the ridge penalty in high-dimensional generalized linear models

**Authors:** Jurre R. Veerman, Gwenael G.R. Leday, Mark A. van de Wiel

arXiv: 1902.02623 · 2019-02-08

## TL;DR

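This paper reviews and compares estimators of the variance components $\tau^2$ and $\sigma^2$, the ridge penalty $\lambda$ and the heritability index $h^2$ in high-dimensional linear models. It also derives a computationally efficient maximum marginal likelihood (MML) estimator for generalized linear models, which is more accurate than cross-validation (CV) for Poisson and Binomial ridge regression.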

## Contribution

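It derives a computationally efficient MML estimator of the ridge penalty $\lambda$ for high-dimensional GLMs by rewriting the marginal likelihood as an $n$-dimensional integral. It also extends the comparison of estimators to the high-dimensional linear mixed effects model, with REML as an alternative to MML, and to the conjugate Bayesian setting.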
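A sketch of why such a rewriting is possible, under assumptions that this summary does not state explicitly: a Gaussian prior $\boldsymbol\beta \sim \mathcal{N}(\mathbf{0}, \tau^2 \mathbf{I}_p)$, a linear predictor $\boldsymbol\eta = \mathbf{X}\boldsymbol\beta$, and $\mathbf{X}\mathbf{X}^\top$ of full rank $n$. The GLM likelihood depends on $\boldsymbol\beta$ only through $\boldsymbol\eta$, and $\boldsymbol\eta \sim \mathcal{N}(\mathbf{0}, \tau^2 \mathbf{X}\mathbf{X}^\top)$, so the $p$-dimensional integral over $\boldsymbol\beta$ can be replaced by an $n$-dimensional integral over $\boldsymbol\eta$:

$$
\int_{\mathbb{R}^p} p(\mathbf{y} \mid \mathbf{X}\boldsymbol\beta)\, \mathcal{N}(\boldsymbol\beta;\, \mathbf{0},\, \tau^2 \mathbf{I}_p)\, d\boldsymbol\beta
= \int_{\mathbb{R}^n} p(\mathbf{y} \mid \boldsymbol\eta)\, \mathcal{N}(\boldsymbol\eta;\, \mathbf{0},\, \tau^2 \mathbf{X}\mathbf{X}^\top)\, d\boldsymbol\eta .
$$

How the paper evaluates or approximates the right-hand side is not described in this summary view.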

## Key findings

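- For Poisson and Binomial ridge regression, the MML estimator of $\lambda$ is more accurate than CV.
- An example on weight gain data with genomic covariates confirms the good performance of MML compared to CV.
- Robustness of the estimators is studied under departures from the model, such as sparse instead of dense effects and non-Gaussian errors.
- Software is provided to reproduce all results.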

## Abstract

For high-dimensional linear regression models, we review and compare several estimators of variances $\tau^2$ and $\sigma^2$ of the random slopes and errors, respectively. These variances relate directly to ridge regression penalty $\lambda$ and heritability index $h^2$, often used in genetics. Direct and indirect estimators of these, either based on cross-validation (CV) or maximum marginal likelihood (MML), are also discussed. The comparisons include several cases of covariate matrix $\mathbf{X}_{n \times p}$, with $p \gg n$, such as multi-collinear covariates and data-derived ones. In addition, we study robustness against departures from the model such as sparse instead of dense effects and non-Gaussian errors. An example on weight gain data with genomic covariates confirms the good performance of MML compared to CV. Several extensions are presented. First, to the high-dimensional linear mixed effects model, with REML as an alternative to MML. Second, to the conjugate Bayesian setting, which proves to be a good alternative. Third, and most prominently, to generalized linear models for which we derive a computationally efficient MML estimator by re-writing the marginal likelihood as an $n$-dimensional integral. For Poisson and Binomial ridge regression, we demonstrate the superior accuracy of the resulting MML estimator of $\lambda$ as compared to CV. Software is provided to enable reproduction of all results presented here.
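To make the linear-model case concrete, the following is a minimal sketch of MML estimation of $\tau^2$ and $\sigma^2$, not the authors' software. It assumes the standard random-effects formulation $\mathbf{y} = \mathbf{X}\boldsymbol\beta + \boldsymbol\epsilon$ with $\boldsymbol\beta \sim \mathcal{N}(\mathbf{0}, \tau^2 \mathbf{I}_p)$ and $\boldsymbol\epsilon \sim \mathcal{N}(\mathbf{0}, \sigma^2 \mathbf{I}_n)$, so that marginally $\mathbf{y} \sim \mathcal{N}(\mathbf{0}, \tau^2 \mathbf{X}\mathbf{X}^\top + \sigma^2 \mathbf{I}_n)$ and $\lambda = \sigma^2 / \tau^2$. The heritability formula $h^2 = p\tau^2 / (p\tau^2 + \sigma^2)$ is a common convention for standardized covariates and is an assumption here. The function names (`simulate`, `neg_log_marginal`, `mml_grid`) and the crude grid search are illustrative choices only.

```python
import numpy as np


def simulate(n, p, tau2, sigma2, seed=0):
    """Draw (X, y) from the assumed random-effects linear model."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    beta = rng.normal(0.0, np.sqrt(tau2), size=p)
    y = X @ beta + rng.normal(0.0, np.sqrt(sigma2), size=n)
    return X, y


def neg_log_marginal(tau2, sigma2, eigvals, z):
    """Negative log-likelihood of y ~ N(0, tau2 * X X^T + sigma2 * I).

    eigvals are the eigenvalues of X X^T and z = U^T y, where U holds the
    eigenvectors, so the covariance is diagonal in this basis.
    """
    v = tau2 * eigvals + sigma2
    return 0.5 * np.sum(np.log(2.0 * np.pi * v) + z**2 / v)


def mml_grid(X, y, log_grid=None):
    """Grid-search MML estimates of tau2 and sigma2 (illustrative only)."""
    if log_grid is None:
        log_grid = np.linspace(-8.0, 4.0, 121)
    # One n x n eigendecomposition; each likelihood evaluation is then O(n).
    eigvals, U = np.linalg.eigh(X @ X.T)
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negatives
    z = U.T @ y

    best_nll, best_tau2, best_sigma2 = np.inf, None, None
    for log_tau2 in log_grid:
        for log_sigma2 in log_grid:
            tau2, sigma2 = np.exp(log_tau2), np.exp(log_sigma2)
            nll = neg_log_marginal(tau2, sigma2, eigvals, z)
            if nll < best_nll:
                best_nll, best_tau2, best_sigma2 = nll, tau2, sigma2

    p = X.shape[1]
    return {
        "tau2": best_tau2,
        "sigma2": best_sigma2,
        "lambda": best_sigma2 / best_tau2,  # ridge penalty implied by MML
        # Assumed heritability convention for standardized covariates.
        "h2": p * best_tau2 / (p * best_tau2 + best_sigma2),
        "nll": best_nll,
    }


X, y = simulate(n=100, p=1000, tau2=0.01, sigma2=1.0)
fit = mml_grid(X, y)
print(fit)
```

The grid search keeps the sketch dependency-free beyond NumPy. In practice a numerical optimizer over $(\log \tau^2, \log \sigma^2)$ would be the more usual choice, and with $p \gg n$ the two variances can be only weakly separable, so the estimates may be noisy.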

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.02623/full.md

## Figures

23 figures with captions in the complete paper: https://tomesphere.com/paper/1902.02623/full.md

## References

37 references — full list in the complete paper: https://tomesphere.com/paper/1902.02623/full.md

---
Source: https://tomesphere.com/paper/1902.02623