# Control of false discoveries in grouped hypothesis testing for eQTL data

**Authors:** Pratyaydipta Rudra, Yi-Hui Zhou, Andrew Nobel, Fred A. Wright

PMC · DOI: 10.1186/s12859-024-05736-3 · BMC Bioinformatics · 2024-04-11

## TL;DR

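This paper introduces Z-REG-FDR, a method for controlling the false discovery rate in gene-level (grouped) eQTL testing. The authors report that it is fast to compute and compares favorably with other methods in statistical power and FDR control.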

## Contribution

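The paper proposes REG-FDR, a random effects model and testing procedure for group-level FDR control, and Z-REG-FDR, a faster approximate version that needs only the Z-statistics of association for each gene-SNP pair.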

## Key findings

- Z-REG-FDR performs similarly to REG-FDR but is much faster computationally.
- Z-REG-FDR shows favorable statistical power and FDR control compared to existing methods.
- The method is practical for eQTL analysis and similar problems in genomics.

## Abstract

Expression quantitative trait locus (eQTL) analysis aims to detect the genetic variants that influence the expression of one or more genes. Gene-level eQTL testing forms a natural grouped-hypothesis testing strategy with clear biological importance. Methods to control the family-wise error rate or false discovery rate for group testing have been proposed previously, but they may lack power or may not apply easily to eQTL data, for which certain structured alternatives may be defensible and may enable the researcher to avoid overly conservative approaches.

In an empirical Bayesian setting, we propose a new method to control the false discovery rate (FDR) for grouped hypotheses. Here, each gene forms a group, with SNPs annotated to the gene corresponding to individual hypotheses. The heterogeneity of effect sizes across groups is accounted for by introducing a random effects component. Our method, called Random Effects model and testing procedure for Group-level FDR control (REG-FDR), assumes a model for alternative hypotheses for the eQTL data and controls the FDR by adaptive thresholding. As a convenient alternative, we also propose Z-REG-FDR, an approximate version of REG-FDR that uses only Z-statistics of association between genotype and expression for each gene-SNP pair. The performance of Z-REG-FDR is evaluated using both simulated and real data. Simulations demonstrate that Z-REG-FDR performs similarly to REG-FDR, but with much improved computational speed.
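This summary does not spell out the adaptive thresholding step, so here is a minimal sketch of one common empirical Bayes version rather than the authors' implementation. It assumes each gene already has an estimated posterior probability of being null (`post_null`, a hypothetical input that a model such as REG-FDR would have to supply). Genes are rejected in order of increasing null probability for as long as the running average of the rejected probabilities, which estimates the FDR, stays at or below `alpha`. The function name and arguments are illustrative.

```python
def adaptive_threshold(post_null, alpha=0.05):
    """Return a list of booleans marking which groups (genes) are rejected.

    post_null[i] is an ASSUMED, already-estimated posterior probability that
    group i is null; estimating it is outside the scope of this sketch.
    """
    # Visit groups from most to least likely to be non-null.
    order = sorted(range(len(post_null)), key=lambda i: post_null[i])

    running_sum = 0.0
    n_reject = 0
    for rank, i in enumerate(order, start=1):
        running_sum += post_null[i]
        # The mean posterior null probability of the first `rank` groups
        # estimates the FDR of rejecting exactly those groups.
        if running_sum / rank <= alpha:
            n_reject = rank
        else:
            # The running mean of sorted values never decreases, so stop here.
            break

    reject = [False] * len(post_null)
    for i in order[:n_reject]:
        reject[i] = True
    return reject


# Toy usage with made-up probabilities for five genes.
decisions = adaptive_threshold([0.01, 0.9, 0.03, 0.2, 0.6], alpha=0.1)
```

The threshold is adaptive in that the cutoff on `post_null` is not fixed in advance: it depends on how many groups have small null probabilities in the data at hand.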

Our results demonstrate that the Z-REG-FDR method performs favorably compared to other methods in terms of statistical power and control of FDR. It can be of great practical use for grouped hypothesis testing in eQTL analysis or similar problems in statistical genomics, because it is fast to compute and can be fit using only summary data.

The online version contains supplementary material available at 10.1186/s12859-024-05736-3.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11007981/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11007981/full.md

## References

40 references — full list in the complete paper: https://tomesphere.com/paper/PMC11007981/full.md

---
Source: https://tomesphere.com/paper/PMC11007981