# Improving accuracy in genome-wide association studies: a two-step approach for handling below limit of detection biomarker measurements

**Authors:** Yaqi A Deng, Torgny Karlsson, Åsa Johansson

PMC · DOI: 10.1093/nargab/lqaf201 · NAR Genomics and Bioinformatics · 2025-12-31

## TL;DR

This paper introduces a two-step Linear-Tobit scheme that improves the accuracy of effect estimates in genome-wide association studies (GWAS) when some biomarker measurements fall below the limit of detection (LOD).

## Contribution

The two-step Linear-Tobit scheme screens rapidly with linear regression and then refines the hits with Tobit regression, reducing the bias in effect estimates caused by measurements below the limit of detection (LOD).

## Key findings

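- In simulations comparing linear, Tobit, Cox, and logistic models, the two-step Linear-Tobit scheme was the optimal strategy. Its more accurate effect estimates help mitigate the 1.3-fold and 2.7-fold inflation in Mendelian randomization (MR) causal estimates that would otherwise arise with 50% and 90% of values below the limit of detection (LOD).
- Case studies on estradiol and testosterone levels in the UK Biobank confirmed the simulation results across subgroups with varying proportions of below-LOD measurements.
- The scheme combines detection power with efficiency suited to biobank-scale datasets, and its accurate effect estimates limit bias in downstream applications such as MR and polygenic risk scores.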
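
One way to see why a biased exposure estimate inflates an MR causal estimate is the single-variant Wald ratio, in which the variant-outcome effect is divided by the variant-exposure effect. The sketch below assumes that estimator and assumes the bias enters only through a shrunken exposure effect; this summary does not state which MR estimator the paper uses, and the effect sizes are hypothetical values chosen for illustration.

```python
def wald_ratio(beta_outcome, beta_exposure):
    """Single-variant MR estimate: variant-outcome effect over variant-exposure effect."""
    return beta_outcome / beta_exposure


# Hypothetical effect sizes, chosen only for illustration (not from the paper).
beta_exposure_true = 0.30
beta_outcome = 0.06
causal_true = wald_ratio(beta_outcome, beta_exposure_true)

for fold in (1.3, 2.7):  # fold-inflations quoted in the abstract
    # Assumption: the exposure effect is underestimated by exactly this factor.
    beta_exposure_shrunk = beta_exposure_true / fold
    causal_biased = wald_ratio(beta_outcome, beta_exposure_shrunk)
    print(
        f"exposure effect at {1 / fold:.0%} of its true size "
        f"-> causal estimate inflated {causal_biased / causal_true:.1f}-fold"
    )
```

Under these assumptions, an exposure effect shrunk by a factor k inflates the ratio by the same factor k, which is why accurate exposure estimates matter for downstream MR.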

## Abstract

Advances in high-throughput technologies enable large-scale studies on genomics and molecular phenotypes. However, the trade-off between quality and quantity reduces assay sensitivity, and several measurements in large-scale proteomics and metabolomics analytes fall below the limit of detection (LOD). If not properly addressed, this may introduce bias in effect estimates. To address this, we conducted a simulation study to evaluate the performance of linear, Tobit, Cox, and logistic modeling in the presence of below-LOD measurements in genome-wide association studies. We identified the optimal strategy as a two-step Linear-Tobit scheme, including rapid screening with linear regression followed by refinement with Tobit regression to retrieve accurate effect estimates. This higher accuracy helps mitigate a 1.3-fold and 2.7-fold inflation in causal estimates in a Mendelian randomization (MR) study, which would otherwise be present with 50% and 90% values below LOD. Validation through case studies on estradiol and testosterone levels in the UK Biobank confirmed the simulation results across subgroups with varying proportions of below-LOD measurements. The Linear-Tobit scheme offers optimal detection power and efficiency, with a focus on its applicability to biobank-scale datasets and accuracy in effect estimates to mitigate bias in downstream applications such as MR and polygenic risk scores.
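
The two-step scheme described in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`linear_fit`, `tobit_fit`), the simulated data, the choice to set below-LOD values to the LOD in the screening step, the `5e-8` screening threshold, and the EM fitting routine for the Tobit model are all assumptions made here for illustration.

```python
import math
import random


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def linear_fit(x, y):
    """OLS of y on one predictor; returns (intercept, slope, two-sided p-value)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    # Normal approximation to the t-test, adequate for large n.
    p = math.erfc(abs(slope / se) / math.sqrt(2.0))
    return intercept, slope, p


def tobit_fit(x, y, lod, n_iter=100):
    """Left-censored normal regression fitted by EM; returns (intercept, slope, sigma)."""
    n = len(x)
    intercept, slope, _ = linear_fit(x, y)
    sigma = math.sqrt(
        sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y)) / n
    )
    for _ in range(n_iter):
        filled, extra_var = [], 0.0
        for xi, yi in zip(x, y):
            if yi > lod:
                filled.append(yi)
                continue
            # E-step: mean and variance of a normal truncated above at the LOD.
            mu = intercept + slope * xi
            z = (lod - mu) / sigma
            lam = norm_pdf(z) / norm_cdf(z)
            filled.append(mu - sigma * lam)
            extra_var += sigma ** 2 * (1.0 - z * lam - lam ** 2)
        # M-step: OLS on the filled-in values, variance corrected for censoring.
        intercept, slope, _ = linear_fit(x, filled)
        rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, filled))
        sigma = math.sqrt((rss + extra_var) / n)
    return intercept, slope, sigma


# Simulated toy data (hypothetical): 20 independent SNPs, only SNP 0 is causal.
random.seed(1)
n, n_snps, maf, true_beta = 5000, 20, 0.3, 0.3
geno = [
    [(random.random() < maf) + (random.random() < maf) for _ in range(n)]
    for _ in range(n_snps)
]
y = [true_beta * geno[0][i] + random.gauss(0.0, 1.0) for i in range(n)]
lod = sorted(y)[n // 2]             # about 50% of values fall below the LOD
y_obs = [max(yi, lod) for yi in y]  # assumption: below-LOD values recorded as the LOD

# Step 1: rapid screening of every SNP with linear regression.
screen = [linear_fit(g, y_obs) for g in geno]
hits = [j for j, (_, _, p) in enumerate(screen) if p < 5e-8]

# Step 2: refine the effect estimate of each hit with Tobit regression.
refined = {j: tobit_fit(geno[j], y_obs, lod)[1] for j in hits}
for j in hits:
    print(f"SNP {j}: linear beta = {screen[j][1]:.3f}, Tobit beta = {refined[j]:.3f}")
```

The design point is that the cheap linear fit runs on every variant, while the iterative Tobit fit runs only on the variants that pass screening. In this toy setup the linear slope on LOD-substituted values is shrunk towards zero, and the Tobit refinement recovers an estimate closer to the simulated effect.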

## Full-text entities

- **Chemicals:** estradiol (MESH:D004958), testosterone (MESH:D013739)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12754788/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12754788/full.md

## References

46 references — full list in the complete paper: https://tomesphere.com/paper/PMC12754788/full.md

---
Source: https://tomesphere.com/paper/PMC12754788