# Hierarchical joint analysis of marginal summary statistics—Part II: High‐dimensional instrumental analysis of omics data

**Authors:** Lai Jiang, Jiayi Shen, Burcu F. Darst, Christopher A. Haiman, Nicholas Mancuso, David V. Conti

PMC · DOI: 10.1002/gepi.22577 · Genetic Epidemiology · 2024-06-17

## TL;DR

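This paper introduces SHA-JAM, a scalable hierarchical model that uses genetic variants as instruments and marginal summary statistics to estimate the conditional effects of high-dimensional omics risk factors (such as metabolites and gene expression) on an outcome, with applied examples in prostate cancer.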

## Contribution

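The contribution is SHA-JAM, a scalable extension of the authors' earlier hierarchical joint analysis of marginal summary statistics (hJAM) to instrumental variable analysis with a large number of intermediates and a large number of correlated genetic variants.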

## Key findings

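- In extensive simulation studies, SHA-JAM yielded a higher AUC, a lower mean-squared error of the estimates, and much faster computation than the one existing approach it was compared against.
- The method was applied to two prostate cancer examples, one for metabolites and one for the transcriptome, using summary statistics from a prostate cancer GWAS of more than 140,000 men together with publicly available high-dimensional summary data.
- SNP-intermediate or SNP-gene expression association estimates enter the hierarchical model as prior information, so the analysis needs only summary statistics.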

## Abstract

Instrumental variable (IV) analysis has been widely applied in epidemiology to infer causal relationships using observational data. Genetic variants can also be viewed as valid IVs in Mendelian randomization and transcriptome‐wide association studies. However, most multivariate IV approaches cannot scale to high‐throughput experimental data. Here, we leverage the flexibility of our previous work, a hierarchical model that jointly analyzes marginal summary statistics (hJAM), to a scalable framework (SHA‐JAM) that can be applied to a large number of intermediates and a large number of correlated genetic variants—situations often encountered in modern experiments leveraging omic technologies. SHA‐JAM aims to estimate the conditional effect for high‐dimensional risk factors on an outcome by incorporating estimates from association analyses of single‐nucleotide polymorphism (SNP)‐intermediate or SNP‐gene expression as prior information in a hierarchical model. Results from extensive simulation studies demonstrate that SHA‐JAM yields a higher area under the receiver operating characteristics curve (AUC), a lower mean‐squared error of the estimates, and a much faster computation speed, compared to an existing approach for similar analyses. In two applied examples for prostate cancer, we investigated metabolite and transcriptome associations, respectively, using summary statistics from a GWAS for prostate cancer with more than 140,000 men and high dimensional publicly available summary data for metabolites and transcriptomes.
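To make the idea of estimating conditional effects from marginal summary statistics concrete, here is a minimal sketch in Python. It is not the authors' SHA-JAM implementation. It assumes standardized genotypes, a known SNP correlation (LD) matrix `R`, a matrix `A` of SNP-intermediate effects, and a vector `b` of marginal SNP-outcome effects, and it solves a generalized least squares problem for the intermediate-outcome effects. The function name `conditional_effects`, the optional ridge penalty (a stand-in for whatever regularization a high-dimensional method would use), and the toy data are all assumptions made for illustration.

```python
import numpy as np


def conditional_effects(b, A, R, ridge=0.0):
    """Estimate conditional effects of k intermediates on an outcome.

    Illustrative assumptions (not taken from the paper):
      b : (p,)   marginal SNP-outcome effects, standardized genotypes
      A : (p, k) SNP-intermediate effects used as prior information
      R : (p, p) SNP correlation (LD) matrix
    Under b ~ N(R A pi, sigma^2 R), generalized least squares gives
    pi_hat = (A' R A + ridge * I)^{-1} A' b.
    """
    b = np.asarray(b, dtype=float)
    A = np.asarray(A, dtype=float)
    R = np.asarray(R, dtype=float)
    lhs = A.T @ R @ A + ridge * np.eye(A.shape[1])
    rhs = A.T @ b
    return np.linalg.solve(lhs, rhs)


# Toy data: 50 correlated SNPs, 3 intermediates, noise-free marginal effects.
rng = np.random.default_rng(0)
p, k = 50, 3
idx = np.arange(p)
R = 0.5 ** np.abs(idx[:, None] - idx[None, :])  # AR(1)-style LD pattern
A = rng.normal(size=(p, k))
pi_true = np.array([0.3, 0.0, -0.2])
b = R @ A @ pi_true  # marginal effects implied by the conditional model

pi_hat = conditional_effects(b, A, R)
pi_ridge = conditional_effects(b, A, R, ridge=10.0)
```

With noise-free input and no penalty, `pi_hat` recovers `pi_true` up to floating-point error, while a positive `ridge` shrinks the estimates toward zero. Real analyses would add sampling noise, an LD matrix estimated from a reference panel, and a principled way to choose the penalty.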

## Linked entities

- **Diseases:** prostate cancer (MONDO:0005159)

## Full-text entities

- **Diseases:** prostate cancer (MESH:D011471)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12333930/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12333930/full.md

## References

82 references — full list in the complete paper: https://tomesphere.com/paper/PMC12333930/full.md

---
Source: https://tomesphere.com/paper/PMC12333930