# Subset scanning for multi-trait analysis using GWAS summary statistics

**Authors:** Rui Cao, Evan Olawsky, Edward McFowland, Erin Marcotte, Logan Spector, Tianzhong Yang

PMC · DOI: 10.1093/bioinformatics/btad777 · 2024-01-05

## TL;DR

This paper introduces TraitScan, a new algorithm for multi-trait analysis that improves the identification of genetic associations across many traits using GWAS data.

## Contribution

The novel TraitScan algorithm enables efficient multi-trait analysis across large numbers of traits and supports both individual-level and summary-level GWAS data.

## Key findings

- TraitScan outperformed existing methods in testing power and trait selection under low or modest sparsity in simulations.
- Application of TraitScan to UK Biobank data identified promising traits associated with Ewing Sarcoma.
- TraitScan was extended to analyze polygenic risk scores and genetically imputed gene expression.

## Abstract

Multi-trait analysis has been shown to have greater statistical power than single-trait analysis. Most existing multi-trait analysis methods work with only a limited number of traits and usually prioritize high statistical power over identifying relevant traits, so the choice of traits relies heavily on domain knowledge.

To handle diseases and traits with obscure etiology, we developed TraitScan, a powerful and fast algorithm that identifies potential pleiotropic traits from a moderate or large number of traits (e.g. dozens to thousands) and tests the association between one genetic variant and the selected traits. TraitScan can handle either individual-level or summary-level GWAS data. We evaluated TraitScan using extensive simulations and found that it outperformed existing methods in terms of both testing power and trait selection when sparsity was low or modest. We then applied it to search for traits associated with Ewing Sarcoma, a rare bone tumor with peak onset in adolescence, among 754 traits in UK Biobank. Our analysis revealed a few promising traits worthy of further investigation, highlighting the use of TraitScan for more effective multi-trait analysis as biobanks emerge. We also extended TraitScan to search for and test associations with a polygenic risk score and genetically imputed gene expression.

Our algorithm is implemented in an R package “TraitScan” available at https://github.com/RuiCao34/TraitScan.
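
The idea of scanning many traits for one variant can be illustrated with a minimal Python sketch. This is not the TraitScan implementation and does not use the R package's API. It assumes summary-level z-scores for a single variant, treats the traits as independent, and uses a higher criticism score as the scan statistic purely for illustration, since the abstract does not name the statistic. The function names `two_sided_p` and `scan_traits` are hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided p-value of a z-score under a standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def scan_traits(z_scores):
    """Scan nested trait subsets ordered by p-value (illustrative only).

    z_scores: dict mapping trait name -> z-score for ONE variant.
    Assumption: traits are independent; a higher criticism score
    stands in for whatever scan statistic a real method would use.
    Returns (selected_traits, best_score).
    """
    n = len(z_scores)
    pvals = {trait: two_sided_p(z) for trait, z in z_scores.items()}
    ordered = sorted(pvals, key=pvals.get)  # most significant first

    best_score, best_k = float("-inf"), 0
    for k, trait in enumerate(ordered, start=1):
        p = pvals[trait]
        if p <= 0.0 or p >= 1.0:
            continue  # score is undefined at the boundaries
        # Higher criticism: excess of small p-values among the top k.
        score = math.sqrt(n) * (k / n - p) / math.sqrt(p * (1.0 - p))
        if score > best_score:
            best_score, best_k = score, k
    return ordered[:best_k], best_score


# Hypothetical z-scores for one variant across five traits.
example = {"trait_a": 5.0, "trait_b": 4.5, "trait_c": 0.1,
           "trait_d": -0.2, "trait_e": 0.3}
selected, score = scan_traits(example)
print(selected, round(score, 1))
```

Because the candidate subsets are nested prefixes of the sorted list, the scan costs one sort plus a single pass rather than a search over all subsets. The sketch stops at the raw score: turning it into a p-value would require a null calibration step, and correlated traits would need to be handled before scanning.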

## Linked entities

- **Diseases:** Ewing Sarcoma (MONDO:0012817)

## Full-text entities

- **Diseases:** bone tumor (MESH:D001859), Ewing Sarcoma (MESH:D012512)

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11087659/full.md

---
Source: https://tomesphere.com/paper/PMC11087659