# Large‐Scale Genotype‐Based Trait Imputation With Multi‐Ancestry GWAS Data

**Authors:** Jingchen Ren, Wei Pan

PMC · DOI: 10.1002/gepi.70030 · 2026-01-15

## TL;DR

This paper introduces two variants of LS-Imputation that integrate multi-ancestry GWAS data to improve trait imputation accuracy in non-European populations, with a focus on imputing Alzheimer's disease status in Black individuals.

## Contribution

The paper proposes two novel LS-Imputation variants that integrate multi-ancestry GWAS data to enhance imputation performance.

## Key findings

- Integrating multi-ancestry GWAS data improves trait imputation accuracy.
- LS-Imputation-Transfer achieves the highest performance in imputing AD status in Black individuals.
- The methods were evaluated on UK Biobank and ADSP data, first on HDL cholesterol levels as a proof of concept and then on AD status.

## Abstract

Genome‐wide association studies (GWAS) have been instrumental in identifying genetic variants associated with complex traits and diseases, including Alzheimer's disease (AD). However, traditional GWAS approaches often focus on European populations, which may lead to loss of power and limit the generalizability of findings across diverse ancestries. On the other hand, LS‐Imputation, a nonparametric trait imputation method, leverages GWAS summary statistics and genotype data to impute missing traits, which can then be used for GWAS and other downstream analyses. Although LS‐Imputation has been applied successfully to European populations, its performance in non‐European populations would be hindered by smaller sample sizes, leading to reduced imputation accuracy. To address these limitations, we propose two novel variants of LS‐Imputation, namely LS‐Imputation‐Combined and LS‐Imputation‐Transfer, designed to integrate multi‐ancestry GWAS data and enhance imputation performance. LS‐Imputation‐Combined optimally combines GWAS summary statistics from multiple ancestries, while LS‐Imputation‐Transfer sequentially refines imputed trait values across ancestries using stochastic gradient descent. We evaluate these methods using data from the UK Biobank and the Alzheimer's Disease Sequencing Project (ADSP), first applying them to high‐density lipoprotein (HDL) cholesterol levels as a proof‐of‐concept before focusing on imputing AD status in Black individuals for genetic association analysis. Our results demonstrate that integrating multi‐ancestry GWAS data improves trait imputation accuracy, with LS‐Imputation‐Transfer achieving the highest performance.
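
The idea of imputing a trait from GWAS summary statistics plus genotypes can be illustrated with a small simulation. The sketch below is not the authors' implementation. It assumes that each summary statistic is a per-SNP simple-regression slope on centered genotypes, so that the trait can be recovered (up to an additive constant) by least squares. The `refine` helper is hypothetical: it uses plain full-batch gradient descent from a warm start as a simplified stand-in for the stochastic gradient descent refinement mentioned in the abstract. All sizes, names, and the noisy starting values are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 200  # hypothetical numbers of individuals and SNPs

# Simulated genotypes coded 0/1/2, column-centered; drop constant SNPs.
G = rng.binomial(2, 0.3, size=(n, p)).astype(float)
X = G - G.mean(axis=0)
X = X[:, (X ** 2).sum(axis=0) > 0]

y = rng.normal(size=n)  # "true" trait, unobserved in a real application

# Assumed form of the summary statistics: marginal slope for each SNP.
ss = (X ** 2).sum(axis=0)
beta = X.T @ y / ss

# Least-squares imputation: find y_hat such that A @ y_hat is close to beta.
# lstsq returns the minimum-norm solution, i.e. the trait centered at zero.
A = X.T / ss[:, None]
y_hat, *_ = np.linalg.lstsq(A, beta, rcond=None)


def refine(A, beta, y_init, steps=2000):
    """Hypothetical helper: gradient descent on 0.5 * ||A y - beta||^2."""
    lr = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / largest eigenvalue of A.T @ A
    y_cur = y_init.copy()
    for _ in range(steps):
        y_cur -= lr * (A.T @ (A @ y_cur - beta))
    return y_cur


# Warm start from noisy values (an assumed stand-in for trait values
# imputed from a different GWAS), then refine against beta.
y_init = (y - y.mean()) + rng.normal(size=n)
y_ref = refine(A, beta, y_init)

corr_ls = np.corrcoef(y_hat, y)[0, 1]
corr_init = np.corrcoef(y_init, y)[0, 1]
corr_ref = np.corrcoef(y_ref, y)[0, 1]
print(corr_ls, corr_init, corr_ref)
```

In this noise-free toy setting with more SNPs than individuals, the least-squares solution matches the centered trait, and the warm-started refinement moves the noisy starting values toward it. With real data, sampling noise in the summary statistics and differences between ancestries would limit how well the trait can be recovered.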

## Linked entities

- **Diseases:** Alzheimer's disease (MONDO:0004975)

## Full-text entities

- **Diseases:** AD (MESH:D000544)
- **Chemicals:** cholesterol (MESH:D002784)

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12805644/full.md

---
Source: https://tomesphere.com/paper/PMC12805644