# An Updated Polygenic Index Repository: Expanded Phenotypes, New Cohorts, and Improved Causal Inference

**Authors:** Robel Alemu, Anastasia Terskaya, Matthew Howell, Junming Guan, Harry Sands, Aaron Kleinman, David Bann, Tim Morris, George B. Ploubidis, Emla Fitzsimons, Kathleen Mullan Harris, Avshalom Caspi, David L. Corcoran, Terrie E. Moffitt, Richie Poulton, Karen Sugden, Benjamin S. Williams, Andrew Steptoe, Olesya Ajnakina, Uku Vainik, Tõnu Esko, Archie Campbell, Caroline Hayward, William G. Iacono, Matt McGue, Robert F. Krueger, Anna R. Docherty, Andrey A. Shabalin, Ralph Hertwig, Philipp Koellinger, David Richter, Jan Goebel, Rafael Ahlskog, Sven Oskarsson, Patrik K.E. Magnusson, K. Paige Harden, Elliot M. Tucker-Drob, Charlotte K. L. Pahnke, Carlo Maj, Frank M. Spinath, Pamela Herd, Jeremy Freese, David Laibson, Michelle N. Meyer, Jonathan Jala, David Cesarini, Alexander Strudwick Young, Patrick Turley, Daniel J. Benjamin, Aysu Okbay

PMC · DOI: 10.21203/rs.3.rs-7828579/v1 · Research Square · 2025-10-13

## TL;DR

This paper introduces Version 2 of the Polygenic Index Repository, a collection of DNA-based trait predictors. It expands coverage from 47 to 61 phenotypes and from 11 to 20 datasets, and adds imputed parental PGIs for family-based analyses.

## Contribution

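The paper introduces Version 2 of the Polygenic Index Repository, with expanded phenotype coverage, additional datasets, a more consistent PGI construction methodology, and imputed parental PGIs accompanied by a framework for interpreting analyses that control for them.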

## Key findings

- The repository now includes 61 phenotypes and 20 datasets, up from 47 and 11 in the previous version.
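- PGIs for 16 phenotypes were improved using summary statistics from an updated GWAS meta-analysis with greater statistical power than the original release.
- Imputed parental PGIs are provided in all datasets with first-degree relatives, to improve power for family-based analyses and reduce confounding bias.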

## Abstract

Polygenic indexes (PGIs) — DNA-based predictors of individual phenotypes — have become essential tools across biomedical and social sciences. We introduce Version 2 of the Polygenic Index Repository, which expands phenotype coverage from 47 to 61, increases the number of participating datasets from 11 to 20, and adopts a more consistent and improved methodology for PGI construction. For 16 phenotypes, we leverage summary statistics from an updated GWAS meta-analysis with greater statistical power compared to the original release, thereby improving the PGI’s predictive power. To improve power for family-based analyses, we provide imputed parental PGIs in all datasets with first-degree relatives and offer a framework for interpreting results from analyses that control for parental PGIs. We illustrate the utility of parental PGIs using two applications: (1) comparing PGI associations with and without parental PGI controls for all phenotypes in two Repository datasets with family data, and (2) for BMI and diastolic blood pressure, exploring the contribution of causal versus non-causal components of PGI associations to the imperfect portability of PGIs across subgroups within a genetic ancestry. Collectively, the updates enhance predictive performance, broaden the Repository’s scope, and introduce novel resources that reduce confounding bias and improve interpretability.
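The first application in the abstract, comparing PGI associations with and without parental PGI controls, can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' pipeline: the data are simulated, the parental PGIs are treated as observed rather than imputed, and the effect sizes (`direct`, `indirect`) and all function names are hypothetical choices made for illustration.

```python
import numpy as np


def ols(y, *regressors):
    """Return OLS coefficients (intercept first) via least squares."""
    X = np.column_stack([np.ones_like(y), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def simulate(n=200_000, direct=0.3, indirect=0.1, seed=0):
    """Simulate trios; all parameter values are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    pgi_mother = rng.standard_normal(n)
    pgi_father = rng.standard_normal(n)
    # Offspring PGI: mid-parent value plus random segregation noise,
    # scaled so the offspring PGI also has unit variance.
    segregation = rng.normal(0.0, np.sqrt(0.5), n)
    pgi_child = 0.5 * (pgi_mother + pgi_father) + segregation
    # Phenotype: a direct effect of the child's own PGI plus a
    # non-causal component that tracks the parental PGIs.
    y = (
        direct * pgi_child
        + indirect * (pgi_mother + pgi_father)
        + rng.standard_normal(n)
    )
    return y, pgi_child, pgi_mother, pgi_father


y, child, mother, father = simulate()

# Without parental controls, the PGI coefficient absorbs the non-causal part.
naive = ols(y, child)[1]
# With parental controls, only segregation variation identifies the coefficient.
controlled = ols(y, child, mother, father)[1]

print(f"PGI coefficient without parental controls: {naive:.3f}")
print(f"PGI coefficient with parental controls:    {controlled:.3f}")
```

In this setup the uncontrolled coefficient is `direct + indirect` in expectation, while the controlled coefficient recovers `direct`, because the offspring PGI varies randomly once the parental PGIs are held fixed. The gap between the two estimates is the kind of quantity the with-and-without comparison is meant to expose.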

## Full-text entities

- **Genes:** APOE (apolipoprotein E) [NCBI Gene 348] {aka AD2, APO-E, ApoE4, LDLCQ5, LPG}, BMI1 (BMI1 proto-oncogene, polycomb ring finger) [NCBI Gene 648] {aka FLVI2/BMI1, PCGF4, RNF51, flvi-2/bmi-1}, PTGIS (prostaglandin I2 synthase) [NCBI Gene 5740] {aka CYP8, CYP8A1, PGIS, PTGI}, CD69 (CD69 molecule) [NCBI Gene 969] {aka AIM, BL-AC/P26, CLEC2C, EA1, GP32/28, MLR-3}, CLA3 (cerebellar ataxia 3 (cerebellar parenchyma disorder 1)) [NCBI Gene 1167] {aka CPD1, SCAR6}, HLA-C (major histocompatibility complex, class I, C) [NCBI Gene 3107] {aka D6S204, HLA-JY3, HLAC, HLC-C, MHC, PSORS1}, HDL3 (Huntington-like neurodegenerative disorder 2) [NCBI Gene 53369] {aka HLN2}
- **Diseases:** migraine (MESH:D008881), mental disorders (MESH:D001523), Anorexia Nervosa (MESH:D000856), hay fever (MESH:D006255), asthma (MESH:D001249), Attention Deficit and Hyperactivity Disorder (MESH:D001289), Autism Spectrum Disorder (MESH:D000067877), ID (MESH:C537985), use (MESH:D019966), IBD (MESH:D009105), Cognitive Empathy (MESH:D003072), COPD (MESH:D029424), sex chromosome aneuploidy (MESH:D025064), eczema (MESH:D004485), Alzheimer's (MESH:D000544), insomnia (MESH:D007319), rhinitis (MESH:D012220), PGI (MESH:C566784), Coronary Artery Disease (MESH:D003324)
- **Chemicals:** lipid (MESH:D008055), MTAG (-), triglycerides (MESH:D014280)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12633171/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12633171/full.md

## References

63 references — full list in the complete paper: https://tomesphere.com/paper/PMC12633171/full.md

---
Source: https://tomesphere.com/paper/PMC12633171