# Archipelago Method for Variant Set Association Test Statistics

**Authors:** Dylan Lawless, Ali Saadat, Mariam Ait Oumelloul, Luregn J. Schlapbach, Jacques Fellay

PMC · DOI: 10.1002/gepi.70025 · Genetic Epidemiology · 2026-01-06

## TL;DR

The Archipelago method assigns genomic coordinates to variant set association test (VSAT) P values, so that set-level and individual-variant associations can be read together in a single plot.

## Contribution

Introduces Archipelago, a novel visualization method for variant set association tests that assigns genomic coordinates to P values.

## Key findings

- Archipelago enables intuitive visualization of both set-level and individual variant associations.
- Three validation studies covered simulated and real datasets, from a cohort of 504 individuals up to 490,640 UK Biobank participants.
- The validation studies integrated single-variant GWAS results with gene- and protein pathway-level rare-variant collapse, so both levels can be interpreted in one figure.

## Abstract

Variant set association tests (VSAT), especially those incorporating rare variants via variant collapse, are invaluable in genetic studies. However, unlike Manhattan plots for single‐variant tests, VSAT statistics lack intrinsic genomic coordinates, hindering visual interpretation. To overcome this, we developed the Archipelago method, which assigns a meaningful genomic coordinate to VSAT P values so that both set‐level and individual variant associations can be visualised together. This results in an intuitive and information‐rich illustration akin to an archipelago of clustered islands, enhancing the understanding of both collective and individual impacts of variants. We conducted three validation studies spanning simulated and real datasets across small and biobank‐scale cohorts, from 504 individuals up to 490,640 UK Biobank participants. We integrated single‐variant genome‐wide association studies (GWAS) with gene‐ and protein pathway‐level rare‐variant collapse. These studies included the 1KG GWAS cohort, the Pan‐UK Biobank GWAS with DeepRVAT WES gene‐level study, and the UKBB WGS gene‐level UTR collapsing PheWAS. The Archipelago plot is applicable in any genetic association study that uses variant collapse to evaluate both individual variants and variant sets, and its customisability facilitates clear communication of complex genetic data. By integrating at least two dimensions of genetic data into a single visualisation, VSAT results can be easily read and aid in identification of potential causal variants in variant sets such as protein pathways.
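To make the core idea concrete, the following is a minimal sketch of how a coordinate might be attached to a set-level P value. It is not the authors' implementation: this summary does not state the rule Archipelago uses, so the sketch assumes each set is placed at the mean genome-order index of its member variants. The function name `archipelago_points`, the input layout, and the toy data are all hypothetical.

```python
import math
from statistics import mean


def archipelago_points(variants, variant_sets):
    """Build plot points for variants and variant sets on a shared x-axis.

    variants:     {variant_id: (chrom, pos, p_value)}, chrom as an integer
                  (labels such as "X" would need mapping to integers first).
    variant_sets: {set_id: (member_variant_ids, set_p_value)}

    ASSUMPTION: a set's x-coordinate is the mean genome-order index of its
    members. The paper's actual coordinate rule is not given in this summary.
    """
    # Order variants along the genome and give each a running index.
    ordered = sorted(variants, key=lambda vid: (variants[vid][0], variants[vid][1]))
    x_of = {vid: i for i, vid in enumerate(ordered)}

    # Single-variant points, as in a Manhattan plot.
    points = [
        {"id": vid, "kind": "variant", "x": x_of[vid],
         "y": -math.log10(variants[vid][2])}
        for vid in ordered
    ]

    # Set-level points, placed among their member variants.
    for set_id, (members, set_p) in variant_sets.items():
        points.append({
            "id": set_id,
            "kind": "set",
            "x": mean(x_of[m] for m in members),
            "y": -math.log10(set_p),
            "members": list(members),  # kept so edges to members can be drawn
        })
    return points


# Hypothetical toy data: four variants on two chromosomes, two variant sets.
toy_variants = {
    "v1": (1, 1000, 0.04),
    "v2": (1, 2000, 0.5),
    "v3": (2, 500, 0.001),
    "v4": (2, 900, 0.2),
}
toy_sets = {"setA": (["v1", "v2"], 1e-4), "setB": (["v3", "v4"], 0.01)}
points = archipelago_points(toy_variants, toy_sets)
```

Each returned point carries an `x` and a `y = -log10(P)`, so variants and sets can be drawn on the same axes with any plotting library. Because the index is genome-wide rather than per-chromosome, a set whose members span several chromosomes (for example a protein pathway) still receives a single coordinate under this assumed rule.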

## Full-text entities

- **Genes:** HBG2 (hemoglobin subunit gamma 2) [NCBI Gene 3048] {aka HBG-T1, TNCY}, MOB3C (MOB kinase activator 3C) [NCBI Gene 148932] {aka MOB1E, MOBKL2C}, CES1 (carboxylesterase 1) [NCBI Gene 1066] {aka ACAT, CE-1, CEH, CES2, HMSE, HMSE1}, LPL (lipoprotein lipase) [NCBI Gene 4023] {aka HDLCQ11, LIPD}, HBB (hemoglobin subunit beta) [NCBI Gene 3043] {aka CD113t-C, ECYT6, beta-globin}, PLEKHO2 (pleckstrin homology domain containing O2) [NCBI Gene 80301] {aka PLEKHQ1, PP1628, pp9099}, DOCK5 (dedicator of cytokinesis 5) [NCBI Gene 80005], GP1BA (glycoprotein Ib platelet subunit alpha) [NCBI Gene 2811] {aka BDPLT1, BDPLT3, BSS, CD42B, CD42b-alpha, DBPLT3}, PEAR1 (platelet endothelial aggregation receptor 1) [NCBI Gene 375033] {aka JEDI, MEGF12}, HBE1 (hemoglobin subunit epsilon 1) [NCBI Gene 3046] {aka HBE}, APOA5 (apolipoprotein A5) [NCBI Gene 116519] {aka APOAV, RAP3}, OR51B5 (olfactory receptor family 51 subfamily B member 5) [NCBI Gene 282763] {aka HOR5'Beta5, OR11-37}
- **Diseases:** heart failure (MESH:D006333)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12771271/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12771271/full.md

## References

32 references — full list in the complete paper: https://tomesphere.com/paper/PMC12771271/full.md

---
Source: https://tomesphere.com/paper/PMC12771271