# An assessment of the genomic structural variation landscape in Sub-Saharan African populations

**Authors:** Emma Wiener, Laura Cottino, Gerrit Botha, Oscar Nyangiri, Harry Noyes, Annette McLeod, David Jakubosky, Clement Adebamowo, Phillip Awadalla, Guida Landouré, Mogomotsi Matshaba, Enock Matovu, Michèle Ramsay, Gustave Simo, Martin Simuunza, Caroline Tiemessen, Ambroise Wonkam, Venesa Sahibdeen, Amanda Krause, Zané Lombard, Scott Hazelhurst

PMC · DOI: 10.21203/rs.3.rs-4485126/v1 · 2024-07-08

## TL;DR

This study catalogues structural variants in 1,091 high-coverage African genomes, reports thousands of variants absent from the Database of Genomic Variants, and provides a dataset for future research.

## Contribution

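The study finds that 6,414 of the 67,795 structural variants identified (9.5%) are novel relative to the Database of Genomic Variants, helping to fill a gap in the representation of African populations in structural variant databases.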

## Key findings

- Analysis of 1,091 African genomes identified 67,795 structural variants.
- 10,421 genes were found to have at least one structural variant.
- 6,414 structural variants (9.5%) were novel compared to the Database of Genomic Variants.
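
The counts reported for this study can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated in the abstract; the variable names are illustrative and not taken from the paper.

```python
# Figures as reported in the abstract.
total_genomes = 1_091
public_genomes = 545          # previously public data sets
newly_analysed_genomes = 546  # analysed for structural variants for the first time

total_svs = 67_795
novel_svs = 6_414             # not found in the Database of Genomic Variants

# The two cohorts should add up to the full sample.
cohorts_sum = public_genomes + newly_analysed_genomes

# Share of structural variants that are novel, as a percentage.
novel_pct = 100 * novel_svs / total_svs

print(f"cohorts sum to {cohorts_sum} genomes")
print(f"novel structural variants: {novel_pct:.1f}%")
```

The novel share works out to roughly 9.46%, which rounds to the 9.5% quoted in the abstract.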

## Abstract

Structural variants are responsible for a large part of genomic variation between individuals and play a role in both common and rare diseases. Databases cataloguing structural variants notably do not represent the full spectrum of global diversity, particularly missing information from most African populations. To address this representation gap, we analysed 1,091 high-coverage African genomes, 545 of which are public data sets and 546 of which have been analysed for structural variants for the first time. Variants were called using five different tools, and the datasets were merged and jointly called using SURVIVOR. We identified 67,795 structural variants throughout the genome, with 10,421 genes having at least one variant. Using a conservative overlap in merged data, 6,414 of the structural variants (9.5%) are novel compared to the Database of Genomic Variants. This study contributes to knowledge of the landscape of structural variant diversity in Africa and presents a reliable dataset for potential applications in population genetics and health-related research.
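
To make the idea of calling a variant "novel" by overlap concrete, here is a minimal sketch of a reciprocal-overlap test against a reference catalogue. It is an illustration only: the abstract does not specify the overlap criterion, so the 50% threshold, the `SV` record, and the functions `reciprocal_overlap` and `is_novel` are all assumptions, and this is neither SURVIVOR's merging logic nor the authors' pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    """Hypothetical structural variant record (0-based, half-open interval)."""
    chrom: str
    start: int
    end: int
    svtype: str  # e.g. "DEL", "DUP", "INV"


def reciprocal_overlap(a: SV, b: SV) -> float:
    """Smaller of the two fractions of each variant covered by the other."""
    if a.chrom != b.chrom or a.svtype != b.svtype:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def is_novel(call: SV, known: list[SV], min_overlap: float = 0.5) -> bool:
    """A call is novel if no known variant reaches the overlap threshold.

    min_overlap=0.5 is an assumed value, not one reported in the paper.
    """
    return all(reciprocal_overlap(call, k) < min_overlap for k in known)


# Toy reference catalogue and two toy calls (made-up coordinates).
known = [SV("chr1", 1_000, 2_000, "DEL")]
matching_call = SV("chr1", 1_100, 2_100, "DEL")  # 90% reciprocal overlap
distinct_call = SV("chr1", 1_900, 5_000, "DEL")  # overlaps only marginally

print(is_novel(matching_call, known))
print(is_novel(distinct_call, known))
```

Requiring the overlap to hold in both directions prevents a very large catalogue entry from "absorbing" every small call that falls inside it. A stricter threshold labels more calls as novel, so the reported novelty rate depends on this choice.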

## Full-text entities

- **Genes:** PTCH1 (patched 1) [NCBI Gene 5727] {aka BCNS, BCNS1, NBCCS, PTC, PTC1, PTCH}, CDT1 (chromatin licensing and DNA replication factor 1) [NCBI Gene 81620] {aka DUP, RIS2}, PTCH2 (patched 2) [NCBI Gene 8643] {aka PTC2, SLC65B2}, GNPTG (N-acetylglucosamine-1-phosphate transferase subunit gamma) [NCBI Gene 84572] {aka C16orf27, GNPTAG, LP2537, RJD9}
- **Diseases:** CNV (MESH:D000092342), SV (MESH:D002303), LUMPY (MESH:D008166), mucolipidosis III (an autosomal recessive condition) (MESH:D009081), BND (MESH:D019457), infectious disease (MESH:D003141), cancer (MESH:D009369), CTX (MESH:D019294), rare diseases (MESH:D035583), genetic diseases (MESH:D030342), NDDs (MESH:D002658), Nevoid Basal cell carcinoma (MESH:D001478)
- **Chemicals:** TrypanoGEN (-)
- **Species:** Homo sapiens (human, species) [taxon 9606], Mobula (genus) [taxon 86365]

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11261963/full.md

---
Source: https://tomesphere.com/paper/PMC11261963