# Chromosome-scale genome assembly and annotation of two geographically distinct strains of malaria vector Anopheles albimanus

**Authors:** Dieunel Derilus, Gareth D. Weedall, Michael W. Vandewege, Dhwani Batra, Mili Sheth, Lori A. Rowe, Ananias A. Escalante, Audrey Lenhart, Lucy Mackenzie Impoinvil

PMC · DOI: 10.1038/s41598-025-01713-9 · Scientific Reports · 2025-06-03

## TL;DR

This study presents chromosome-scale genome assemblies for two geographically distinct strains of the malaria vector Anopheles albimanus, Stecla (El Salvador) and Cartagena (Colombia), as a resource for studying the species' genetic variation and biology.

## Contribution

The study provides improved chromosome-scale genome assemblies for two Anopheles albimanus strains, built with a hybrid approach that combines long-read PacBio and short-read Illumina sequencing.

## Key findings

- The genomes of the Stecla and Cartagena strains were assembled with high completeness and contain fewer gaps and ambiguous bases than either of the two previously published reference genomes.
- The two assemblies share 98.12% pairwise identity, and synteny analyses suggest that gene positions are conserved between the strains.
- TE analyses indicate that the long-read assemblies captured more repetitive content, including transposable elements.

## Abstract

Anopheles albimanus is one of the principal malaria vectors in the Americas and exhibits phenotypic variation across its geographic distribution. High-quality reference genomes from geographically distant populations are essential to deepen our understanding of the biology, evolution, and genetic variation of this important malaria vector. In this study, we applied long-read PacBio and short-read Illumina sequencing technologies to assemble the complete genomes of two reference strains of An. albimanus, Stecla (originating from El Salvador) and Cartagena (originating from Colombia), and investigated the structural features of these genomes, including gene content, transposable elements (TEs), genetic variation, and structural rearrangements. Our hybrid assembly approach generated reference-quality genomes for each strain and recovered ~96% of the expected genome size. The genome assemblies of Stecla and Cartagena consisted of 109 and 149 scaffolds, with estimated genome sizes of 167.5 Mbp (N50 = 88 Mbp) and 167.1 Mbp (N50 = 87 Mbp), respectively. They exhibited a high level of completeness and contained fewer gaps and ambiguous bases than either of the two previously published reference genomes for this species, suggesting a considerable improvement in the quality and completeness of the assemblies. A total of 12,082 and 12,120 protein-coding genes were predicted in Stecla and Cartagena, respectively. TE analyses indicated that more repetitive content was captured in the long-read assemblies. The assembled genomes shared 98.12% pairwise identity, and synteny analyses suggested that gene position was conserved between the two strains. These newly assembled genomes will serve as an important resource for future research in comparative genomics, proteomics, epigenetics, transcriptomics, and functional analysis of this important malaria vector.
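The N50 values quoted above follow the usual assembly-statistics definition: the length of the scaffold at which the running total, taken from the longest scaffold downward, first reaches half of the assembly size. The snippet below is a minimal sketch of that calculation. The function name `n50` and the toy scaffold lengths are illustrative assumptions, not data or code from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    N50 is the length L such that scaffolds of length >= L
    together cover at least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)  # longest scaffold first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no scaffold lengths given")


# Hypothetical scaffold lengths (arbitrary units), not values from the paper.
toy_lengths = [80, 50, 30, 20, 10, 10]
print(n50(toy_lengths))  # total is 200; 80 + 50 = 130 >= 100, so this prints 50
```

An N50 of 88 Mbp on a 167.5 Mbp assembly therefore means that scaffolds of at least 88 Mbp account for at least half of the assembled sequence, which is what one would expect when a few chromosome-scale scaffolds dominate the assembly.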

## Linked entities

- **Diseases:** malaria (MONDO:0005136)
- **Species:** Anopheles albimanus (taxon 7167)

## Full-text entities

- **Diseases:** malaria (MESH:D008288)
- **Species:** Anopheles albimanus (species) [taxon 7167]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12134381/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12134381/full.md

---
Source: https://tomesphere.com/paper/PMC12134381