# APAV: An advanced pangenome analysis and visualization toolkit

**Authors:** Xiaorui Dong, Du Jiao, Hongzhang Xue, Shiyu Fan, Chaochun Wei

PMC · DOI: 10.1371/journal.pcbi.1013288 · 2025-07-07

## TL;DR

APAV is a toolkit for pangenome analysis that detects and visualizes presence/absence variations (PAVs) at a finer scale than traditional gene-level methods.

## Contribution

APAV introduces element-level PAV analysis and interactive visualization for arbitrary genomic regions, improving detection of small but significant variations.

## Key findings

- Element-level analysis of rice genomes identified over 20,000 distributed genes and more than 50,000 distributed genetic elements.
- In tumor genomes, element-level analysis with APAV revealed approximately three times as many phenotype-related genes as gene-level analysis.
- APAV supports interactive reports and subsequent analyses like clustering and genome size estimation based on PAV profiles.
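
To make the idea of clustering samples by PAV profile concrete, here is a minimal sketch that computes pairwise distances between samples from binary presence/absence vectors. The Jaccard distance, the function names, and the toy profiles are all assumptions chosen for illustration; the summary does not state which distance or clustering method APAV uses.

```python
from itertools import combinations
from typing import Dict, List, Tuple

def jaccard_distance(a: List[int], b: List[int]) -> float:
    """Jaccard distance between two binary PAV vectors (1 = present, 0 = absent)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Two samples with no present regions at all are treated as identical.
    return 0.0 if either == 0 else 1.0 - both / either

def pairwise_distances(profiles: Dict[str, List[int]]) -> Dict[Tuple[str, str], float]:
    """Distance for every unordered pair of samples."""
    return {
        (s1, s2): jaccard_distance(profiles[s1], profiles[s2])
        for s1, s2 in combinations(sorted(profiles), 2)
    }

# Toy PAV profiles (hypothetical data): one entry per region, per sample.
profiles = {
    "sample1": [1, 1, 0, 1],
    "sample2": [1, 1, 0, 1],
    "sample3": [0, 1, 1, 0],
}
distances = pairwise_distances(profiles)
```

A distance table like this could then be passed to any standard hierarchical clustering routine; samples with identical profiles end up at distance zero.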

## Abstract

Traditional pangenome analysis focuses on gene presence/absence variations (gene PAVs). However, current methods for gene PAV analysis are insensitive to small but valuable mutations within gene regions, and they overlook variations in intergenic regions. Additionally, visual inspection of PAVs is an important but time-consuming step in pangenome analysis and result interpretation. To address these issues, we present APAV, an advanced toolkit designed for comprehensive PAV analysis and visualization. It integrates gene element-level PAV analysis and provides PAV analysis for arbitrary given regions in a genome. The resulting PAV profile can be visualized and investigated interactively with reports in HTML format, enabling researchers to conveniently verify sequencing read depth, target region coverage, and intervals of absence for each PAV. Furthermore, APAV offers various subsequent analysis and visualization functions based on the PAV profile table, including basic statistics, sample clustering, genome size estimation, and phenotype association analysis. We demonstrated the capability of APAV with pangenome analyses of tumor genomes and rice genomes. Performing PAV analysis at the element level not only provides more accurate information about the variations but also uncovers a larger number of variations for phenotype-genotype association studies. In the rice genome analysis, we identified over twenty thousand distributed genes and more than fifty thousand distributed genetic elements. In the tumor genome analysis, element-level analysis revealed approximately three times as many phenotype-related genes as gene-level analysis. This indicates that changing the PAV unit from genes to smaller segments or elements can lead to more biological insights.
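
The term "distributed" above can be illustrated with a small sketch that labels each row of a PAV profile table. It assumes the common pangenome convention that a region is "core" when present in every sample and "distributed" when present in some but not all; the paper's exact criteria may differ, and the function name and toy table are hypothetical.

```python
from typing import Dict, List

def classify_regions(pav: Dict[str, List[int]]) -> Dict[str, str]:
    """Label each region from its per-sample presence (1) / absence (0) calls.

    Assumed convention: present in all samples -> "core",
    present in some but not all -> "distributed", present in none -> "absent".
    """
    labels = {}
    for region, calls in pav.items():
        n_present = sum(calls)
        if n_present == len(calls):
            labels[region] = "core"
        elif n_present == 0:
            labels[region] = "absent"
        else:
            labels[region] = "distributed"
    return labels

# Toy PAV table (hypothetical data): rows are regions, columns are four samples.
pav_table = {
    "geneA": [1, 1, 1, 1],
    "geneB": [1, 0, 1, 1],
    "geneB_exon2": [1, 0, 0, 1],
}
labels = classify_regions(pav_table)
n_distributed = sum(1 for label in labels.values() if label == "distributed")
```

The same function applies unchanged whether the rows are genes, gene elements, or arbitrary regions, since it only looks at the presence/absence calls.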

A pangenome is the collection of all genetic information of a population, and it can be much more complete than an individual genome. Using a pangenome as the baseline for genomics studies is becoming increasingly prevalent. However, current methods that use a pangenome as the reference focus on gene-level presence/absence variations and therefore ignore variations in smaller but valuable genomic regions, such as exons or gene fragments, which can be a tenth of a gene in size. The tool we present here, APAV, is designed to analyse and visualize presence/absence variations for genomic regions of any size, from genes to any genomic regions of interest. It provides a pipeline suitable for analysing eukaryotic linear pangenomes, expands the target region from genes to any genomic regions of interest, and supports element-level analysis. Additionally, it offers a variety of subsequent analysis and visualization methods for PAVs, as well as an interactive report generation function.
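
The point about elements a tenth of a gene in size can be made concrete with a coverage-based sketch. It assumes a region is called present when at least a chosen fraction of its bases is covered by reads; the 0.5 threshold, the half-open interval convention, the function names, and the toy coordinates are illustrative assumptions, not APAV's documented defaults.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), an assumed convention

def covered_fraction(region: Interval, covered: List[Interval]) -> float:
    """Fraction of `region` overlapped by the union of `covered` intervals."""
    start, end = region
    total = 0
    cursor = start  # bases before `cursor` are already counted or out of range
    for c_start, c_end in sorted(covered):
        c_start = max(c_start, cursor)  # avoid double-counting overlaps
        c_end = min(c_end, end)         # clip to the region
        if c_end > c_start:
            total += c_end - c_start
            cursor = c_end
    return total / (end - start)

def call_presence(region: Interval, covered: List[Interval],
                  min_fraction: float = 0.5) -> bool:
    """Presence call: covered fraction meets the (assumed) threshold."""
    return covered_fraction(region, covered) >= min_fraction

# Hypothetical 1,000 bp gene split into ten 100 bp elements.
gene = (0, 1000)
elements = [(i * 100, (i + 1) * 100) for i in range(10)]
# Reads cover everything except bases 300-399, i.e. exactly one element.
read_coverage = [(0, 300), (400, 1000)]

gene_present = call_presence(gene, read_coverage)
element_calls = [call_presence(e, read_coverage) for e in elements]
```

Here the gene is 90% covered and is called present at the gene level, while the element-level calls flag the single missing element, which is the kind of variation a gene-level profile would hide.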

## Linked entities

- **Diseases:** tumor (MONDO:0005070)

## Full-text entities

- **Diseases:** tumor (MESH:D009369)
- **Species:** Oryza sativa (Asian cultivated rice, species) [taxon 4530]

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12251200/full.md

---
Source: https://tomesphere.com/paper/PMC12251200