# ProTaxoVis—protein taxonomic visualisation of presence

**Authors:** Yin-Chen Hsieh, Mathias Bockwoldt, Ines Heiland

PMC · DOI: 10.1186/s12859-025-06146-9 · BMC Bioinformatics · 2025-05-19

## TL;DR

ProTaxoVis is an open-source tool that extracts protein presence information from database queries and maps it onto taxonomic trees and heatmaps. The resulting overview of protein distribution patterns can be used to study pathway evolution.

## Contribution

ProTaxoVis combines and filters sequence query results, then maps protein presence onto taxonomic trees (with scaled pie charts per node) and heatmaps to support comparative pathway analysis.

## Key findings

- ProTaxoVis generates taxonomic trees in which each node carries a scaled pie chart showing the presence of target proteins and of combinations of these proteins.
- The tool creates heatmaps to visualize protein presence and conservation across species.
- ProTaxoVis was evaluated using phosphoribosyltransferases, revealing distinct distribution patterns across domains of life.
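
The per-node pie charts in the first finding can be pictured as the share of species in a clade that carry each combination of target proteins. The following is a minimal sketch of that aggregation only, not ProTaxoVis's actual code: the function name `pie_fractions`, the input layout, and the species and clade names are all hypothetical.

```python
from collections import Counter

# Hypothetical input: species -> (clade, set of target proteins found).
# Names and memberships are invented for illustration.
species_presence = {
    "species_a": ("clade_1", {"NAMPT", "NAPRT"}),
    "species_b": ("clade_1", {"NAPRT"}),
    "species_c": ("clade_1", {"NAPRT"}),
    "species_d": ("clade_2", set()),
    "species_e": ("clade_2", {"NAMPT"}),
}


def pie_fractions(species_presence):
    """Return, per clade, the fraction of species with each protein combination."""
    counts = {}
    for clade, proteins in species_presence.values():
        # Label each species by the sorted combination of proteins it carries.
        combo = "+".join(sorted(proteins)) or "none"
        counts.setdefault(clade, Counter())[combo] += 1
    # Normalise the counts so that the slices of each pie sum to 1.
    return {
        clade: {combo: n / sum(c.values()) for combo, n in c.items()}
        for clade, c in counts.items()
    }


fractions = pie_fractions(species_presence)
```

Each inner dictionary corresponds to one pie: its keys are the slices and its values are the slice sizes. Scaling the pie by the number of species in the clade would be a separate step.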

## Abstract

Protein presence information is an essential component of biological pathway identification. Presence of certain enzymes in an organism points towards the metabolic pathways that occur within it, whereas the absence of these enzymes indicates either the existence of alternative pathways or a lack of these pathways altogether. The same inference applies to regulatory pathways such as gene regulation and signal transduction. Protein presence information therefore forms the basis for biological pathway studies, and patterns in presence-absence across multiple organisms allow for comparative pathway analyses.

Here we present ProTaxoVis, a novel bioinformatic tool that extracts protein presence information from database queries and maps it to a taxonomic tree or heatmap. ProTaxoVis generates a large-scale overview of presence patterns in taxonomic clades of interest. This overview reveals protein distribution patterns, which can be used to deduce pathway evolution or to probe other biological questions. ProTaxoVis combines and filters sequence query results to extract information on the distribution of proteins and translates this information into two types of visual output: taxonomic trees and heatmaps. The trees supplement their topology with scaled pie-chart representations per node of the presence of target proteins and combinations of these proteins, such that patterns in taxonomic groups can easily be identified. The heatmap visualisation shows presence and conservation of these proteins for a user-determined set of species, allowing for a more detailed view of a larger group of proteins than the trees provide. ProTaxoVis also allows for visual quality checks of hits based on a coverage plot and a length histogram, which can be used to determine e-value and minimum protein length cutoffs. Tabular outputs from the query, combining, and heatmap-building steps are saved and easily accessible for further analyses.

We evaluate our tool with the phosphoribosyltransferases, a transferase enzyme family with notable distribution patterns amongst organisms of varying complexities and across Eukaryota, Bacteria, and Archaea. ProTaxoVis is open-source and available at: https://github.com/MolecularBioinformatics/ProTaxoVis.
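
The filtering and combining step described in the abstract can be illustrated with a short sketch: hits are discarded when they fail an e-value or minimum-length cutoff, and the survivors are collapsed into a species-by-protein presence table of the kind a heatmap could be drawn from. This is an assumption-laden illustration, not the tool's implementation. The function `presence_table`, the record layout, the default thresholds, and every e-value and length in the sample data are invented.

```python
from collections import defaultdict

# Hypothetical hit records: (protein, species, e_value, hit_length).
# All numbers are made up for illustration, not taken from the paper.
hits = [
    ("NAMPT", "Homo sapiens", 1e-120, 491),
    ("NAMPT", "Escherichia coli", 1e-3, 80),
    ("NAPRT", "Escherichia coli", 1e-90, 400),
    ("NAPRT", "Homo sapiens", 1e-80, 538),
    ("NAPRT", "Homo sapiens", 1e-60, 520),
    ("NAPRT", "Saccharomyces cerevisiae", 1e-70, 429),
]


def presence_table(hits, max_evalue=1e-30, min_length=200):
    """Filter hits by cutoffs and keep the best e-value per species and protein."""
    best = {}
    for protein, species, evalue, length in hits:
        # Drop hits that fail either cutoff (assumed thresholds).
        if evalue > max_evalue or length < min_length:
            continue
        key = (species, protein)
        # Combine duplicate hits by keeping the lowest e-value.
        if key not in best or evalue < best[key]:
            best[key] = evalue
    # Reshape into species -> {protein: best e-value}; a missing key means absent.
    table = defaultdict(dict)
    for (species, protein), evalue in best.items():
        table[species][protein] = evalue
    return dict(table)


table = presence_table(hits)
```

In practice the cutoffs would be chosen after inspecting the coverage plot and length histogram mentioned above, rather than fixed in advance as they are here.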

## Linked entities

- **Species:** Eukaryota (taxon 2759), Bacteria (taxon 2), Archaea (taxon 2157)

## Full-text entities

- **Genes:** APRT (adenine phosphoribosyltransferase) [NCBI Gene 353] {aka AMP, APRTD}, HPRT1 (hypoxanthine phosphoribosyltransferase 1) [NCBI Gene 3251] {aka HGPRT, HPRT}, LYPLA2P1 (LYPLA2 pseudogene 1) [NCBI Gene 653639] {aka APT, LYPLA2L, dJ570F3.6}, NAPRT (nicotinate phosphoribosyltransferase) [NCBI Gene 93100] {aka NAPRT1, PP3856}, MTOR (mechanistic target of rapamycin kinase) [NCBI Gene 2475] {aka FRAP, FRAP1, FRAP2, RAFT1, RAPT1, SKS}, NAMPT (nicotinamide phosphoribosyltransferase) [NCBI Gene 10135] {aka 1110035O14Rik, PBEF, PBEF1, VF, VISFATIN}, STAC3 (SH3 and cysteine rich domain 3) [NCBI Gene 246329] {aka CMYO13, CMYP13, MYPBB, NAM}, UPRT (uracil phosphoribosyltransferase homolog) [NCBI Gene 139596] {aka FUR1, UPP}, QPRT (quinolinate phosphoribosyltransferase) [NCBI Gene 23475] {aka HEL-S-90n, QPRTase}
- **Diseases:** PRTs (MESH:D007926)
- **Chemicals:** hypoxanthine (MESH:D019271), guanosine monophosphate (MESH:D006157), guanine (MESH:D006147), uracil (MESH:D014498), inosine monophosphate (MESH:D007291), Quinolinic acid (MESH:D017378), PRT (-), NA (MESH:D009525), purine (MESH:C030985), adenosine triphosphate ATP (MESH:D000255), adenosine monophosphate (MESH:D000249), NAD (MESH:D009243), uridine monophosphate (MESH:D014542), uridine triphosphate UTP (MESH:D014544)
- **Species:** Escherichia coli (E. coli, species) [taxon 562], Saccharomyces cerevisiae (baker's yeast, species) [taxon 4932], Planctomycetota (phylum) [taxon 203682], Chlorobiota (green sulfur bacteria, phylum) [taxon 1090], Fibrobacteria (class) [taxon 204430], Verrucomicrobiota (phylum) [taxon 74201], Acidobacteriota (phylum) [taxon 57723], Homo sapiens (human, species) [taxon 9606], Amoebozoa (amoebozoans, clade) [taxon 554915], Bacteria [taxon 2], Drosophila melanogaster (fruit fly, species) [taxon 7227]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12087122/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12087122/full.md

---
Source: https://tomesphere.com/paper/PMC12087122