# biotextgraph: graphical summarization of functional similarities from textual information

**Authors:** Noriaki Sato, Yao-zhong Zhang, Zuguang Gu, Seiya Imoto

PMC · DOI: 10.1093/bioinformatics/btae357 · Bioinformatics · 2024-06-08

## TL;DR

The biotextgraph R package helps visualize and analyze the functional similarities of biological entities using textual data from public databases.

## Contribution

A new R package, biotextgraph, for graphical summarization of textual information to assess functional similarities of biological entities.

## Key findings

- Visualization of textual data reveals biologically meaningful terms not found in pathway databases alone.
- The package is useful for routine analysis of omics-related data and complements enrichment analysis.
- A web-based application is provided for convenient querying and analysis.

## Abstract

Functional interpretation of biological entities, such as differentially expressed genes, is a fundamental analysis in bioinformatics. The task can be addressed with enrichment analysis (EA) against biological pathway databases. However, the textual descriptions of biological entities in public databases are less explored and less well integrated into existing tools, although they have the potential to reveal new mechanisms. Here, we present biotextgraph, a new R package for the graphical summarization of textual descriptions associated with omics data, which enables assessment of the functional similarities of lists of biological entities. We illustrate applications that annotate gene identifiers in addition to EA. The results suggest that word-based visualization and text-based inspection of biological entities can reveal biologically meaningful terms that could not be obtained from biological pathway databases alone, and that the package is useful in the routine analysis of omics-related data. The package also offers a web-based application for convenient querying.

The package, documentation, and web server are available at: https://github.com/noriakis/biotextgraph.
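
To make the idea of summarizing textual descriptions more concrete, the following is a minimal sketch in Python. It is not the biotextgraph API, and it is not the authors' method. It only illustrates the general technique of counting words across entity descriptions and linking words that co-occur, which is the kind of data a word cloud or word network is drawn from. The identifiers `geneA` to `geneC`, their descriptions, the stopword list, and all function names are hypothetical placeholders invented for this example.

```python
from collections import Counter
from itertools import combinations
import re

# Toy, made-up descriptions keyed by placeholder identifiers.
# These are NOT real database entries; they exist only for illustration.
DESCRIPTIONS = {
    "geneA": "DNA repair protein involved in cell cycle checkpoint",
    "geneB": "cell cycle regulator required for DNA replication",
    "geneC": "kinase involved in DNA repair and cell cycle arrest",
}

# Hypothetical stopword list; a real analysis would use a curated one.
STOPWORDS = {"in", "for", "and", "the", "of", "a", "involved", "required"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def word_frequencies(descriptions):
    """Count the number of descriptions in which each word appears."""
    counts = Counter()
    for text in descriptions.values():
        counts.update(set(tokenize(text)))
    return counts


def cooccurrence_edges(descriptions, min_count=2):
    """Return word pairs that share at least `min_count` descriptions."""
    pairs = Counter()
    for text in descriptions.values():
        words = sorted(set(tokenize(text)))
        pairs.update(combinations(words, 2))
    return {pair: n for pair, n in pairs.items() if n >= min_count}


freq = word_frequencies(DESCRIPTIONS)
edges = cooccurrence_edges(DESCRIPTIONS)

# Words shared by many entities hint at a common function.
for word, n in freq.most_common(4):
    print(word, n)
```

In this sketch, the frequency table would size the words in a word cloud, and the co-occurrence pairs would form the edges of a word network. Sorting the words before pairing keeps each pair in a single canonical order, so the same two words are never counted under two different keys.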

## Full-text entities

- **Diseases:** bladder cancer (MESH:D001749), infection (MESH:D007239), Crohn's disease (MESH:D003424)
- **Chemicals:** sialic acids (MESH:D012794)
- **Species:** Homo sapiens (human, species) [taxon 9606], Betapolyomavirus hominis (species) [taxon 1891762]
- **Cell lines:** RPTEC: Homo sapiens (Human), Telomerase immortalized cell line (CVCL_K278); S2: Drosophila melanogaster (Fruit fly), Spontaneously immortalized cell line (CVCL_Z232)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11198732/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC11198732/full.md

## References

34 references — full list in the complete paper: https://tomesphere.com/paper/PMC11198732/full.md

---
Source: https://tomesphere.com/paper/PMC11198732