# PyEvoCell: an LLM-augmented single-cell trajectory analysis dashboard

**Authors:** Sachin Mathur, Mathieu Beauvais, Arnau Giribet, Nicolas Aragon Barrero, Chaorui-Tom Zhang, Towsif Rahman, Seqian Wang, Jeremy Huang, Nima Nouri, Andre Kurlovs, Ziv Bar-Joseph, Peyman Passban

PMC · DOI: 10.1093/bioinformatics/btaf158 · 2025-04-10

## TL;DR

PyEvoCell is a dashboard that uses large language models to help researchers analyze and interpret single-cell trajectory data more effectively.

## Contribution

PyEvoCell introduces LLM-augmented analysis for trajectory interpretation, including lineage suggestion and hypothesis validation.

## Key findings

- PyEvoCell uses LLMs to suggest biologically relevant lineages from trajectory inference outputs.
- The dashboard supports differential expression and functional analyses with LLM interpretations.
- A veracity filter validates hypotheses using PubMed citations.

## Abstract

Several methods have been developed for trajectory inference in single-cell studies. However, identifying relevant lineages among several cell types and interpreting the results of downstream analysis remain challenging tasks that require a deep understanding of cell type transitions and progression patterns. There is therefore a need for methods that help researchers analyze and interpret such trajectories.

We developed PyEvoCell, a dashboard for trajectory interpretation and analysis that is augmented by large language model (LLM) capabilities. PyEvoCell applies the LLM to the outputs of trajectory inference methods such as Monocle3 to suggest biologically relevant lineages. Once a lineage is defined, users can conduct differential expression and functional analyses, which are also interpreted by the LLM. Finally, any hypothesis or claim derived from the analysis can be checked with the veracity filter, an LLM-enabled feature that confirms or rejects claims by providing relevant PubMed citations.
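To make the lineage-suggestion step more concrete, here is a minimal sketch of how candidate lineages could be enumerated from a cluster-level trajectory graph and packed into a prompt for an LLM. It is not PyEvoCell's implementation: the graph, the cell type names, and the functions `candidate_lineages` and `build_prompt` are hypothetical, and the sketch stops before any LLM call.

```python
from typing import Dict, List

# Hypothetical cluster-level trajectory graph (parent -> children).
# The cell type names are illustrative, not taken from the PyEvoCell demo data.
TRAJECTORY: Dict[str, List[str]] = {
    "HSC": ["CMP", "CLP"],
    "CMP": ["Monocyte", "Erythrocyte"],
    "CLP": ["B cell"],
}


def candidate_lineages(graph: Dict[str, List[str]], root: str) -> List[List[str]]:
    """Enumerate every root-to-leaf path with an iterative depth-first search."""
    lineages: List[List[str]] = []
    stack = [[root]]
    while stack:
        path = stack.pop()
        # Skip nodes already on the path so that cycles cannot loop forever.
        next_nodes = [c for c in graph.get(path[-1], []) if c not in path]
        if not next_nodes:
            lineages.append(path)  # nothing left to extend: the path is complete
        for child in next_nodes:
            stack.append(path + [child])
    return sorted(lineages)


def build_prompt(lineages: List[List[str]]) -> str:
    """Format candidate lineages as a question that could be sent to an LLM."""
    lines = [" -> ".join(path) for path in lineages]
    return (
        "The following candidate lineages were derived from a trajectory graph.\n"
        "Which of them are biologically plausible, and why?\n"
        + "\n".join(f"{i}. {line}" for i, line in enumerate(lines, start=1))
    )
```

Calling `build_prompt(candidate_lineages(TRAJECTORY, "HSC"))` yields a numbered list of three paths. In a real workflow the graph would come from the trajectory inference output rather than a hand-written dictionary.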

The software is available at https://github.com/Sanofi-Public/PyEvoCell. The repository contains installation instructions, a user manual, demo datasets, and license conditions. Zenodo DOI: https://doi.org/10.5281/zenodo.15114803.

## Full-text entities

- **Genes:** NEAT1 (nuclear paraspeckle assembly transcript 1) [NCBI Gene 283131] {aka LINC00084, NCRNA00084, TP53LC15, TncRNA, VINC}, KRAS (KRAS proto-oncogene, GTPase) [NCBI Gene 3845] {aka 'C-K-RAS, C-K-RAS, CFC2, K-RAS2A, K-RAS2B, K-RAS4A}, EEF1A1 (eukaryotic translation elongation factor 1 alpha 1) [NCBI Gene 1915] {aka CCS-3, CCS3, EE1A1, EEF-1, EEF1A, EF-Tu}, TP53TG1 (TP53 target 1) [NCBI Gene 11257] {aka LINC00096, NCRNA00096, P53TG1, P53TG1-D, TP53AP1, TP53LC12}, RPS4X (ribosomal protein S4 X-linked) [NCBI Gene 6191] {aka CCG2, DXS306, RPS4, S4, SCAR, SCR10}
- **Diseases:** lung cancer (MESH:D008175), LLM (MESH:D007806), TI (MESH:D000077962), hallucinations (MESH:D006212)
- **Mutations:** G12C

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC12014098/full.md

---
Source: https://tomesphere.com/paper/PMC12014098