# Integrating trajectory inference and self-explainable predictive models to explore cell state transitions in breast cancer at single-cell resolution

**Authors:** Vanessa Verrina, Marianna Talia, Eugenio Cesario, Santina Capalbo, Domenica Scordamaglia, Rosamaria Lappano, Anna Maria Miglietta, Marcello Maggiolini, Sabrina Giordano

PMC · DOI: 10.3389/fbinf.2026.1672671 · Frontiers in Bioinformatics · 2026-03-04

## TL;DR

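This study profiles over 5,000 cells from an infiltrating ductal breast cancer with single-cell RNA sequencing, then combines pseudotime trajectory inference with interpretable, tree-based machine learning to identify key genes and expression thresholds linked to cell state transitions during tumor progression.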

## Contribution

The paper proposes a framework that combines pseudotime trajectory inference with interpretable, tree-based machine learning, yielding transparent, rule-based insights into tumor progression rather than black-box predictions.

## Key findings

- Six distinct cellular clusters were identified, representing both malignant and tumor microenvironment populations.
- Key genes and expression thresholds associated with transcriptional reprogramming were uncovered.
- The framework provides transparent, rule-based insights into dynamic phenotypic transitions during tumor evolution.

## Abstract

Breast cancer is characterized by a highly heterogeneous cellular environment composed of diverse malignant clones and components of the tumor microenvironment (TME) that collectively influence disease progression. Single-cell RNA sequencing (scRNA-seq) offers a powerful tool to dissect this complexity, enabling high-resolution characterization of tumor heterogeneity and functional interactions within the TME. Moreover, it supports the discovery of clinically relevant subpopulations and potential therapeutic targets.

In this study, we present a novel scRNA-seq dataset from an infiltrating ductal breast cancer, profiling over 5,000 cells and identifying six distinct clusters spanning cancer and TME populations. To explore the molecular drivers of cell state transitions, we integrate pseudotime trajectory inference with interpretable, tree-based machine learning. This combined approach enables the identification of key genes and expression thresholds associated with dynamic phenotypic shifts.

Our analysis identified six distinct cellular clusters representing both malignant and TME populations. The integration of pseudotime inference with interpretable machine learning uncovered key genes and specific expression thresholds associated with transcriptional reprogramming and dynamic phenotypic transitions during tumor evolution.

Unlike black-box models, our framework provides transparent, rule-based insights into transcriptional reprogramming processes underlying tumor progression. The resulting dataset, together with an accessible and transparent analytical pipeline, represents a valuable resource for the breast cancer research community and establishes a foundation for future studies aimed at refining molecular classification and advancing precision therapy development.
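To make the idea of a "gene plus expression threshold" rule concrete, the following is a minimal sketch, not the authors' pipeline. It assumes that a pseudotime value per cell is already available, labels cells as early or late at a hypothetical cutoff of 0.5, and fits a depth-1 decision tree (a stump) by Gini impurity. The gene names (`GENE_A`, `GENE_B`, `GENE_C`), the simulated expression values, and the helper functions `gini`, `best_stump`, and `simulate` are all illustrative assumptions; the paper's actual model, features, and thresholds are described in the full text.

```python
import random


def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_stump(expr, labels):
    """Return (impurity, gene, threshold) for the best single split.

    expr:   dict mapping gene name -> list of expression values, one per cell
    labels: list of 0/1 per cell (0 = early pseudotime, 1 = late pseudotime)
    """
    n = len(labels)
    best = None
    for gene, values in expr.items():
        cuts = sorted(set(values))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(cuts, cuts[1:]):
            thr = (lo + hi) / 2.0
            left = [y for v, y in zip(values, labels) if v <= thr]
            right = [y for v, y in zip(values, labels) if v > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, gene, thr)
    return best


def simulate(n_cells=200, seed=0):
    """Simulate toy data: only GENE_A changes along pseudotime (assumption)."""
    rng = random.Random(seed)
    pseudotime = [rng.random() for _ in range(n_cells)]
    expr = {
        # GENE_A switches from low to high expression halfway along pseudotime.
        "GENE_A": [(3.0 if t > 0.5 else 0.5) + rng.gauss(0, 0.2) for t in pseudotime],
        # GENE_B and GENE_C are noise, unrelated to pseudotime.
        "GENE_B": [rng.gauss(1.0, 1.0) for _ in pseudotime],
        "GENE_C": [rng.gauss(2.0, 0.5) for _ in pseudotime],
    }
    # Hypothetical binarisation of pseudotime into early (0) and late (1) states.
    labels = [1 if t > 0.5 else 0 for t in pseudotime]
    return expr, labels


expr, labels = simulate()
impurity, gene, threshold = best_stump(expr, labels)
print(f"IF {gene} > {threshold:.2f} THEN late state (weighted Gini = {impurity:.3f})")
```

The printed rule is the kind of output that makes a tree-based model self-explanatory: a named gene and a numeric cutoff that a reader can check directly against the expression data. A deeper tree would chain several such splits into a longer rule.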

## Linked entities

- **Diseases:** breast cancer (MONDO:0004989)

## Full-text entities

- **Diseases:** Breast cancer (MESH:D001943), cancer (MESH:D009369), infiltrating (MESH:D017254)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12996216/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12996216/full.md

## References

78 references — full list in the complete paper: https://tomesphere.com/paper/PMC12996216/full.md

---
Source: https://tomesphere.com/paper/PMC12996216