# A biologically constrained agent-based model of cancer stem cell dynamics with reinforcement learning-guided adaptive radiotherapy

**Authors:** Mina Lagzian, S. Ehsan Razavi, Reyhane Kardehi Moghaddam

PMC · DOI: 10.1371/journal.pone.0340426 · PLOS One · 2026-02-05

## TL;DR

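This paper couples a biologically constrained agent-based model of tumor growth from a single cancer stem cell (CSC) with a Q-learning agent that adjusts radiation dose, presented as a proof of concept for exploring adaptive radiotherapy strategies.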

## Contribution

A novel integration of biologically constrained agent-based modeling and reinforcement learning for adaptive cancer treatment simulation.

## Key findings

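- A Q-learning component adjusts radiation dosage based on real-time CSC localization and microenvironmental feedback; it is presented as a proof of concept, with optimization and convergence analysis left to future work.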
- The model simulates tumor progression with microenvironmental factors and intra-tumoral heterogeneity.
- The approach shows potential for guiding personalized radiotherapy strategies.
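
As a rough illustration of the kind of agent-based model summarized above, the following Python sketch grows a tumor from a single CSC on a small grid, with an explicit oxygen diffusion step, a hypoxia threshold, and a "no free neighbor, no division" spatial limit. Everything here is an assumption made for illustration: the grid size, the cell-type codes, the rates (`rate`, `uptake`, `p_divide`, `p_symmetric`, `hypoxia`), and the function names are hypothetical and are not taken from the paper. Stochastic migration and cell cycle phases are omitted for brevity.

```python
import random

EMPTY, CSC, TUMOR = 0, 1, 2  # hypothetical cell-type codes


def neighbors(r, c, n):
    """Yield the von Neumann neighbours of (r, c) inside an n x n grid."""
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < n and 0 <= cc < n:
            yield rr, cc


def diffuse_oxygen(oxygen, cells, rate=0.2, uptake=0.05, boundary=1.0):
    """One explicit diffusion step; occupied sites consume oxygen."""
    n = len(oxygen)
    new = [row[:] for row in oxygen]
    for r in range(n):
        for c in range(n):
            if r in (0, n - 1) or c in (0, n - 1):
                new[r][c] = boundary  # grid edge acts as a fixed oxygen source
                continue
            nbrs = list(neighbors(r, c, n))
            laplacian = sum(oxygen[rr][cc] for rr, cc in nbrs) - len(nbrs) * oxygen[r][c]
            value = oxygen[r][c] + rate * laplacian
            if cells[r][c] != EMPTY:
                value -= uptake
            new[r][c] = max(0.0, value)
    return new


def step_cells(cells, oxygen, rng, p_divide=0.5, p_symmetric=0.1, hypoxia=0.2):
    """Let each well-oxygenated cell try to divide into a free neighbour."""
    n = len(cells)
    occupied = [(r, c) for r in range(n) for c in range(n) if cells[r][c] != EMPTY]
    rng.shuffle(occupied)
    for r, c in occupied:
        if oxygen[r][c] < hypoxia or rng.random() > p_divide:
            continue  # hypoxic cells, or cells that fail the draw, stay quiescent
        free = [(rr, cc) for rr, cc in neighbors(r, c, n) if cells[rr][cc] == EMPTY]
        if not free:
            continue  # spatial limitation: no room to divide
        rr, cc = rng.choice(free)
        # A CSC occasionally self-renews symmetrically; otherwise the daughter
        # is an ordinary tumor cell.
        if cells[r][c] == CSC and rng.random() < p_symmetric:
            cells[rr][cc] = CSC
        else:
            cells[rr][cc] = TUMOR


def simulate(n=15, steps=30, seed=0):
    """Grow a tumor from one CSC at the grid centre; return (cells, oxygen)."""
    rng = random.Random(seed)
    cells = [[EMPTY] * n for _ in range(n)]
    oxygen = [[1.0] * n for _ in range(n)]
    cells[n // 2][n // 2] = CSC
    for _ in range(steps):
        oxygen = diffuse_oxygen(oxygen, cells)
        step_cells(cells, oxygen, rng)
    return cells, oxygen
```

Calling `simulate()` returns the final cell grid and oxygen field. The diffusion coefficient is kept below 0.25 so that the explicit four-neighbor update stays numerically stable.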

## Abstract

Cancer stem cells (CSCs) represent a rare but critical subpopulation within tumors, driving recurrence, resistance to therapy, and aggressive growth. To better understand CSC behavior in solid tumors, we developed a biologically constrained agent-based model (ABM) that simulates tumor progression initiated from a single CSC. The model incorporates essential microenvironmental factors (oxygen diffusion, spatial limitations, stochastic migration, and cell cycle dynamics), allowing for high-resolution simulation of tumor development and intra-tumoral heterogeneity. While this work does not aim to fully optimize therapy for clinical application, it provides a flexible, scalable simulation environment where adaptive treatment strategies can be tested. To extend the biological model toward intelligent treatment, we integrated a reinforcement learning (Q-learning) component that adaptively adjusts radiation dosage based on real-time CSC localization and microenvironmental feedback. This component is currently presented as a proof of concept to demonstrate feasibility, and its optimization and convergence analysis will be explored in future studies. Our results suggest that reinforcement learning, when integrated with a biologically grounded ABM, can guide adaptive and more personalized radiotherapy strategies.
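
To make the Q-learning idea in the abstract concrete, here is a minimal tabular Q-learning sketch in Python. It is not the authors' implementation. The state (a discretized CSC burden), the dose levels in `DOSES`, the reward (remaining burden plus a dose penalty), and the `toy_step` environment that stands in for the ABM are all hypothetical choices made for illustration. Only the update rule in `q_update` is the standard Q-learning rule.

```python
import random

DOSES = (0, 1, 2)   # hypothetical dose levels (arbitrary units)
MAX_BURDEN = 5      # hypothetical discretised CSC burden: 0 (cleared) .. 5


def toy_step(burden, dose, rng):
    """Hypothetical stand-in for the ABM: returns (next_burden, reward, done)."""
    killed = sum(1 for _ in range(dose) if rng.random() < 0.7)
    nxt = max(0, burden - killed)
    if nxt > 0 and rng.random() < 0.3:
        nxt = min(MAX_BURDEN, nxt + 1)  # surviving CSCs may regrow
    # Penalise remaining CSC burden and, as a crude toxicity proxy, the dose.
    reward = -float(nxt) - 0.5 * dose
    return nxt, reward, nxt == 0


def q_update(q, s, a, r, s_next, done, alpha=0.1, gamma=0.9):
    """Standard Q-learning update: Q(s,a) += alpha * (target - Q(s,a))."""
    target = r if done else r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def train(episodes=2000, max_steps=30, epsilon=0.1, seed=0):
    """Epsilon-greedy tabular Q-learning on the toy environment."""
    rng = random.Random(seed)
    q = [[0.0] * len(DOSES) for _ in range(MAX_BURDEN + 1)]
    for _ in range(episodes):
        s = MAX_BURDEN
        for _ in range(max_steps):
            if rng.random() < epsilon:
                a = rng.randrange(len(DOSES))  # explore
            else:
                a = max(range(len(DOSES)), key=lambda i: q[s][i])  # exploit
            s_next, r, done = toy_step(s, DOSES[a], rng)
            q_update(q, s, a, r, s_next, done)
            s = s_next
            if done:
                break
    return q
```

After `q = train()`, the greedy dose for a given burden `s` is `DOSES[max(range(len(DOSES)), key=lambda i: q[s][i])]`. In a coupled setup, `toy_step` would be replaced by a call that advances the ABM and derives the state and reward from CSC positions and oxygen levels; how the paper encodes these is not specified in this summary.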

## Linked entities

- **Diseases:** cancer (MONDO:0004992)

## Full-text entities

- **Diseases:** Hypoxic (MESH:D002534), Cancer (MESH:D009369), toxicity (MESH:D064420), tumorigenesis (MESH:D063646), death (MESH:D003643), hypoxia (MESH:D000860)
- **Chemicals:** Oxygen (MESH:D010100)
- **Species:** Homo sapiens (human, species) [taxon 9606], Mus musculus (house mouse, species) [taxon 10090]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12875451/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12875451/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/PMC12875451/full.md

---
Source: https://tomesphere.com/paper/PMC12875451