# Machine Learning-Driven Prediction of Spatiotemporal Dynamics of Active Nuclei During Drosophila Embryogenesis

**Authors:** Parisa Boodaghi Malidarreh, Priyanshi Borad, Biraaj Rout, Anna Makridou, Shiva Abbasi, Mohammad Sadegh Nasr, Jillur Rahman Saurav, Kelli D. Fenelon, Jai Prakash Veerla, Jacob M. Luber, Theodora Koromila

PMC · DOI: 10.3390/ijms262110338 · 2025-10-23

## TL;DR

This paper uses machine learning to predict gene expression patterns in fruit fly embryos, showing how spatial context influences gene regulation.

## Contribution

The study introduces a machine learning pipeline that classifies active nuclei and forecasts their future distribution during Drosophila embryogenesis, evaluated on two sog reporter constructs, sogD and sogD_∆Su(H).

## Key findings

- The model achieves high accuracy on the wild-type sogD dataset, particularly along the dorsal–ventral (DV) axis during nuclear cycle 14 (NC14).
- Bootstrap analysis shows that prediction accuracy is higher in the central region of the expression domain than at its edges, for both sogD and sogD_∆Su(H).
- The results are consistent with context-dependent roles of transcription factors such as Su(H), which acts as a repressor at the borders of the sog expression domain and as a stabilizer of transcriptional bursts in its center.
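
The central-versus-edge comparison in the findings above can be illustrated with a percentile bootstrap over per-nucleus prediction outcomes. This is a minimal sketch on synthetic data, not the authors' pipeline: the function name `bootstrap_accuracy`, the sample sizes, and the accuracy levels (0.9 and 0.7) are assumptions chosen only for illustration.

```python
import numpy as np


def bootstrap_accuracy(correct, n_boot=1000, seed=0):
    """Percentile bootstrap for classification accuracy.

    `correct` is a 1-D boolean array, True where the predicted state of a
    nucleus (active / inactive) matched its label.
    Returns (mean accuracy, lower 2.5% bound, upper 97.5% bound).
    """
    rng = np.random.default_rng(seed)
    correct = np.asarray(correct, dtype=float)
    n = correct.size
    # Resample nuclei with replacement and recompute accuracy per replicate.
    idx = rng.integers(0, n, size=(n_boot, n))
    accs = correct[idx].mean(axis=1)
    lower, upper = np.percentile(accs, [2.5, 97.5])
    return accs.mean(), lower, upper


# Synthetic outcomes (hypothetical, not data from the paper).
rng = np.random.default_rng(42)
center_correct = rng.random(500) < 0.9  # assumed accuracy in the domain center
edge_correct = rng.random(500) < 0.7    # assumed accuracy at the domain edges

center_stats = bootstrap_accuracy(center_correct)
edge_stats = bootstrap_accuracy(edge_correct)

print("center: mean=%.3f, 95%% CI=(%.3f, %.3f)" % center_stats)
print("edge:   mean=%.3f, 95%% CI=(%.3f, %.3f)" % edge_stats)
```

Non-overlapping intervals for the two regions would support a real difference in accuracy; overlapping intervals would suggest the gap could be sampling noise.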

## Abstract

In this study, we apply machine learning to model the spatiotemporal dynamics of gene expression during early Drosophila embryogenesis. By optimizing model architecture, feature selection, and spatial grid resolution, we developed a predictive pipeline capable of accurately classifying active nuclei and forecasting their future distribution in time. We evaluated the model on two reporter constructs for the short gastrulation (sog) gene, sogD and sogD_∆Su(H), allowing us to assess its performance across distinct genetic contexts. The model achieved high accuracy on the wild-type sogD dataset, particularly along the dorsal–ventral (DV) axis during nuclear cycle 14 (NC14), and accurately predicted expression in the central regions of both wild-type and Suppressor of Hairless (Su(H)) mutant enhancers, sogD_∆Su(H). Bootstrap analysis confirmed that the model performed better in the central region than at the edges, where prediction accuracy dropped. Our previous work showed that Su(H) can act both as a repressor at the borders and as a stabilizer of transcriptional bursts in the center of the sog expression domain. This dual function is not unique to Su(H); other broadly expressed transcription factors also exhibit context-dependent regulatory roles, functioning as activators in some regions and repressors in others. These results highlight the importance of spatial context in transcriptional regulation and demonstrate the ability of machine learning to capture such nuanced behavior. Looking ahead, incorporating mechanistic features such as transcriptional bursting parameters into predictive models could enable simulations that forecast not just where genes are expressed but also how their dynamics unfold over time. 
This form of in silico enhancer mutagenesis would make it possible to predict the effects of specific binding site changes on both spatial expression patterns and underlying transcriptional activity, offering a powerful framework for studying cis-regulatory logic and modeling early developmental processes across diverse genetic backgrounds.
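
The abstract mentions optimizing spatial grid resolution but does not describe the procedure. As a rough sketch of what gridding nuclei might look like, the example below bins synthetic nuclei into a regular grid and computes the fraction of active nuclei per cell. The function name `active_fraction_grid`, the normalization of coordinates to [0, 1], the choice of `dv` as the dorsal–ventral coordinate, and the 8 × 8 resolution are all assumptions, not details from the paper.

```python
import numpy as np


def active_fraction_grid(ap, dv, active, n_bins_ap, n_bins_dv):
    """Fraction of active nuclei in each cell of a regular spatial grid.

    `ap` and `dv` are nucleus coordinates, assumed normalized to [0, 1];
    `active` is a boolean array marking active nuclei.
    Cells that contain no nuclei are returned as NaN.
    """
    ap = np.asarray(ap, dtype=float)
    dv = np.asarray(dv, dtype=float)
    active = np.asarray(active, dtype=bool)
    ap_edges = np.linspace(0.0, 1.0, n_bins_ap + 1)
    dv_edges = np.linspace(0.0, 1.0, n_bins_dv + 1)
    # Count all nuclei and active nuclei per grid cell.
    total, _, _ = np.histogram2d(ap, dv, bins=[ap_edges, dv_edges])
    act, _, _ = np.histogram2d(ap[active], dv[active], bins=[ap_edges, dv_edges])
    with np.errstate(invalid="ignore", divide="ignore"):
        frac = np.where(total > 0, act / total, np.nan)
    return frac


# Synthetic embryo (hypothetical): an active band centered on dv = 0.5.
rng = np.random.default_rng(0)
ap = rng.random(2000)
dv = rng.random(2000)
active = (np.abs(dv - 0.5) < 0.15) & (rng.random(2000) < 0.9)

grid = active_fraction_grid(ap, dv, active, n_bins_ap=8, n_bins_dv=8)
print(np.round(np.nanmean(grid, axis=0), 2))  # mean active fraction per dv bin
```

A finer grid resolves the edges of the band more sharply but leaves fewer nuclei per cell, which makes each cell's estimate noisier; that trade-off is one plausible reason grid resolution would need tuning.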

## Linked entities

- **Genes:** sog (short gastrulation) [NCBI Gene 32498], RBPJ (recombination signal binding protein for immunoglobulin kappa J region) [NCBI Gene 3516]
- **Species:** Drosophila (taxon 7215)

## Full-text entities

- **Species:** Drosophila melanogaster (fruit fly, species) [taxon 7227]

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12610212/full.md

---
Source: https://tomesphere.com/paper/PMC12610212