# scMix: learning temporal dynamics of gene expression under irregular time intervals

**Authors:** Shangjin Han, Dongsup Kim

PMC · DOI: 10.1093/bioinformatics/btag080 · Bioinformatics · 2026-02-15

## TL;DR

This paper introduces scMix, a new framework for predicting gene expression over time, even when time points are unevenly spaced.

## Contribution

scMix builds on the Receptance Weighted Key Value (RWKV) language-model architecture and adds a delta-time mechanism and trend regularization to improve predictions of gene expression over irregular time intervals.

## Key findings

- scMix outperforms existing methods in predicting gene expression at unmeasured time points.
- The delta-time mechanism reduces error accumulation and improves robustness in predictions.
- The model achieves strong results on downstream tasks related to gene expression.

## Abstract

Understanding temporal gene expression is fundamental to the study of cellular development and differentiation. In practice, experimental constraints limit temporal single-cell datasets to a small number of measured time points, which are often unevenly spaced. Existing methods typically handle these irregular intervals by sequentially predicting one time point after another, yet they lack mechanisms to explicitly model the intervals themselves, which leads to error accumulation.

In this work, we introduce scMix, a language-model-based framework for predicting single-cell gene expression from multiple historical time points. We build scMix on the Receptance Weighted Key Value (RWKV) architecture and use its time decay mechanism to model temporal dependencies. scMix also introduces a delta-time mechanism that allows the model to bypass unmeasured time points, reducing error accumulation and improving robustness. In addition, we incorporate a trend regularization strategy to enhance the temporal coherence of predicted gene expression trajectories. scMix demonstrates state-of-the-art performance in predicting gene expression at unmeasured time points, surpassing existing methods, and also achieves outstanding results on downstream tasks.
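The abstract does not give formulas, so the following is only a minimal sketch of the two ideas under stated assumptions, not the authors' implementation. It assumes that a delta-time mechanism can be approximated by weighting each past observation with an exponential decay over the real elapsed time (rather than the number of steps), and that trend regularization can be approximated by penalizing mismatched step-to-step changes. The names `delta_time_mix`, `trend_penalty`, and `decay_rate` are hypothetical, and RWKV itself learns its decay per channel inside a recurrent state rather than using a single fixed rate.

```python
import math


def delta_time_mix(history, times, target_time, decay_rate=0.5):
    """Blend past expression vectors into one context vector.

    Assumption: each observation is weighted by exp(-decay_rate * gap),
    where gap is the real elapsed time to the target, so unevenly spaced
    samples are handled without stepping through unmeasured time points.
    """
    weights = [math.exp(-decay_rate * (target_time - t)) for t in times]
    total = sum(weights)
    n_genes = len(history[0])
    return [
        sum(w * cell[g] for w, cell in zip(weights, history)) / total
        for g in range(n_genes)
    ]


def trend_penalty(predicted, observed):
    """Mean squared mismatch between consecutive changes of two trajectories.

    Assumption: "trend" is read here as the first difference along time.
    """
    pred_steps = [b - a for a, b in zip(predicted, predicted[1:])]
    obs_steps = [b - a for a, b in zip(observed, observed[1:])]
    squared = [(p - o) ** 2 for p, o in zip(pred_steps, obs_steps)]
    return sum(squared) / len(squared)


# Toy data: two genes measured at irregular times 0, 1 and 4.
history = [[1.0, 4.0], [2.0, 3.0], [4.0, 1.0]]
times = [0.0, 1.0, 4.0]

# Predict context for the unmeasured time 6 directly from all observations.
context = delta_time_mix(history, times, target_time=6.0)
```

Because the weights depend on the time gap itself, the most recent observation (time 4) dominates `context`, while the older samples still contribute. The penalty is zero whenever two trajectories rise and fall by the same amounts, even if they are offset by a constant.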

The code used for this study is available at https://doi.org/10.5281/zenodo.18287184.

## Full-text entities

- **Genes:** tbx16 (T-box transcription factor 16) [NCBI Gene 30264] {aka cb103, id:ibd2185, spt, spt-1, wu:fa17g07, zgc:109817}, top2a (DNA topoisomerase II alpha) [NCBI Gene 323733] {aka fc08c07, wu:fc08c07}, sox3 (SRY-box transcription factor 3) [NCBI Gene 30529] {aka id:ibd2036, sb:cb493, wu:fd02a08, zgc:110279}
- **Chemicals:** PAGA (-)
- **Species:** Mus musculus (house mouse, species) [taxon 10090], Drosophila melanogaster (fruit fly, species) [taxon 7227], Danio rerio (zebrafish, species) [taxon 7955], Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12970592/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12970592/full.md

## References

31 references — full list in the complete paper: https://tomesphere.com/paper/PMC12970592/full.md

---
Source: https://tomesphere.com/paper/PMC12970592