StateXDiff: Cell State-Contextualized Multimodal Diffusion for Single-Cell Perturbation Prediction
Peiting Shi, Ningfeng Que, Xianzhe Huang, Xiaofei Wang, and Jianzhong Jeff Xi

TL;DR
StateXDiff is a multimodal diffusion framework that predicts single-cell drug responses by integrating transcriptomic profiles with inferred protein features, with the aim of improving generalization under out-of-distribution (OOD) conditions.
Contribution
The paper introduces a cell state-contextualized multimodal diffusion model that combines disentangled cell-state representations with mechanism-aware drug templates to improve prediction accuracy.
Findings
Outperforms existing models when predicting responses in unseen cell lines.
Effectively models combinatorial drug perturbations.
Enhances generalization to out-of-distribution conditions.
Abstract
Predicting drug-induced cellular state changes at single-cell resolution remains a central challenge in virtual cell modeling, particularly under out-of-distribution (OOD) conditions. Current approaches predominantly rely on RNA-based assays, which often fail to adequately capture the diverse cellular states underlying drug responses. Moreover, conditional distribution shifts and low signal-to-noise ratios frequently cause models to learn spurious correlations rather than genuine state transitions. To address these limitations, we introduce StateXDiff, a cell State-contextualized multimodal (X) Diffusion framework for predicting single-cell responses to drug perturbations. The framework operates sequentially: first, it learns a disentangled, multimodal representation of cellular state by integrating transcriptomic profiles with inferred protein features; second, it employs a conditional…
