What Makes a Representation Good for Single-Cell Perturbation Prediction?

Wenkang Jiang; Yuhang Liu; Yichao Cai; Erdun Gao; Jiayi Dong; Ehsan Abbasnejad; Lina Yao; Javen Qinfeng Shi

arXiv:2605.19343·cs.LG·May 20, 2026

TL;DR

PerturbedVAE is a framework that explicitly disentangles sparse perturbation-specific signals from the perturbation-invariant structure that dominates gene expression, improving prediction accuracy and interpretability in single-cell perturbation modeling.

Contribution

The paper introduces PerturbedVAE, a method that separates perturbation-specific information from the dominant invariant structure, and provides an identifiability analysis for recovering sparse perturbation effects.

Findings

01. Achieves state-of-the-art performance on benchmark datasets.

02. Significantly improves out-of-distribution prediction accuracy.

03. Uncovers interpretable perturbation-response programs.

Abstract

Single-cell perturbation modeling is fundamental for understanding and predicting cellular responses to genetic perturbations. However, existing approaches, from causal representation learning to foundation models, often struggle with an overlooked challenge: gene expression is dominated by perturbation-invariant information, while perturbation-specific signals are intrinsically sparse. As a result, learned representations either entangle invariant and perturbation-specific information, leading to spurious and non-generalizable predictors, or suppress perturbation-specific signals altogether, rendering them ineffective for prediction. To address this, we propose PerturbedVAE, a general framework designed to resolve this signal imbalance. The framework explicitly separates perturbation-specific information from dominant invariant structure and recovers causal representations to…
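The signal imbalance described in the abstract can be illustrated with a small synthetic example. The sketch below is not PerturbedVAE and uses no real data. It simulates expression in which a low-rank invariant structure is shared by control and perturbed cells, while the perturbation shifts only a handful of genes, and then measures how little of the total variance the perturbation label accounts for. All names (`simulate`, `perturbation_variance_fraction`) and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def simulate(n_cells=500, n_genes=200, n_programs=10, n_affected=5,
             effect=2.0, seed=0):
    """Toy data: dominant invariant structure plus a sparse perturbation shift.

    Every value here is an arbitrary illustrative assumption.
    """
    rng = np.random.default_rng(seed)
    # Invariant structure: gene programs shared by all cells, perturbed or not.
    programs = rng.normal(size=(n_programs, n_genes))

    def sample_cells():
        cell_state = rng.normal(size=(n_cells, n_programs))
        noise = rng.normal(scale=0.5, size=(n_cells, n_genes))
        return cell_state @ programs + noise

    control = sample_cells()
    # Sparse perturbation effect: only the first n_affected genes are shifted.
    shift = np.zeros(n_genes)
    shift[:n_affected] = effect
    perturbed = sample_cells() + shift
    return control, perturbed, shift


def perturbation_variance_fraction(control, perturbed):
    """Between-condition sum of squares divided by total sum of squares."""
    pooled = np.vstack([control, perturbed])
    grand_mean = pooled.mean(axis=0)
    between = sum(
        len(group) * np.sum((group.mean(axis=0) - grand_mean) ** 2)
        for group in (control, perturbed)
    )
    total = np.sum((pooled - grand_mean) ** 2)
    return between / total


control, perturbed, shift = simulate()
frac = perturbation_variance_fraction(control, perturbed)
print(f"genes shifted by the perturbation: {np.count_nonzero(shift)} of {shift.size}")
print(f"fraction of variance tied to the perturbation label: {frac:.4f}")
```

In this toy setting, a representation trained only to reconstruct expression would spend nearly all of its capacity on the invariant programs, which is the failure mode the abstract attributes to existing approaches.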
