Modeling Gene Expression Distributional Shifts for Unseen Genetic Perturbations

Kalyan Ramakrishnan; Jonathan G. Hedley; Sisi Qu; Puneet K. Dokania; Philip H. S. Torr; Cesar A. Prada-Medina; Julien Fauqueur; Kaspar Martens

arXiv:2507.02980·q-bio.GN·July 8, 2025

TL;DR

This paper introduces a neural network that predicts how gene expression distributions respond to genetic perturbations. The model captures higher-order statistics and generalizes to unseen perturbations using gene embeddings from large language models (LLMs).

Contribution

The paper models full gene expression distributions, including higher-order moments, rather than only mean expression, and incorporates prior knowledge through LLM-derived gene embeddings to generalize to unseen perturbations.

Findings

01. Outperforms baselines in capturing higher-order statistics: variance, skewness, and kurtosis.

02. Predicts gene expression distributions at a fraction of the training cost of baselines.

03. Maintains competitive mean prediction accuracy.

Abstract

We train a neural network to predict distributional responses in gene expression following genetic perturbations. This is an essential task in early-stage drug discovery, where such responses can offer insights into gene function and inform target identification. Existing methods only predict changes in the mean expression, overlooking stochasticity inherent in single-cell data. In contrast, we offer a more realistic view of cellular responses by modeling expression distributions. Our model predicts gene-level histograms conditioned on perturbations and outperforms baselines in capturing higher-order statistics, such as variance, skewness, and kurtosis, at a fraction of the training cost. To generalize to unseen perturbations, we incorporate prior knowledge via gene embeddings from large language models (LLMs). While modeling a richer output space, the method remains competitive in…
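To make the evaluated statistics concrete, here is a minimal sketch of how variance, skewness, and kurtosis could be read off a single gene-level histogram. It is not the authors' code: the function name `histogram_moments`, the use of bin centers as representative values, and the choice to report excess kurtosis are all assumptions made for illustration.

```python
import math


def histogram_moments(bin_centers, probs):
    """Summary statistics of a discrete histogram (hypothetical helper).

    bin_centers: representative expression value of each bin (assumed layout).
    probs: non-negative bin weights; normalized here so they sum to 1.
    """
    total = sum(probs)
    if total <= 0:
        raise ValueError("histogram must have positive total mass")
    p = [w / total for w in probs]

    mean = sum(pi * c for pi, c in zip(p, bin_centers))

    def central(k):
        # k-th central moment of the histogram
        return sum(pi * (c - mean) ** k for pi, c in zip(p, bin_centers))

    variance = central(2)
    if variance == 0:
        # All mass in one bin: skewness and kurtosis are undefined
        skewness = excess_kurtosis = math.nan
    else:
        skewness = central(3) / variance ** 1.5
        excess_kurtosis = central(4) / variance ** 2 - 3.0

    return {
        "mean": mean,
        "variance": variance,
        "skewness": skewness,
        "excess_kurtosis": excess_kurtosis,
    }


# Toy symmetric histogram over three bins
stats = histogram_moments([0.0, 1.0, 2.0], [0.25, 0.5, 0.25])
```

Two perturbations can share the same mean expression while differing in these higher moments, which is the information a mean-only predictor discards.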
