Donor-Aware scRNA-seq Benchmarks for IBD Classification

Jonathan Muhire

arXiv:2605.03281·q-bio.QM·May 6, 2026


TL;DR

The paper introduces a donor-aware benchmark for donor-level IBD classification from scRNA-seq data and argues that compartment-aware cell-type composition features improve both accuracy and interpretability.

Contribution

The paper evaluates three feature representations (CLR-transformed cell-type composition, GatedStructuralCFN dependency embeddings, and scVI latent embeddings) across two IBD cohorts, SCP259 and Kong 2023, under donor-aware cross-validation, and highlights the effectiveness of compartment-stratified features.

Findings

01

Compartment-stratified CLR composition achieves AUROC 0.956 +/- 0.061 on the SCP259 ulcerative colitis cohort (n=30 donors).

02

GatedStructuralCFN embeddings outperform linear models in the colon region of the Kong cohort.

03

Cross-dataset transfer from Crohn's disease to UC reaches an AUROC of 0.833.
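
The CLR composition features named in the findings above can be illustrated with a short sketch. The centered log-ratio of a count vector is log(x_i) minus the mean of log(x) over all components. The grouping by compartment shown here is only one plausible reading of "compartment-stratified"; the page does not spell out the procedure. The function names, the pseudocount of 0.5, and the toy cell types and compartments are all assumptions made for illustration, not taken from the paper or its repository.

```python
import math


def clr(counts, pseudocount=0.5):
    """Centered log-ratio: log(x_i) minus the mean of log(x).

    The pseudocount (an assumption) keeps zero counts finite.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def stratified_clr(counts_by_type, compartment_of, pseudocount=0.5):
    """Hypothetical helper: apply CLR separately within each compartment.

    counts_by_type: cell type -> cell count for ONE donor
    compartment_of: cell type -> compartment label
    """
    groups = {}
    for cell_type, n in counts_by_type.items():
        groups.setdefault(compartment_of[cell_type], []).append((cell_type, n))

    features = {}
    for members in groups.values():
        values = clr([n for _, n in members], pseudocount)
        for (cell_type, _), v in zip(members, values):
            features[cell_type] = v
    return features


# Toy donor with made-up cell types and compartments.
donor_counts = {"T cell": 120, "B cell": 40, "Enterocyte": 300, "Goblet": 0}
compartments = {
    "T cell": "Immune",
    "B cell": "Immune",
    "Enterocyte": "Epithelial",
    "Goblet": "Epithelial",
}
donor_features = stratified_clr(donor_counts, compartments)
for name, value in donor_features.items():
    print(f"{name}: {value:+.3f}")
```

Because CLR values are centered within each group, the features of one compartment sum to zero, so each value reads as a cell type's abundance relative to the geometric mean of its own compartment.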

Abstract

Donor-level disease classification from single-cell RNA sequencing (scRNA-seq) requires strict donor-aware cross-validation: naive pipelines that split cells randomly conflate training and test donors, inflating reported performance through pseudoreplication. We present a donor-aware benchmark evaluating three feature representations across two independent IBD cohorts: centered log-ratio (CLR) transformed cell-type composition, GatedStructuralCFN dependency embeddings, and scVI variational autoencoder latent embeddings. The cohorts are the SCP259 ulcerative colitis atlas (UC vs. Healthy, n=30 donors, 51 cell types) and the Kong 2023 Crohn's disease atlas (CD vs. Healthy, n=71 donors, 55-68 cell types across three intestinal regions). Compartment-stratified CLR composition achieves AUROC 0.956 +/- 0.061 on SCP259; GatedStructuralCFN on the same features achieves 0.978 +/- 0.050. In the…
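
The pseudoreplication problem described in the abstract comes down to how cells are assigned to folds. The sketch below is a minimal, standard-library illustration of a donor-aware split, not the paper's implementation: whole donors are assigned to folds, so no donor contributes cells to both the training and the test side. The function name `donor_aware_folds` and the toy donor labels are hypothetical. In practice, scikit-learn's `GroupKFold` or `StratifiedGroupKFold` with donor IDs passed as `groups` serve the same purpose.

```python
import random


def donor_aware_folds(cell_donors, n_splits=5, seed=0):
    """Yield (train_idx, test_idx) lists of cell indices.

    Folds are built over donors, not cells, so the donors in the
    training and test sets are always disjoint.
    """
    donors = sorted(set(cell_donors))
    if n_splits < 2 or n_splits > len(donors):
        raise ValueError("n_splits must be between 2 and the number of donors")

    rng = random.Random(seed)
    rng.shuffle(donors)
    # Round-robin assignment of shuffled donors to folds.
    fold_of = {donor: i % n_splits for i, donor in enumerate(donors)}

    for k in range(n_splits):
        train = [i for i, d in enumerate(cell_donors) if fold_of[d] != k]
        test = [i for i, d in enumerate(cell_donors) if fold_of[d] == k]
        yield train, test


# Toy data: one donor label per cell (labels are made up).
cell_donors = ["d1"] * 3 + ["d2"] * 2 + ["d3"] * 4 + ["d4"] * 1

for train_idx, test_idx in donor_aware_folds(cell_donors, n_splits=2):
    train_donors = {cell_donors[i] for i in train_idx}
    test_donors = {cell_donors[i] for i in test_idx}
    # A random cell-level split would usually violate this check.
    assert train_donors.isdisjoint(test_donors)
    print("held-out donors:", sorted(test_donors))
```

A random split over cells would almost always place cells from the same donor on both sides, which is the leakage the abstract warns about. When features are aggregated to one row per donor, as in donor-level classification, the same idea reduces to ordinary k-fold over donors.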


Code & Models

Repositories

Jonathan-321/sfn-scrna-study (GitHub)
