Disentangling Shared and Target-Enriched Topics via Background-Contrastive Non-negative Matrix Factorization
Yixuan Li, Archer Y. Yang, Yue Li

TL;DR
This paper introduces a scalable, interpretable matrix factorization method that isolates condition-specific biological signals from shared background noise in high-dimensional data, improving analysis of complex biological datasets.
Contribution
The paper presents background contrastive Non-negative Matrix Factorization, a novel, scalable approach that explicitly separates target-specific signals from shared background in high-dimensional biological data.
Findings
Reveals hidden disease-associated programs in brain single-cell RNA-seq data.
Identifies genotype-linked protein expression patterns in mice.
Detects treatment-specific transcriptional changes in leukemia and drug responses in cancer cells.
Abstract
Biological signals of interest in high-dimensional data are often masked by dominant variation shared across conditions. This variation, arising from baseline biological structure or technical effects, can prevent standard dimensionality reduction methods from resolving condition-specific structure. The challenge is that these confounding topics are often unknown and mixed with biological signals. Existing background correction methods are either unscalable to high dimensions or not interpretable. We introduce background contrastive Non-negative Matrix Factorization, which extracts target-enriched latent topics by jointly factorizing a target dataset and a matched background using shared non-negative bases under a contrastive objective that suppresses background-expressed structure. This approach yields non-negative components that are directly interpretable at the feature…
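To make the idea of "shared non-negative bases" concrete, the following is a minimal sketch and not the paper's method. It assumes plain joint NMF (standard Lee-Seung multiplicative updates for squared Frobenius loss) on the stacked target and background matrices, followed by a post hoc ratio that ranks topics by how much more they are used in the target than in the background. It omits the contrastive term of the actual objective, which the excerpt above does not specify. The names `joint_nmf` and `target_enrichment`, and all parameters, are hypothetical.

```python
import numpy as np


def joint_nmf(X, Y, k, n_iter=300, seed=0, eps=1e-10):
    """Factorize target X and background Y with one shared non-negative basis H.

    X: (n_target, n_features), Y: (n_background, n_features), both non-negative.
    Returns W_x, W_y (per-sample topic loadings) and H (k x n_features basis).
    ASSUMPTION: plain joint NMF, without the paper's contrastive term.
    """
    rng = np.random.default_rng(seed)
    V = np.vstack([X, Y])  # stack samples so both datasets share the basis H
    W = rng.random((V.shape[0], k)) + eps
    H = rng.random((k, V.shape[1])) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative at every step.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
    n_target = X.shape[0]
    return W[:n_target], W[n_target:], H


def target_enrichment(W_x, W_y, eps=1e-10):
    """Hypothetical score: mean topic usage in target over mean usage in background."""
    return W_x.mean(axis=0) / (W_y.mean(axis=0) + eps)


# Synthetic example: one topic shared by both datasets, one present only in the target.
rng = np.random.default_rng(1)
shared = np.concatenate([rng.uniform(0.5, 1.5, 10), np.zeros(10)])
specific = np.concatenate([np.zeros(10), rng.uniform(0.5, 1.5, 10)])
X = np.outer(rng.random(50), shared) + np.outer(rng.random(50), specific)
Y = np.outer(rng.random(40), shared)

W_x, W_y, H = joint_nmf(X, Y, k=2)
scores = target_enrichment(W_x, W_y)
print("enrichment score per topic:", scores)
```

Topics with a high score are candidates for target-enriched structure. A contrastive objective, as described in the abstract, would instead discourage background-expressed structure during the factorization itself rather than filtering topics afterwards.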
Taxonomy
Topics: Single-cell and spatial transcriptomics · Gene expression and cancer classification · Domain Adaptation and Few-Shot Learning
