# Neighborhood nonnegative matrix factorization identifies patterns and spatially-variable genes in large-scale spatial transcriptomics data

**Authors:** Ragnhild Laursen, Han Chen, Jack Demaray, Karin Pelka, Barbara E. Engelhardt

PMC · DOI: 10.1186/s13059-025-03846-6 · 2026-01-16

## TL;DR

This paper introduces neighborhood NMF (NNMF), a method for finding overlapping spatial gene-expression patterns in large-scale spatial transcriptomics data from tissues.

## Contribution

The contribution is NNMF, a scalable method for identifying overlapping, spatially organized multicellular gene programs in spatial transcriptomics data.

## Key findings

- NNMF identifies functionally coherent neighborhoods in heterogeneous cell data.
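- It scales to millions of cells and thousands of genes, and its overlapping signatures allow more biologically complex interpretations than hard clustering methods; on expert-labeled benchmarks it also performs well on hard clustering tasks.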
- NNMF identifies immunologically relevant gene signatures across millions of cells in MERFISH human colorectal cancer data.
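
The contrast between overlapping signatures and hard clustering in the findings above can be illustrated with a minimal sketch. It assumes a nonnegative loadings matrix with one row per cell and one column per signature; the function names and the toy values are hypothetical and are not taken from the paper or its software.

```python
import numpy as np

def soft_memberships(loadings):
    """Row-normalize nonnegative loadings so each cell's weights sum to 1."""
    loadings = np.asarray(loadings, dtype=float)
    totals = loadings.sum(axis=1, keepdims=True)
    # Rows that are entirely zero stay zero instead of dividing by zero.
    return np.divide(loadings, totals, out=np.zeros_like(loadings), where=totals > 0)

def hard_labels(loadings):
    """Collapse each cell to its single strongest signature (ties go to the first)."""
    return np.argmax(np.asarray(loadings, dtype=float), axis=1)

# Hypothetical loadings for 3 cells over 2 signatures (illustrative values only).
example = np.array([[4.0, 0.0],
                    [1.0, 1.0],
                    [1.0, 3.0]])
soft = soft_memberships(example)
hard = hard_labels(example)
```

The second cell is split evenly between the two signatures in `soft`, while `hard` forces it into one of them. That lost information is what a soft factorization retains.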

## Abstract

Methods for identifying complex multicellular spatial neighborhoods do not scale to existing spatial transcriptomics data, and often divide tissues into distinct neighborhoods with hard borders. We develop neighborhood NMF (NNMF) that identifies functionally coherent neighborhoods among heterogeneous cells. NNMF scales to thousands of genes and millions of cells, and produces signatures representing overlapping spatially-organized multicellular gene programs, allowing more biologically-complex interpretations than hard clustering methods. In benchmark spatial transcriptomics data with expert labels, versus related methods, NNMF shows excellent performance even on hard clustering tasks. On MERFISH human colorectal cancer data, NNMF identifies immunologically relevant signatures in millions of cells.

The online version contains supplementary material available at 10.1186/s13059-025-03846-6.
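
The abstract does not spell out the NNMF algorithm, so the following is only a sketch of the general idea under stated assumptions: each cell's counts are summed with those of its k nearest spatial neighbors, and the resulting matrix is factorized with standard multiplicative-update NMF. The aggregation rule, the function names `neighborhood_counts` and `nmf`, and all parameters are illustrative assumptions, not the authors' method or implementation.

```python
import numpy as np

def neighborhood_counts(coords, counts, k):
    """Sum each cell's counts with those of its k nearest spatial neighbors."""
    coords = np.asarray(coords, dtype=float)
    counts = np.asarray(counts, dtype=float)
    # Brute-force pairwise squared distances: fine for a toy example,
    # but O(n^2) memory, so not suitable for millions of cells.
    diff = coords[:, None, :] - coords[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)
    # With distinct coordinates the closest index is the cell itself,
    # so k + 1 indices give the cell plus its k neighbors.
    nearest = np.argsort(dist2, axis=1)[:, : k + 1]
    return counts[nearest].sum(axis=1)

def nmf(X, rank, n_iter=200, seed=0, eps=1e-9):
    """Multiplicative-update NMF with Frobenius loss: X is approximated by W @ H."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy data (randomly generated, not from the paper): 60 cells, 8 genes.
rng = np.random.default_rng(1)
coords = rng.random((60, 2))
counts = rng.poisson(2.0, size=(60, 8))
X = neighborhood_counts(coords, counts, k=5)
W, H = nmf(X, rank=3)  # W: cell-by-signature loadings, H: signature-by-gene weights
```

In this sketch the rows of `H` play the role of gene signatures and the rows of `W` give each cell's nonnegative weight on every signature, so a cell can belong to several neighborhoods at once.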

## Linked entities

- **Diseases:** colorectal cancer (MONDO:0005575)

## Full-text entities

- **Genes:** Fos (Fos proto-oncogene, AP-1 transcription factor subunit) [NCBI Gene 14281] {aka D12Rfj1, c-fos, cFos}, IDO1 (indoleamine 2,3-dioxygenase 1) [NCBI Gene 3620] {aka IDO, IDO-1, INDO}, IFI27 (interferon alpha inducible protein 27) [NCBI Gene 3429] {aka FAM14D, ISG12, ISG12A, P27}, CD274 (CD274 molecule) [NCBI Gene 29126] {aka ADMIO5, B7-H, B7H1, PD-L1, PDCD1L1, PDCD1LG1}, TAP1 (transporter 1, ATP binding cassette subfamily B member) [NCBI Gene 6890] {aka ABC17, ABCB2, APT1, D6S114E, MHC1D1, PSF-1}, LAMP3 (lysosome associated membrane protein 3) [NCBI Gene 27074] {aka CD208, DC LAMP, DC-LAMP, DCLAMP, LAMP, LAMP-3}, CXCL8 (C-X-C motif chemokine ligand 8) [NCBI Gene 3576] {aka GCP-1, GCP1, IL8, LECT, LUCT, LYNAP}, MMP1 (matrix metallopeptidase 1) [NCBI Gene 4312] {aka CLG}, NFKBIA (NFKB inhibitor alpha) [NCBI Gene 4792] {aka EDAID2, IKBA, MAD-3, NFKBI}, STAT1 (signal transducer and activator of transcription 1) [NCBI Gene 6772] {aka CANDF7, IMD31A, IMD31B, IMD31C, ISGF-3, STAT91}, RGS5 (regulator of G protein signaling 5) [NCBI Gene 8490] {aka MST092, MST106, MST129, MSTP032, MSTP092, MSTP106}, CEBPB (CCAAT enhancer binding protein beta) [NCBI Gene 1051] {aka C/EBP-beta, IL6DBP, NF-IL6, TCF5}, MOBP (myelin associated oligodendrocyte basic protein) [NCBI Gene 4336], ESR1 (estrogen receptor 1) [NCBI Gene 2099] {aka ER, ESR, ESRA, ESTRR, Era, NR3A1}, HLA-B (major histocompatibility complex, class I, B) [NCBI Gene 3106] {aka AS, B-4901, HLAB}, IFNG (interferon gamma) [NCBI Gene 3458] {aka IFG, IFI, IMD69}, IFITM1 (interferon induced transmembrane protein 1) [NCBI Gene 8519] {aka 9-27, CD225, DSPA2a, IFI17, LEU13}, PRLR (prolactin receptor) [NCBI Gene 5618] {aka HPRL, MFAB, RI-PRLR, hPRLrI}, CXCL9 (C-X-C motif chemokine ligand 9) [NCBI Gene 4283] {aka CMK, Humig, MIG, SCYB9, crg-10}, CD24 (CD24 molecule) [NCBI Gene 100133941] {aka CD24A}, HPCAL1 (hippocalcin like 1) [NCBI Gene 3241] {aka BDR1, HLP2, VILIP-3}, F3 (coagulation factor III, tissue factor) [NCBI Gene 2152] {aka CD142, TF, TFA}, SELPLG (selectin P ligand) [NCBI Gene 6404] {aka CD162, CLA, PSGL-1, PSGL1}, SLC18A2 (solute carrier family 18 member A2) [NCBI Gene 6571] {aka PKDYS2, SVAT, SVMT, VAT2, VMAT2}, COX6C (cytochrome c oxidase subunit 6C) [NCBI Gene 1345], FN1 (fibronectin 1) [NCBI Gene 2335] {aka CIG, ED-B, FINC, FN, FNZ, GFND}, IL6 (interleukin 6) [NCBI Gene 3569] {aka BSF-2, BSF2, CDF, HGF, HSF, IFN-beta-2}, PTGS2 (prostaglandin-endoperoxide synthase 2) [NCBI Gene 5743] {aka COX-2, COX2, GRIPGHS, PGG/HS, PGHS-2, PHS-2}, CXCL10 (C-X-C motif chemokine ligand 10) [NCBI Gene 3627] {aka C7, IFI10, INP10, IP-10, SCYB10, crg-2}, ENC1 (ectodermal-neural cortex 1) [NCBI Gene 8507] {aka ENC-1, KLHL35, KLHL37, NRPB, PIG10, TP53I10}, KRT19 (keratin 19) [NCBI Gene 3880] {aka CK19, K19, K1CS}, PGR (progesterone receptor) [NCBI Gene 5241] {aka NR3C3, PR}, IL1B (interleukin 1 beta) [NCBI Gene 3553] {aka IL-1, IL1-BETA, IL1F2, IL1beta}, TAPBP (TAP binding protein) [NCBI Gene 6892] {aka MHC1D3, NGS17, TAPA, TPN, TPSN}, CCL22 (C-C motif chemokine ligand 22) [NCBI Gene 6367] {aka A-152E5.1, ABCD-1, DC/B-CK, MDC, SCYA22, STCP-1}, CXCL16 (C-X-C motif chemokine ligand 16) [NCBI Gene 58191] {aka CXCLG16, SR-PSOX, SRPSOX}, HLA-C (major histocompatibility complex, class I, C) [NCBI Gene 3107] {aka D6S204, HLA-JY3, HLAC, HLC-C, MHC, PSORS1}, CXCL11 (C-X-C motif chemokine ligand 11) [NCBI Gene 6373] {aka H174, I-TAC, IP-9, IP9, SCYB11, SCYB9B}
- **Diseases:** Oncology (MESH:D000072716), CRC (MESH:D015179), inflammation (MESH:D007249), epithelial tumor (MESH:D002277), lung cancer (MESH:D008175), Cancer (MESH:D009369)
- **Species:** Homo sapiens (human, species) [taxon 9606], Mus musculus (house mouse, species) [taxon 10090]

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12892798/full.md

---
Source: https://tomesphere.com/paper/PMC12892798