# Robust and efficient annotation of cell states through gene signature scoring

**Authors:** Laure Ciernik, Agnieszka Kraft, Florian Barkmann, Josephine Yates, Valentina Boeva

PMC · DOI: 10.1101/gr.280926.125 · Genome Research · 2026-03-01

## TL;DR

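This paper introduces Adjusted Neighborhood Scoring (ANS), a deterministic gene signature scoring method for single-cell RNA sequencing (scRNA-seq) data that produces more stable, comparable scores and more accurate score-based annotation of cell types and states.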

## Contribution

The paper presents Adjusted Neighborhood Scoring (ANS), a deterministic scoring algorithm whose enhanced control gene selection improves score stability and cross-signature comparability for cell-state annotation.

## Key findings

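- Across nine healthy and cancer scRNA-seq data sets, the established gene signature scoring methods Seurat, SCANPY, UCell, and JASMINE do not produce score distributions that are robust and comparable enough across signatures and experimental conditions for cell-state annotation.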
- Adjusted Neighborhood Scoring (ANS) achieves cell-state annotation accuracy comparable to supervised methods while being deterministic and significantly improving score stability.
- ANS was used to develop and validate a gene signature that differentiates cancer-associated fibroblasts from malignant cells undergoing epithelial-to-mesenchymal transition.

## Abstract

Gene signature scoring is integral to single-cell RNA sequencing (scRNA-seq) data analysis, particularly for unsupervised cellular state annotation based on maximum signature score values. However, this application requires robust and comparable score distributions across diverse signatures and experimental conditions. Our systematic evaluation of established scoring methodologies—Seurat, SCANPY, UCell, and JASMINE—across nine healthy and cancer scRNA-seq data sets demonstrates their insufficiency in fulfilling this requirement. To address this limitation, we present Adjusted Neighborhood Scoring (ANS), a deterministic algorithm with enhanced control gene selection that significantly improves score stability and cross-signature comparability, achieving cell-state annotation accuracy comparable to supervised methods. We demonstrate the practical utility of ANS by developing and validating a gene signature to differentiate cancer-associated fibroblasts from malignant cells undergoing epithelial-to-mesenchymal transition. Overall, ANS provides a robust and reliable gene signature scoring framework, significantly improving the accuracy of score-based annotation of cell types and states in single-cell studies.
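The annotation scheme described in the abstract (score each cell for every signature, then assign the label with the maximum score) can be illustrated with a minimal sketch. This is not the authors' ANS implementation. The function names `score_signature` and `annotate`, the parameter `n_ctrl`, and the toy matrix are hypothetical. The control gene rule used here (taking the non-signature genes closest in average expression to each signature gene, with ties broken by gene index) is an assumption chosen only to show what a deterministic control selection might look like.

```python
import numpy as np

def score_signature(X, sig_idx, n_ctrl=2):
    """Per-cell score: mean signature expression minus mean control expression.

    X        : (cells x genes) array of normalized expression values.
    sig_idx  : column indices of the signature genes.
    n_ctrl   : control genes picked per signature gene (hypothetical parameter).

    Assumption of this sketch: controls are the non-signature genes whose
    average expression is closest to each signature gene. A stable sort makes
    the selection deterministic (no random sampling).
    """
    X = np.asarray(X, dtype=float)
    gene_means = X.mean(axis=0)
    sig = set(sig_idx)
    candidates = np.array([g for g in range(X.shape[1]) if g not in sig])

    ctrl = []
    for g in sig_idx:
        dist = np.abs(gene_means[candidates] - gene_means[g])
        order = np.argsort(dist, kind="stable")
        ctrl.extend(candidates[order[:n_ctrl]])
    ctrl = np.array(ctrl)

    return X[:, list(sig_idx)].mean(axis=1) - X[:, ctrl].mean(axis=1)

def annotate(X, signatures, n_ctrl=2):
    """Score every signature, then label each cell by its maximum score."""
    names = list(signatures)
    scores = np.column_stack(
        [score_signature(X, signatures[name], n_ctrl) for name in names]
    )
    labels = [names[i] for i in scores.argmax(axis=1)]
    return scores, labels

# Toy data: 4 cells x 6 genes; genes 0-1 mark state "A", genes 2-3 mark state "B".
X = np.array([
    [5, 5, 0, 0, 1, 1],
    [4, 6, 1, 0, 1, 1],
    [0, 0, 5, 5, 1, 1],
    [0, 1, 6, 4, 1, 1],
])
signatures = {"A": [0, 1], "B": [2, 3]}
scores, labels = annotate(X, signatures)
print(labels)
```

Maximum-score annotation only works when scores from different signatures lie on a comparable scale, which is the requirement the abstract highlights. In this sketch the control set is the only mechanism that offsets differences in baseline expression between signatures.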

## Linked entities

- **Diseases:** cancer (MONDO:0004992)

## Full-text entities

- **Genes:** NKX3-1 (NK3 homeobox 1) [NCBI Gene 4824] {aka BAPX2, NKX3, NKX3.1, NKX3A}, BRCA1 (BRCA1 DNA repair associated) [NCBI Gene 672] {aka BRCAI, BRCC1, BROVCA1, FANCS, IRIS, PNCA4}, ITGA3 (integrin subunit alpha 3) [NCBI Gene 3675] {aka CD49C, FRP-2, GAP-B3, GAPB3, ILNEB, JEB7}, CD4 (CD4 molecule) [NCBI Gene 920] {aka CD4mut, IMD79, Leu-3, OKT4D, T4}, NEURL1 (neuralized E3 ubiquitin protein ligase 1) [NCBI Gene 9148] {aka NEUR1, NEURL, RNF67, bA416N2.1, neu, neu-1}, BICDL3P (BICD family like 3, pseudogene) [NCBI Gene 171022] {aka ABHD11-AS1, LINC00035, NCRNA00035, WBSCR26}, CD14 (CD14 molecule) [NCBI Gene 929], CD8A (CD8 subunit alpha) [NCBI Gene 925] {aka CD8, CD8alpha, IMD116, Leu2, p32}, BCYRN1 (brain cytoplasmic RNA 1) [NCBI Gene 618] {aka BC200, BC200a, LINC00004, NCRNA00004}, PITX1 (paired like homeodomain 1) [NCBI Gene 5307] {aka BFT, CCF, POTX, PTX1}, MUC16 (mucin 16, cell surface associated) [NCBI Gene 94025] {aka CA125}, NEURL1B (neuralized E3 ubiquitin protein ligase 1B) [NCBI Gene 54492] {aka RNF67B, hNeur2, neur2}, SAA1 (serum amyloid A1) [NCBI Gene 6288] {aka PIG4, SAA, TP53I4}, L1CAM (L1 cell adhesion molecule) [NCBI Gene 3897] {aka CAML1, CD171, HSAS, HSAS1, HYCX, MASA}, SLC6A4 (solute carrier family 6 member 4) [NCBI Gene 6532] {aka 5-HTT, 5-HTTLPR, 5HTT, HTT, OCD1, SERT}, PEMT (phosphatidylethanolamine N-methyltransferase) [NCBI Gene 10400] {aka PEAMT, PEMPT, PEMT2, PLMT}, ITGB4 (integrin subunit beta 4) [NCBI Gene 3691] {aka CD104, GP150, JEB5A, JEB5B}, SACK1A (scaffolding CK1 anchoring protein A) [NCBI Gene 84985] {aka BJ-TSA-9, FAM83A}
- **Diseases:** DLBC (MESH:D016403), kidney renal clear cell carcinoma (MESH:D002292), BRCA tumors (MESH:D001943), HGSOC (MESH:D010051), CRC (MESH:D015179), breast (MESH:D061325), Hematological malignancies (MESH:D019337), SARC (MESH:D012509), acute myeloid leukemia (MESH:D015470), HNSC (MESH:D000077195), cSCC (MESH:D002294), LUAD (MESH:D000077192), epithelial ovarian, prostate cancer (MESH:D000077216), MES (MESH:C536133), CAFs (MESH:D009369), ESCC (MESH:D000077277), melanoma (MESH:D008545), osteosarcoma (MESH:D012516), PAAD (MESH:D010190)
- **Chemicals:** PHATE (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]
- **Cell lines:** LUAD — Homo sapiens (Human), Lung adenocarcinoma, Cancer cell line (CVCL_WN45)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12951948/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12951948/full.md

## References

71 references — full list in the complete paper: https://tomesphere.com/paper/PMC12951948/full.md

---
Source: https://tomesphere.com/paper/PMC12951948