# scCNMF: an integrated analysis model for paired single-cell RNA sequencing and assay for transposase-accessible chromatin sequencing data leveraging cell similarity and cis-regulatory potential

**Authors:** Yufei Zhang, Qiongyu Sheng, Huiran Zhan, Yiyuan Guo, Xiaoran Shi, Jing Qi

PMC · DOI: 10.7717/peerj.20836 · 2026-03-02

## TL;DR

This paper introduces scCNMF, a non-negative matrix factorization method that integrates paired scRNA-seq and scATAC-seq data to better characterize cell states and gene regulation.

## Contribution

scCNMF is a novel non-negative matrix factorization model that integrates paired scRNA-seq and scATAC-seq data by jointly incorporating cell similarity structures and prior regulatory information.

## Key findings

- scCNMF improves cell embeddings and clustering accuracy compared to existing methods.
- The model enables biomarker identification and enhances the quality of scATAC-seq data.
- scCNMF achieves competitive performance against state-of-the-art methods on multiple real-world datasets.

## Abstract

The integrated analysis of paired single-cell RNA sequencing (scRNA-seq) and single-cell Assay for Transposase-Accessible Chromatin using sequencing (scATAC-seq) data is crucial for accurately characterizing cellular states and reconstructing gene regulatory networks. However, most integration methods fail to simultaneously consider the high sparsity of scATAC-seq data and regulatory interactions at the cellular level, limiting the biological interpretability and accuracy of their integration results. In this study, we present scCNMF, a novel model for the integrated analysis of paired scRNA-seq and scATAC-seq data. Based on the non-negative matrix factorization model, scCNMF jointly incorporates cell similarity structures and prior regulatory information, leading to improved cell embeddings and enhanced clustering accuracy. We evaluate scCNMF on multiple real-world datasets and demonstrate that it achieves competitive performance compared to state-of-the-art methods. Moreover, further analyses show that scCNMF enables biomarker identification, highlighting its interpretability. Additionally, scCNMF facilitates signal enhancement of scATAC-seq data, resulting in improved data quality for subsequent analyses.
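The abstract does not state the scCNMF objective, so the following is only an illustrative sketch of the general idea it describes: a joint non-negative matrix factorization in which two modalities share one cell embedding, regularized by a cell similarity graph. It is not the authors' formulation, and it leaves out the prior regulatory (cis-regulatory potential) term entirely. The function name `joint_graph_nmf`, the weights `alpha` and `lam`, the cosine-similarity graph, and the toy data are all assumptions made for illustration. The updates are the standard multiplicative rules for graph-regularized NMF, extended to two views.

```python
import numpy as np


def objective(X1, X2, W1, W2, H, A, alpha, lam):
    """Reconstruction error of both views plus graph smoothness of H."""
    L = np.diag(A.sum(axis=1)) - A  # graph Laplacian of the cell similarity graph
    return (
        np.linalg.norm(X1 - W1 @ H) ** 2
        + alpha * np.linalg.norm(X2 - W2 @ H) ** 2
        + lam * np.trace(H @ L @ H.T)
    )


def joint_graph_nmf(X_rna, X_atac, A, k=2, alpha=1.0, lam=0.1, n_iter=200, seed=0):
    """Hypothetical sketch: X_rna ~ W1 @ H and X_atac ~ W2 @ H with a shared H.

    X_rna:  genes x cells, non-negative
    X_atac: peaks x cells, non-negative
    A:      cells x cells, symmetric non-negative similarity matrix
    """
    rng = np.random.default_rng(seed)
    eps = 1e-10  # guards against division by zero
    n_cells = X_rna.shape[1]
    W1 = rng.random((X_rna.shape[0], k))
    W2 = rng.random((X_atac.shape[0], k))
    H = rng.random((k, n_cells))
    D = np.diag(A.sum(axis=1))  # degree matrix
    history = []
    for _ in range(n_iter):
        # Per-modality feature loadings
        W1 *= (X_rna @ H.T) / (W1 @ (H @ H.T) + eps)
        W2 *= (X_atac @ H.T) / (W2 @ (H @ H.T) + eps)
        # Shared cell embedding, pulled toward agreement between similar cells
        num = W1.T @ X_rna + alpha * (W2.T @ X_atac) + lam * (H @ A)
        den = (W1.T @ W1) @ H + alpha * (W2.T @ W2) @ H + lam * (H @ D) + eps
        H *= num / den
        history.append(objective(X_rna, X_atac, W1, W2, H, A, alpha, lam))
    return W1, W2, H, history


# Toy paired data (assumed, not from the paper): 20 cells, 30 genes, 50 peaks
rng = np.random.default_rng(1)
X_rna = rng.poisson(2.0, size=(30, 20)).astype(float)
X_atac = (rng.random((50, 20)) < 0.1).astype(float)  # sparse binary accessibility

# Cell similarity graph from cosine similarity of expression profiles (an assumption)
Xn = X_rna / (np.linalg.norm(X_rna, axis=0, keepdims=True) + 1e-10)
A = Xn.T @ Xn
np.fill_diagonal(A, 0.0)

W1, W2, H, history = joint_graph_nmf(X_rna, X_atac, A, k=2)
print(H.shape, history[0], history[-1])
```

In this sketch the columns of `H` serve as low-dimensional cell embeddings that could be passed to a clustering step, while `W2 @ H` gives a dense, smoothed reconstruction of the sparse accessibility matrix, loosely analogous to the signal enhancement mentioned in the abstract.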

## Full-text entities

- **Genes:** CRP (C-reactive protein) [NCBI Gene 1401] {aka PTX1}, Nr1i3 (nuclear receptor subfamily 1, group I, member 3) [NCBI Gene 12355] {aka CAR, CAR-beta, Care2, ESTM32, MB67}
- **Diseases:** UMAP (MESH:C567162), ARI (MESH:D000275)
- **Chemicals:** scATAC (-)
- **Species:** Mus musculus (house mouse, species) [taxon 10090], Homo sapiens (human, species) [taxon 9606]
- **Cell lines:** S2 — Drosophila melanogaster (Fruit fly), Spontaneously immortalized cell line (CVCL_Z232)

## Figures

38 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12962131/full.md

---
Source: https://tomesphere.com/paper/PMC12962131