# Representing and extracting knowledge from single-cell data

**Authors:** Ionut Sebastian Mihai, Sarang Chafle, Johan Henriksson

PMC · DOI: 10.1007/s12551-023-01091-4 · Biophysical Reviews · 2023-08-05

## TL;DR

This review explains the theoretical concepts behind state-of-the-art single-cell analysis methods, particularly advanced statistics and machine learning, covering the field from cell, through instruments, to current and upcoming models.

## Contribution

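The review aims to spread concepts that are not yet in common use in single-cell analysis, especially from topology and generative processes, and discusses how new statistical models can be developed to capture more of biology.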

## Key findings

- New statistical models can be developed to capture more of the biology in single-cell data.
- Concepts from topology and generative processes are not yet in common use but are relevant to single-cell analysis.
- Natural language processing may help overcome cognitive limits in interpreting single-cell data.

## Abstract

Single-cell analysis is currently one of the most high-resolution techniques to study biology. The large complex datasets that have been generated have spurred numerous developments in computational biology, in particular the use of advanced statistics and machine learning. This review attempts to explain the deeper theoretical concepts that underpin current state-of-the-art analysis methods. Single-cell analysis is covered from cell, through instruments, to current and upcoming models. The aim of this review is to spread concepts which are not yet in common use, especially from topology and generative processes, and how new statistical models can be developed to capture more of biology. This opens epistemological questions regarding our ontology and models, and some pointers will be given to how natural language processing (NLP) may help overcome our cognitive limitations for understanding single-cell data.

## Full-text entities

- **Genes:** GATA3 (GATA binding protein 3) [NCBI Gene 2625] {aka HDR, HDRS}, IL4 (interleukin 4) [NCBI Gene 3565] {aka BCGF-1, BCGF1, BSF-1, BSF1, IL-4}, F3 (coagulation factor III, tissue factor) [NCBI Gene 2152] {aka CD142, TF, TFA}, TRBV20OR9-2 (T cell receptor beta variable 20/OR9-2 (non-functional)) [NCBI Gene 6962] {aka CDR3, TCRBV20S2, TCRBV2O, TCRBV2S2O}
- **Diseases:** DR (MESH:D015431), UMIs (MESH:C566733), VAEs (OMIM:610141), ML (MESH:D007859)
- **Chemicals:** poly adenine (MESH:C000628261), calcium (MESH:D002118), TSO (-), lipids (MESH:D008055), polyA (MESH:D011061)
- **Species:** Mus musculus (house mouse, species) [taxon 10090], Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC10937862/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC10937862/full.md

## References

124 references — full list in the complete paper: https://tomesphere.com/paper/PMC10937862/full.md

---
Source: https://tomesphere.com/paper/PMC10937862