# Pareto optimality reveals an atlas of cellular archetypes

**Authors:** George Crowley, Uri Alon, Stephen R. Quake

PMC · DOI: 10.1073/pnas.2530194123 · Proceedings of the National Academy of Sciences of the United States of America · 2026-03-09

## TL;DR

This paper shows that gene expression variation within cell types is shaped by Pareto-optimal trade-offs, so that cellular archetypes and their functions can be identified without prior biological knowledge.

## Contribution

The paper introduces a framework that uses Pareto optimality to define cellular archetypes and their functions within cell types in an unbiased way.

## Key findings

- Phenotypic variability in cell types is shaped by Pareto optimality, with gene expression lying on low-dimensional polytopes.
- Most cell types in the Tabula Sapiens Atlas adhere to this Pareto-based model, which allows their functions to be characterized without prior biological assumptions.
- The result makes it possible to write explicit representations of the low-dimensional manifolds on which single-cell transcriptomes reside, which can inform the design of virtual cell language models.

## Abstract

Creating a first-principles molecular definition of cell type has been a challenging problem. We found that phenotypic variability within cell types is shaped by Pareto optimality, and therefore gene expression lies on low-dimensional polytopes (lines, triangles, tetrahedra, etc.). This approach provides a natural and unbiased definition of cellular archetypes and their functions without the need for prior biological knowledge.
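To make the polytope picture concrete, a low-dimensional polytope can be written as the set of convex combinations of its vertices. The notation below is an assumption introduced here for illustration and is not taken from the paper: $x$ stands for a cell's expression profile, $a_1, \dots, a_k$ for the archetypes (vertices), and $\theta_i$ for the weight of archetype $i$.

```latex
x = \sum_{i=1}^{k} \theta_i \, a_i,
\qquad \theta_i \ge 0,
\qquad \sum_{i=1}^{k} \theta_i = 1
```

Under this reading, $k = 2$ gives a line segment, $k = 3$ a triangle, and $k = 4$ a tetrahedron, matching the shapes listed in the abstract. A cell with one weight close to 1 sits near a vertex, while cells with mixed weights lie in the interior.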

We sought to identify universal organizing principles behind phenotypic variation within cell types. Pareto optimality describes how trade-offs between optimal solutions account for variation, predicting that the boundary points of a data distribution reflect specialized functions. We hypothesized that transcriptomic variation was explained by Pareto optimality across all cell types. We then used the Tabula Sapiens Atlas of single-cell RNA sequencing across cell types and tissues in the human body to test this hypothesis and found that most cell types adhere to this theory. This enabled us to use this principled method to characterize the functions performed by each cell type. These phenotypes are derived from an unbiased approach and do not incorporate ideas from existing biological models or theories, and yet in many cases they recapitulate our understanding of the functions of major cell types. Ultimately, we conclude that multiobjective optimization broadly shapes the observed phenotypic variation within cell types. This finding enables us to write explicit representations of the low-dimensional manifolds on which transcriptomes of single cells reside. This can inform the design of the next generation of virtual cell language models, which aim to statistically learn low-dimensional transcriptomic manifolds.
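The two ideas in this paragraph, trade-offs between competing optima and points confined to a polytope spanned by archetypes, can be sketched with a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names `pareto_front_mask`, `archetype_weights`, and `inside_polytope` are hypothetical, the archetypes are assumed to be already known and affinely independent, and the weights are obtained by plain least squares rather than by any archetype-fitting method the paper may use.

```python
import numpy as np


def pareto_front_mask(scores: np.ndarray) -> np.ndarray:
    """Boolean mask of non-dominated rows (higher is better in every column)."""
    scores = np.asarray(scores, dtype=float)
    mask = np.ones(scores.shape[0], dtype=bool)
    for i, row in enumerate(scores):
        # Another row dominates `row` if it is >= in all columns and > in at least one.
        at_least_as_good = np.all(scores >= row, axis=1)
        strictly_better = np.any(scores > row, axis=1)
        if np.any(at_least_as_good & strictly_better):
            mask[i] = False
    return mask


def archetype_weights(point, archetypes) -> np.ndarray:
    """Least-squares weights theta with sum(theta) = 1 and theta @ archetypes ~ point.

    `archetypes` has shape (k, d): one row per vertex of the polytope.
    """
    archetypes = np.asarray(archetypes, dtype=float)
    # Stack the coordinate equations with one extra row enforcing sum(theta) = 1.
    system = np.vstack([archetypes.T, np.ones(archetypes.shape[0])])
    target = np.append(np.asarray(point, dtype=float), 1.0)
    theta, *_ = np.linalg.lstsq(system, target, rcond=None)
    return theta


def inside_polytope(point, archetypes, tol: float = 1e-9) -> bool:
    """True if `point` is a convex combination of the archetypes (within tolerance)."""
    archetypes = np.asarray(archetypes, dtype=float)
    theta = archetype_weights(point, archetypes)
    reconstructed = theta @ archetypes
    return bool(
        np.all(theta >= -tol)
        and np.isclose(theta.sum(), 1.0)
        and np.allclose(reconstructed, point, atol=1e-8)
    )


if __name__ == "__main__":
    # Toy triangle with three hypothetical archetypes in a 2D expression space.
    triangle = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
    print(archetype_weights([0.25, 0.25], triangle))
    print(inside_polytope([1.0, 1.0], triangle))
```

In this toy setting, a cell's weights say how close it sits to each archetype, and a negative weight signals a point outside the polytope. Real single-cell data would first require estimating the archetypes themselves, which this sketch does not attempt.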

## Linked entities

- **Species:** Homo sapiens (taxon 9606)

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12993957/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12993957/full.md

## References

85 references — full list in the complete paper: https://tomesphere.com/paper/PMC12993957/full.md

---
Source: https://tomesphere.com/paper/PMC12993957