# Inference of marker genes of subtle cell state changes via iLR: iterative logistic regression

**Authors:** Yingtong Liu, Aaron G Baugh, Evanthia T Roussos Torres, Adam L MacLean

PMC · DOI: 10.1093/bioinformatics/btag051 · 2026-02-02

## TL;DR

This paper introduces iLR, a method to identify small sets of marker genes for subtle cell state changes, and demonstrates it on scRNA-seq data from autism spectrum disorder and tumor immunotherapy studies.

## Contribution

iLR uses iterative logistic regression with Pareto front optimization to find minimal yet accurate marker gene sets for cell state differences.
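To make the size-versus-accuracy trade-off concrete, here is a minimal sketch of a Pareto front filter over candidate gene sets. It is an illustration, not the iLR implementation: the function name `pareto_front`, the `(n_genes, accuracy)` representation, and the toy numbers are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical representation: (number of genes, classification accuracy).
Candidate = Tuple[int, float]


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep candidates that no other candidate dominates.

    A candidate is dominated when another one uses no more genes and is
    at least as accurate, and is strictly better on at least one of the two.
    """
    front = []
    for size, acc in candidates:
        dominated = any(
            (s <= size and a >= acc) and (s < size or a > acc)
            for s, a in candidates
        )
        if not dominated:
            front.append((size, acc))
    # Deduplicate and order by gene set size.
    return sorted(set(front))


# Toy values for illustration only; not results from the paper.
candidates = [(200, 0.95), (50, 0.94), (50, 0.90), (10, 0.91), (5, 0.80), (20, 0.89)]
front = pareto_front(candidates)
print(front)
```

In this toy example the 20-gene and the weaker 50-gene candidates drop out because a smaller set is at least as accurate. Choosing a single point on the remaining front is a separate decision that this sketch does not make.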

## Key findings

- On in silico datasets, iLR classifies single cells about as well as state-of-the-art methods while using only a fraction of the genes.
- iLR distinguishes neuronal cell subtypes in healthy versus autism spectrum disorder patients with high accuracy, using small sets of disease-relevant genes.
- iLR finds informative genes that are consistent across organs and species, including mouse-to-human comparisons.

## Abstract

Differential expression and marker gene selection methods for single-cell RNA-sequencing (scRNA-seq) data can struggle to identify small sets of informative genes, especially for subtle differences between cell states, such as those induced by disease or treatment.

We present iterative logistic regression (iLR) for the identification of small sets of informative marker genes. iLR applies logistic regression iteratively with a Pareto front optimization to balance gene set size with classification performance. Benchmarking iLR on in silico datasets, we demonstrated performance comparable to the state of the art at single-cell classification using only a fraction of the genes. We then tested iLR on its ability to distinguish neuronal cell subtypes in healthy versus autism spectrum disorder patients and found that it achieves high accuracy with small sets of disease-relevant genes. Applying iLR to investigate immunotherapeutic effects in cell types from different tumor microenvironments, we found that iLR infers informative genes that translate across organs and even across species (mouse to human). We predicted via iLR that entinostat acts in part through the modulation of myeloid cell differentiation routes in the lung microenvironment. Overall, iLR provides a means to infer interpretable transcriptional signatures with prognostic or therapeutic potential from complex datasets.
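The iterative part of the procedure can be sketched as a refit-and-prune loop: fit a logistic regression, record held-out accuracy for the current gene set, drop the least informative genes, and repeat. The sketch below is an assumption-laden illustration rather than the authors' code. The abstract does not say how genes are removed, so the pruning rule (keep the half with the largest absolute coefficients), the gradient-descent fit, every function name and parameter, and the synthetic data are hypothetical.

```python
import numpy as np


def fit_logistic(X, y, lr=0.1, n_iter=300, l2=0.01):
    """Plain gradient-descent logistic regression; returns (weights, bias)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))    # predicted probabilities
        w -= lr * (X.T @ (p - y) / len(y) + l2 * w)
        b -= lr * np.mean(p - y)
    return w, b


def accuracy(X, y, w, b):
    """Fraction of cells whose predicted label matches the true label."""
    return float(np.mean(((X @ w + b) > 0) == (y == 1)))


def iterative_selection(X_tr, y_tr, X_te, y_te, keep_fraction=0.5, min_genes=2):
    """Refit on a shrinking gene set; return [(gene_indices, test_accuracy), ...]."""
    genes = np.arange(X_tr.shape[1])
    history = []
    while True:
        w, b = fit_logistic(X_tr[:, genes], y_tr)
        history.append((genes.copy(), accuracy(X_te[:, genes], y_te, w, b)))
        if len(genes) <= min_genes:
            break
        # Assumed pruning rule: keep genes with the largest |coefficient|.
        n_keep = max(min_genes, int(len(genes) * keep_fraction))
        top = np.argsort(-np.abs(w))[:n_keep]
        genes = genes[np.sort(top)]
    return history


# Synthetic stand-in for an expression matrix: 400 cells x 64 genes, where
# only genes 0-3 carry a modest shift between the two cell states.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=400).astype(float)
X = rng.normal(size=(400, 64))
X[:, :4] += 1.0 * y[:, None]

history = iterative_selection(X[:200], y[:200], X[200:], y[200:])
for genes, acc in history:
    print(len(genes), round(acc, 3))
```

Each entry of `history` is a candidate gene set with its accuracy, which is the kind of size-versus-performance pair that a Pareto front optimization would then compare.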

iLR is freely available on GitHub (https://github.com/maclean-lab/iLR) and Zenodo (https://zenodo.org/records/17728797).

## Linked entities

- **Chemicals:** entinostat (PubChem CID 4261)
- **Diseases:** autism spectrum disorder (MONDO:0005258)
- **Species:** Mus musculus (taxon 10090), Homo sapiens (taxon 9606)

## Full-text entities

- **Diseases:** autism spectrum disorder (MESH:D000067877), tumor (MESH:D009369)
- **Chemicals:** entinostat (MESH:C118739)
- **Species:** Mus musculus (house mouse, species) [taxon 10090], Homo sapiens (human, species) [taxon 9606]

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12925251/full.md

---
Source: https://tomesphere.com/paper/PMC12925251