# Counting cells can accurately predict small-molecule bioactivity benchmarks

**Authors:** Srijit Seal, William Dee, Adit Shah, Natacha Cerisier, Andrew Zhang, Esteban Miglietta, Katherine Titterton, Ángel Alexander Cabrera, Daniil Boiko, Alex Beatson, Gregory Slabaugh, Olivier Taboureau, Jordi Carreras Puigvert, Shantanu Singh, Ola Spjuth, Andreas Bender, Anne E. Carpenter

PMC · DOI: 10.1038/s41467-026-68725-5 · Nature Communications · 2026-02-06

## TL;DR

This paper shows that counting cells alone predicts compound activity in many assays from widely used benchmarks, which makes those benchmarks less useful for judging whether advanced methods such as phenotypic profiling add information.

## Contribution

The paper introduces new guidelines and curated benchmarks to better assess the value of advanced profiling methods in bioactivity prediction.

## Key findings

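- Many assays in widely used bioactivity benchmarks are well predicted by cell count alone, either because they measure cell health and cytotoxicity directly or because their active compounds change cell count while inactives do not.
- On a benchmark of 24 protein-target assays, models using Cell Painting image-based profiles outperformed the cell-count baseline.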
- Filtering benchmarks and including cell-count baselines improves evaluation of bioactivity prediction methods.
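To make the idea of a cell-count baseline concrete, here is a minimal sketch of how one might score a single assay using cell count as the only feature. It is not the authors' pipeline. The function names, the `reference_count` parameter (a stand-in for the typical cell count in untreated control wells), and the toy numbers are all assumptions made for illustration.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Rank-based AUROC (Mann-Whitney U statistic); ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive compound")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def cell_count_baseline_auroc(
    cell_counts: Sequence[float],
    labels: Sequence[int],
    reference_count: float,
) -> float:
    """Score each compound by how far its cell count deviates from a reference.

    Assumption (not from the paper): actives shift cell count in either
    direction, so the absolute deviation from `reference_count` is the score.
    """
    scores = [abs(c - reference_count) for c in cell_counts]
    return auroc(scores, labels)


# Hypothetical toy assay: three inactives near the reference, three actives below it.
toy_counts = [980, 1010, 400, 350, 995, 620]
toy_labels = [0, 0, 1, 1, 0, 1]
baseline = cell_count_baseline_auroc(toy_counts, toy_labels, reference_count=1000)
print(f"cell-count baseline AUROC: {baseline:.2f}")
```

Any richer model (for example, one trained on image-based profiles) would then be compared against this number for the same assay and the same compound split, rather than against chance.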

## Abstract

Accurately predicting the activity of a chemical in each bioactivity assay based on its already known properties is extremely useful in drug development. Unfortunately, we discovered that many assays in widely used assay-activity benchmark datasets directly relate to cell health and cytotoxicity. Many other assays intend to capture a more specific phenotype, but their active compounds impact cell count, while inactives do not. In both cases, counting cells achieves unexpectedly high performance in these benchmarks, making them less useful for discerning whether additional properties, such as phenotypic profiles (mRNA or Cell Painting), provide additional useful information on bioactivity. To accomplish this goal, we recommend filtering benchmarks to exclude such assays and including a cell-count baseline. Using a benchmark with 24 protein-target assays, we confirm that models leveraging Cell Painting image-based profiles outperformed the baseline cell count model. We propose several other practical recommendations for benchmarking machine learning models for predicting bioactivity and assessing the added value of mRNA, protein, or image-based profiles.

Bioactivity benchmarks are key to evaluating and improving methods to predict chemicals’ specific biological activity. Here, the authors find existing benchmarks poorly suit this goal; their assays are often well-predicted simply by counting cells. They propose new guidelines and curated benchmarks.
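The two recommendations in the abstract, filtering out assays that cell count already explains and reporting a cell-count baseline, can be sketched as a small post-processing step over per-assay scores. This is an illustrative sketch only. The `AssayResult` type, the AUROC cutoff of 0.7, and the example values are hypothetical. The paper describes excluding assays related to cell health and cytotoxicity, and thresholding the baseline score is just one assumed proxy for that.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AssayResult:
    name: str
    baseline_auroc: float  # cell-count-only model
    model_auroc: float     # model using richer features, e.g. image-based profiles


def split_by_baseline(
    results: List[AssayResult], max_baseline_auroc: float = 0.7
) -> Tuple[List[AssayResult], List[AssayResult]]:
    """Separate assays the baseline already predicts well from the rest.

    The 0.7 cutoff is an arbitrary, hypothetical choice, not a value from the paper.
    """
    kept = [r for r in results if r.baseline_auroc < max_baseline_auroc]
    flagged = [r for r in results if r.baseline_auroc >= max_baseline_auroc]
    return kept, flagged


def mean_gain_over_baseline(results: List[AssayResult]) -> float:
    """Average improvement of the richer model over the cell-count baseline."""
    if not results:
        raise ValueError("no assays to summarize")
    return sum(r.model_auroc - r.baseline_auroc for r in results) / len(results)


# Hypothetical per-assay scores, made up for illustration.
results = [
    AssayResult("assay_A", baseline_auroc=0.55, model_auroc=0.80),
    AssayResult("assay_B", baseline_auroc=0.90, model_auroc=0.92),
    AssayResult("assay_C", baseline_auroc=0.60, model_auroc=0.75),
]
kept, flagged = split_by_baseline(results)
print("kept:", [r.name for r in kept])
print("flagged as cell-count driven:", [r.name for r in flagged])
print(f"mean gain over baseline on kept assays: {mean_gain_over_baseline(kept):.2f}")
```

Reporting the gain over the baseline, rather than the raw model score, keeps an assay like the hypothetical `assay_B` from making a model look strong when cell count alone already does most of the work.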

## Full-text entities

- **Diseases:** cytotoxicity (MESH:D064420)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12988037/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12988037/full.md

## References

19 references — full list in the complete paper: https://tomesphere.com/paper/PMC12988037/full.md

---
Source: https://tomesphere.com/paper/PMC12988037