# RP3Net: a deep learning model for predicting recombinant protein production in Escherichia coli

**Authors:** Evgeny Tankhilevich, Sergio Martinez Cuesta, Ian Barrett, Carolina Berg, Lovisa Holmberg Schiavone, Andrew R Leach

PMC · DOI: 10.1093/bioinformatics/btag003 · Bioinformatics · 2026-01-11

## TL;DR

RP3Net is a deep learning model that predicts small-scale soluble recombinant protein expression in E. coli. On a prospective test set of 97 constructs it reached an AUROC of 0.83, outperforming currently available models.

## Contribution

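RP3Net is an AI model, built on recent protein and genomic foundation models, that predicts small-scale heterologous soluble protein expression in E. coli. It was trained, validated and tested on curated internal data from AstraZeneca and public data from the Structural Genomics Consortium, and it outperformed currently available models on a prospective test set.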

## Key findings

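- On an independent, prospective, manually selected test set of 97 constructs, RP3Net achieved an AUROC of 0.83, outperforming currently available models.
- On the same set, predictions were accurate in 77% of cases, and successfully expressing constructs were correctly identified in 92% of cases.
- RP3Net improved AUROC by 0.15 over a baseline model.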

## Abstract

Recombinant protein expression can be a limiting step in the production of protein reagents for drug discovery and other biotechnology applications. We introduce RP3Net (Recombinant Protein Production Prediction Network), an AI model of small-scale heterologous soluble protein expression in Escherichia coli. RP3Net utilizes the most recent protein and genomic foundational models. A curated dataset of internal experimental results from AstraZeneca and publicly available data from the Structural Genomics Consortium was used for training, validation and testing of RP3Net.

RP3Net achieves an increase in area under the receiver operating characteristic curve (AUROC) of 0.15, compared to a baseline model. When experimentally validated on an independent, prospective, manually selected set of 97 constructs, RP3Net outperformed currently available models, with an AUROC of 0.83, delivering accurate predictions in 77% of the cases, and correctly identifying successfully expressing constructs in 92% of cases.
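For readers less familiar with the metrics quoted above, the following is a minimal sketch of how AUROC, accuracy, recall and precision can be computed for a binary "expressed / not expressed" classifier. It is not the authors' evaluation code: the function names, the 0.5 decision threshold and the eight toy labels and scores are all assumptions made for illustration, and none of the values come from the paper. The abstract does not say whether the 92% figure is a recall or a precision, so the sketch reports both.

```python
from typing import Dict, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen positive
    scores higher than a randomly chosen negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def threshold_metrics(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Accuracy, recall and precision after binarising scores at `threshold`
    (the threshold value is an illustrative assumption)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(labels),
        # Share of truly expressing constructs that were predicted to express.
        "recall": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Share of constructs predicted to express that truly expressed.
        "precision": tp / (tp + fp) if (tp + fp) else float("nan"),
    }


# Toy data (hypothetical, not from the paper): 1 = soluble expression observed.
toy_labels = [1, 1, 1, 0, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

print(auroc(toy_labels, toy_scores))
print(threshold_metrics(toy_labels, toy_scores))
```

AUROC is independent of any decision threshold, whereas accuracy, recall and precision all depend on where the cut-off is placed, which is why the two kinds of figure are reported separately.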

The model, along with installation and running instructions, is available under an MIT licence at https://github.com/RP3Net/RP3Net, DOI 10.5281/zenodo.17243498.

## Linked entities

- **Species:** Escherichia coli (taxon 562)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12857573/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12857573/full.md

## References

74 references — full list in the complete paper: https://tomesphere.com/paper/PMC12857573/full.md

---
Source: https://tomesphere.com/paper/PMC12857573