# A recurrent neural network for soft sensor development using CHO stable pools in fed‐batch process for SARS‐CoV‐2 spike protein production as a vaccine antigen

**Authors:** Sebastian‐Juan Reyes, Robert Voyer, Yves Durocher, Olivier Henry, Phuong Lan Pham

PMC · DOI: 10.1002/btpr.70046 · Biotechnology Progress · 2025-06-02

## TL;DR

This paper introduces a recurrent-neural-network soft sensor that predicts next-day product titer, cell growth, and metabolite levels in a 17-day CHO fed-batch process producing SARS-CoV-2 spike protein as a vaccine antigen.

## Contribution

A soft sensor based on a recurrent neural network is developed that merges current-day online bioreactor data with historical offline sampling data to predict next-day process variables during SARS-CoV-2 spike protein production.

## Key findings

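- The model tracked product titer, cell growth, key metabolites, and cumulative glucose consumption across the 17-day process with low normalized errors (nRMSE = 0.24, nMAE = 0.18).
- Predictions were highly linear with respect to measured data (average R² = 0.97), and the same architecture could soft-sense product titer and metabolite profiles (glucose, lactate, ammonia) without the sampling day's offline data as inputs.
- Predicted instantaneous specific glucose consumption rates agreed well with experimental measurements, opening opportunities for online glucose control.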
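
The error and linearity metrics named above can be illustrated with a minimal Python sketch. This summary does not state how the paper normalizes its errors, so dividing by the standard deviation of the measurements is an assumption here, as is computing R² as the squared Pearson correlation. The function names and sample values are hypothetical.

```python
import math

def _mean(values):
    return sum(values) / len(values)

def _std(values):
    # Population standard deviation of the measurements.
    mu = _mean(values)
    return math.sqrt(_mean([(v - mu) ** 2 for v in values]))

def nrmse(measured, predicted):
    """RMSE divided by the std of the measurements (assumed normalization)."""
    rmse = math.sqrt(_mean([(m - p) ** 2 for m, p in zip(measured, predicted)]))
    return rmse / _std(measured)

def nmae(measured, predicted):
    """MAE divided by the std of the measurements (assumed normalization)."""
    mae = _mean([abs(m - p) for m, p in zip(measured, predicted)])
    return mae / _std(measured)

def r_squared(measured, predicted):
    """Squared Pearson correlation between measured and predicted values."""
    mu_m, mu_p = _mean(measured), _mean(predicted)
    cov = sum((m - mu_m) * (p - mu_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mu_m) ** 2 for m in measured)
    var_p = sum((p - mu_p) ** 2 for p in predicted)
    return cov ** 2 / (var_m * var_p)

# Hypothetical daily values in arbitrary units, for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.2, 1.9, 3.1, 3.8]
print(nrmse(measured, predicted), nmae(measured, predicted), r_squared(measured, predicted))
```

With a standard-deviation normalization, a value below 1 means the typical error is smaller than the natural spread of the measured variable. Other conventions (range or mean normalization) would give different numbers for the same predictions.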

## Abstract

Fed‐batch recombinant therapeutic protein (RTP) production processes utilizing Chinese Hamster Ovary (CHO) cells can take a long period of time (>10 days). Within this period, not all critical features may be measured routinely, and in fact, some are only measured once the process is terminated, complicating decision making. As a consequence, utilizing routine current day bioreactor online data to aid in next day predictions is a promising strategy for model predictive control‐based feeding strategies. The article details the development of a proposed soft sensor that merges current day bioreactor online data and offline historical sampling data to generate predictions about the next day of the production process. This approach demonstrated the ability to track product titer, cell growth, key metabolites, and cumulative glucose consumption across the 17‐day process with low normalized root mean squared error (nRMSE = 0.24) and low normalized mean absolute error (nMAE = 0.18) as well as high linearity with respect to ground data (average R² = 0.97). It was also demonstrated that the same model architecture could effectively soft sense product titer and metabolic profiles (glucose, lactate, ammonia) without having sampling day's offline data as inputs to the model. This suggests that the proposed model could act as a true soft sensor of hard‐to‐determine variables such as the trimeric SARS‐CoV‐2 spike protein that relies on end‐of‐process measurements to acquire the data (labor‐intensive semi‐quantitative SDS‐PAGE gels or ELISA assay). Instantaneous specific glucose consumption rates were also predicted and showed good agreement with experimental measurements, further offering opportunities for online glucose control.
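
To make the specific glucose consumption rate concrete, the sketch below uses a common textbook estimate between two consecutive samples: glucose consumed divided by the integral of viable cell density (trapezoidal rule). The paper's own calculation is not given in this summary, so the formula, the function name, the neglect of volume changes, and the example numbers are all assumptions.

```python
def specific_glucose_consumption(glc_prev, glc_now, glc_fed,
                                 vcd_prev, vcd_now, dt_days):
    """Estimate q_glc between two samples (hypothetical helper).

    glc_prev, glc_now: glucose concentration at each sample, mmol/L
    glc_fed: concentration increase caused by feeding in the interval, mmol/L
    vcd_prev, vcd_now: viable cell density at each sample, 1e6 cells/mL
    dt_days: time between samples, days

    Returns pmol/cell/day. With these units, mmol/L divided by
    (1e6 cells/mL) equals pmol/cell, so no extra conversion factor is needed.
    Assumes culture volume changes are negligible over the interval.
    """
    consumed = glc_prev + glc_fed - glc_now        # mmol/L used by the cells
    ivcd = 0.5 * (vcd_prev + vcd_now) * dt_days    # trapezoidal cell-days
    if ivcd <= 0:
        raise ValueError("integral of viable cell density must be positive")
    return consumed / ivcd

# Hypothetical values: glucose drops from 30 to 20 mmol/L despite a
# 5 mmol/L feed, while cells grow from 8e6 to 12e6 cells/mL over one day.
q_glc = specific_glucose_consumption(30.0, 20.0, 5.0, 8.0, 12.0, 1.0)
print(q_glc)
```

In a soft-sensing setting, the measured concentrations and cell densities in such a calculation would be replaced by model predictions, which is what would let a controller act on the rate without waiting for offline samples.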

## Linked entities

- **Chemicals:** glucose (PubChem CID 5793), lactate (PubChem CID 61503), ammonia (PubChem CID 222)
- **Diseases:** SARS-CoV-2 (MONDO:0100096)

## Full-text entities

- **Genes:** S (surface glycoprotein) [NCBI Gene 43740568] {aka spike glycoprotein}
- **Chemicals:** glucose (MESH:D005947), lactate (MESH:D019344), RTP (-), SDS (MESH:D012967), ammonia (MESH:D000641)
- **Species:** Severe acute respiratory syndrome coronavirus 2 (no rank) [taxon 2697049]
- **Cell lines:** CHO — Cricetulus griseus (Chinese hamster), Spontaneously immortalized cell line (CVCL_0213)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12531939/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12531939/full.md

## References

90 references — full list in the complete paper: https://tomesphere.com/paper/PMC12531939/full.md

---
Source: https://tomesphere.com/paper/PMC12531939