# Machine learning prediction and calibration of cellulose-based solid-phase extraction performance for pharmaceuticals across aqueous matrices

**Authors:** Ephriam Akor, Damilare Olorunnisola, Moses O. Alfred, Onome Ejeromedoghene, Martins O. Omorogie

PMC · DOI: 10.1039/d5ra09776b · RSC Advances · 2026-03-04

## TL;DR

This paper uses machine learning to predict and calibrate the performance of cellulose-based solid-phase extraction for pharmaceuticals in different water matrices.

## Contribution

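The study compiles 637 experiments reported between 2015 and 2025 and introduces a machine learning framework, evaluated with study-grouped nested cross-validation and conformal prediction, to predict and calibrate solid-phase extraction performance across diverse water matrices.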

## Key findings

- ElasticNet dominated the sensitivity endpoints (enrichment factor, limit of detection, and limit of quantification), with a mean R² above 0.999 for each.
- Random forest had the strongest correlation for recovery and matrix recovery ratio but remained weakly predictive, with negative mean R² (about −0.52 and −1.03, respectively).
- Decision maps were developed to guide method validation and transfer based on matrix descriptors.
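
A note on reading the negative R² values reported in the abstract, assuming the standard coefficient of determination (this summary does not state the paper's exact definition):

$$
R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}
$$

Here $y_i$ are the observed values, $\hat{y}_i$ the model predictions, and $\bar{y}$ the mean of the observed values. $R^2$ becomes negative whenever the residual sum of squares exceeds the total sum of squares, meaning the model predicts held-out data worse than a constant equal to the mean. Under this definition, $R^2 \approx -0.52$ corresponds to a squared error about $1.52$ times that of the mean baseline, and $R^2 \approx -1.03$ to about $2.03$ times.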

## Abstract

Cellulose-based solid-phase extraction has been increasingly proposed for concentrating trace pharmaceuticals from complex waters; however, cross-laboratory transfer remains uncertain because studies vary in matrix chemistry, sorbent functionalization, extraction format, elution strategy, and quality control. Evidence from 2015 to 2025 was gathered, and 637 experiments from 36 reports and 28 DOIs were modelled using 29 descriptors of method and matrix. ElasticNet (EN), XGBoost (XGB), and random forest regressor (RFR) were evaluated using study-grouped nested cross-validation with conformal prediction to estimate out-of-study performance and 90% confidence intervals for recovery, matrix recovery ratio (MRR), enrichment factor (EF), limit of detection (LOD), and limit of quantification (LOQ). ElasticNet dominated the sensitivity endpoints, achieving a mean R² of 0.99999 for the enrichment factor, 0.99985 for the limit of detection, and 0.99914 for the limit of quantification, with mean 90% interval widths of 0.300, 44.386, and 829.752, respectively. For recovery and matrix recovery ratio, random forest had the strongest correlation but remained weakly predictive, with top settings yielding a mean R² of about −0.52 and MAE of about 15.53 for the recovery and a mean R² of about −1.03 and MAE of about 21.39 for the matrix recovery ratio, with 90% confidence intervals of 0.651, most pronounced for wastewater and river matrices. Decision maps were used to translate these contrasts into operating guidance and reporting priorities for matrix descriptors needed to support defensible local validation and method transfer.
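
The evaluation idea in the abstract (holding out whole studies, then attaching 90% intervals by conformal prediction) can be illustrated with a minimal standard-library sketch. This is not the authors' pipeline: the function names, the constant mean predictor standing in for EN/XGB/RFR, the choice of every other training study for calibration, and the synthetic data are all assumptions made for illustration, and the inner hyperparameter-tuning loop of nested cross-validation is omitted.

```python
import math
import random


def group_k_fold(groups, n_splits):
    """Yield (train_idx, test_idx) so that each study sits on one side only."""
    studies = sorted(set(groups))
    fold_of = {s: i % n_splits for i, s in enumerate(studies)}
    for k in range(n_splits):
        test = [i for i, g in enumerate(groups) if fold_of[g] == k]
        train = [i for i, g in enumerate(groups) if fold_of[g] != k]
        yield train, test


def conformal_halfwidth(residuals, alpha=0.10):
    """Split-conformal half-width: the ceil((n+1)(1-alpha))-th smallest |residual|."""
    scores = sorted(abs(r) for r in residuals)
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for a finite interval at this level.
        return math.inf
    return scores[rank - 1]


def grouped_conformal_coverage(y, groups, n_splits=3, alpha=0.10):
    """Fraction of held-out points that fall inside their conformal interval."""
    covered, total = 0, 0
    for train, test in group_k_fold(groups, n_splits):
        # Assumption: reserve every other training study for calibration.
        train_studies = sorted({groups[i] for i in train})
        calib_studies = set(train_studies[::2])
        fit = [i for i in train if groups[i] not in calib_studies]
        calib = [i for i in train if groups[i] in calib_studies]
        # Placeholder model (hypothetical): predict the mean of the fit set.
        mu = sum(y[i] for i in fit) / len(fit)
        q = conformal_halfwidth([y[i] - mu for i in calib], alpha)
        covered += sum(1 for i in test if abs(y[i] - mu) <= q)
        total += len(test)
    return covered / total


if __name__ == "__main__":
    # Synthetic recoveries (%) for six made-up studies, ten points each.
    rng = random.Random(0)
    groups = [f"study_{s}" for s in range(6) for _ in range(10)]
    y = [80 + 3 * int(g[-1]) + rng.gauss(0, 5) for g in groups]
    print("empirical coverage:", grouped_conformal_coverage(y, groups))
```

Splitting by study rather than by row is what makes the score an out-of-study estimate: rows from one report share matrix, sorbent, and protocol, so a row-wise split would leak that information into training.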


## Full-text entities

- **Chemicals:** Cellulose (MESH:D002482), hydrogen (MESH:D006859), heavy metal (MESH:D019216), spike (MESH:C010346), DOC (-), water (MESH:D014867), carbon (MESH:D002244), salts (MESH:D012492)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12958899/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12958899/full.md

## References

95 references — full list in the complete paper: https://tomesphere.com/paper/PMC12958899/full.md

---
Source: https://tomesphere.com/paper/PMC12958899