# Improving Protein Quantification with SERS Superspectra and Machine Learning

**Authors:** Jiaheng Cui, Chenyao Feng, Xulan Chen, Yanjun Yang, Pengju Yin, Yiping Zhao

PMC · DOI: 10.1021/acsomega.6c00157 · ACS Omega · 2026-02-04

## TL;DR

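This paper introduces a superspectra-guided SERS framework that concatenates spectra from chemically distinct silver nanorod surfaces and applies machine-learning regression to improve protein quantification, using bovine serum albumin (BSA) as the model protein.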

## Contribution

The study establishes design principles for constructing effective SERS superspectra and demonstrates improved protein quantification using complementary surface chemistries.

## Key findings

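- Superspectra from complementary surface chemistries, especially the cysteamine and cysteine pair (CM&CN) or the triplet that adds unmodified AgNRs (B&CM&CN), markedly improve quantitative predictions.
- Random forest regression (RFR) consistently outperforms support vector regression (SVR) when integrating chemically heterogeneous spectral inputs.
- Including all four surfaces in a superspectrum often degrades accuracy because of noninformative or conflicting features, particularly those introduced by 6-mercapto-1-hexanol (MCH).
- Single-substrate spectra lack sufficient chemical diversity for accurate quantification.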

## Abstract

Quantitative protein analysis by surface-enhanced Raman spectroscopy (SERS) remains challenging due to weak and heterogeneous protein adsorption on plasmonic surfaces. Here, we introduce a superspectra-guided SERS framework that leverages chemically distinct interaction environments to enhance quantitative performance. Silver nanorod (AgNR) substrates were functionalized with cysteamine (CM), cysteine (CN), and 6-mercapto-1-hexanol (MCH), together with unmodified (B) AgNRs, to create surfaces that probe complementary aspects of protein–surface interactions through charge- and chemistry-dependent binding. Using bovine serum albumin (BSA) as a model protein, we systematically constructed superspectra by concatenating SERS signals from all single-, pairwise-, triple-, and four-surface combinations and evaluated their performance using support vector regression (SVR) and random forest regression (RFR). Our results reveal that superspectra must be constructed selectively: single-substrate spectra lack sufficient chemical diversity, and superspectra incorporating all four surfaces often degrade accuracy due to noninformative or conflicting features, particularly those introduced by MCH. In contrast, superspectra derived from complementary surface chemistries, especially the CM&CN pair or the B&CM&CN triplet, yield markedly improved quantitative predictions. RFR consistently outperformed SVR, demonstrating superior robustness for integrating chemically heterogeneous spectral inputs. This work establishes, for the first time, design principles for constructing effective superspectra for protein SERS and highlights the importance of analyte–surface interaction complementarity in enabling accurate, scalable protein quantification.
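
The combination scheme described in the abstract can be illustrated with a minimal sketch. It assumes each surface yields one intensity vector per sample and that a superspectrum is the plain end-to-end concatenation of those vectors. The function names `surface_combinations` and `build_superspectrum` and the intensity values are hypothetical placeholders, not the authors' code or data. Any preprocessing the paper applies before concatenation is not covered by this summary and is omitted here.

```python
from itertools import combinations

# Surface labels as used in the abstract: unmodified (B), cysteamine (CM),
# cysteine (CN), and 6-mercapto-1-hexanol (MCH).
SURFACES = ("B", "CM", "CN", "MCH")


def surface_combinations(surfaces=SURFACES):
    """Return every non-empty subset of the surfaces, ordered by subset size."""
    return [
        combo
        for size in range(1, len(surfaces) + 1)
        for combo in combinations(surfaces, size)
    ]


def build_superspectrum(spectra_by_surface, combo):
    """Concatenate one sample's per-surface spectra in the order given by combo."""
    superspectrum = []
    for surface in combo:
        superspectrum.extend(spectra_by_surface[surface])
    return superspectrum


# Placeholder intensities (NOT measured data): three spectral bins per surface.
sample = {
    "B": [0.1, 0.2, 0.3],
    "CM": [0.4, 0.5, 0.6],
    "CN": [0.7, 0.8, 0.9],
    "MCH": [1.0, 1.1, 1.2],
}

combos = surface_combinations()
# 4 singles + 6 pairs + 4 triplets + 1 four-surface set = 15 combinations
print(len(combos))
# The CM&CN superspectrum is the CM vector followed by the CN vector.
print(build_superspectrum(sample, ("CM", "CN")))
```

In a workflow like the one the abstract describes, each of the 15 feature sets would then be passed to a regressor such as SVR or RFR and compared on held-out concentrations. That step is left out of this sketch because the summary does not state the model settings or the validation protocol.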

## Linked entities

- **Chemicals:** cysteamine (PubChem CID 6058), cysteine (PubChem CID 594), 6-mercapto-1-hexanol (PubChem CID 560126)

## Full-text entities

- **Genes:** CRP (C-reactive protein) [NCBI Gene 527553], ALB (albumin) [NCBI Gene 280717]
- **Chemicals:** lysine (MESH:D008239), PDMS (MESH:C013830), lipids (MESH:D008055), CN (MESH:D003545), amino acids (MESH:D000596), titanium (MESH:D014025), Thiol (MESH:D013438), Asp (MESH:D001224), arginine (MESH:D001120), Phe (MESH:D010649), amine (MESH:D000588), AgNR (-), SA (MESH:D000077145), Glu (MESH:D018698), 6-mercapto-1-hexanol (MESH:C503488), Ag (MESH:D012834), B (MESH:D001895), water (MESH:D014867), Tyr (MESH:D014443), CM (MESH:D003543), amide (MESH:D000577), N (MESH:D009584), C (MESH:D002244), COO (MESH:C041069), NH3 (MESH:D000641), salts (MESH:D012492)
- **Species:** Bos taurus (bovine, species) [taxon 9913]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12917838/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12917838/full.md

## References

57 references — full list in the complete paper: https://tomesphere.com/paper/PMC12917838/full.md

---
Source: https://tomesphere.com/paper/PMC12917838