# Fluorescence Guided Raman Spectroscopy enables the training of robust support vector machines for the detection of tumour marker proteins

**Authors:** Johannes Reifenrath, Benjamin Gardner, Alexander Gigler, Friederike Liesche-Starnecker, Suzy Eldershaw, Nick Stone, Jürgen Schlegel

PMC · DOI: 10.1038/s41598-025-08425-0 · 2025-07-03

## TL;DR

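This paper introduces Fluorescence Guided Raman Spectroscopy (FGRS), which uses a Raman-compatible fluorescent tag to isolate the spectral signature of a tumour marker protein (connexin 43) and to train support vector machines that detect it across cell lines.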

## Contribution

The contribution is Fluorescence Guided Raman Spectroscopy (FGRS), a methodology that isolates a protein's spectral signature and enables the training of classifiers that generalize across cell lines.

## Key findings

- mTagBFP2 was identified as the most Raman-compatible fluorophore for use with a 532 nm Raman system.
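- Support vector machines trained on Raman spectra classified cells by connexin 43 content with up to 79% accuracy on unseen glioblastoma lines; full-spectrum models reached 98.7% sensitivity.
- Connexin 43 expression led to a loss of the peaks at 600, 1253, and 1401 cm⁻¹, consistent with the increased α-helical content predicted by I-TASSER.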
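
To see why emission near the excitation line matters, it can help to convert Raman shifts into absolute wavelengths. The sketch below uses the standard relation between a Stokes shift in cm⁻¹ and the scattered wavelength, applied to the 532 nm excitation and the shift positions quoted in the abstract. The function and variable names are illustrative and are not taken from the paper.

```python
def raman_shift_to_wavelength_nm(excitation_nm: float, shift_cm1: float) -> float:
    """Return the absolute wavelength (nm) of Stokes-scattered light.

    Uses 1/lambda_scattered = 1/lambda_excitation - shift, with all
    wavenumbers expressed in cm^-1 (1 cm = 1e7 nm).
    """
    excitation_wavenumber = 1e7 / excitation_nm          # cm^-1
    scattered_wavenumber = excitation_wavenumber - shift_cm1
    if scattered_wavenumber <= 0:
        raise ValueError("Raman shift exceeds the excitation photon energy")
    return 1e7 / scattered_wavenumber


# Excitation wavelength and shift positions as quoted in the abstract.
EXCITATION_NM = 532.0
REPORTED_SHIFTS_CM1 = (600, 1253, 1401)

stokes_wavelengths = {
    shift: raman_shift_to_wavelength_nm(EXCITATION_NM, shift)
    for shift in REPORTED_SHIFTS_CM1
}

for shift, wavelength in stokes_wavelengths.items():
    print(f"{shift} cm^-1 -> {wavelength:.1f} nm")
```

Under this conversion, the quoted shifts lie roughly between 549 and 575 nm, just to the red of the 532 nm line. A fluorophore that emits in this region would overlap the Raman signal, which fits the abstract's remark that fluorophores emitting near 532 nm caused spectral interference.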

## Abstract

Raman spectroscopy provides comprehensive biochemical information on a sample’s composition, yet it is often used to analyze aggregated spectra rather than specific shifts. We introduce Fluorescence Guided Raman Spectroscopy (FGRS) as a methodology enabling the isolation of proteins’ spectral signatures and the training of classifiers that generalize across cell lines. We demonstrate the utility of this approach using connexin 43, a marker protein of glioblastoma tumour microtubes. By screening eGFP, sodium fluorescein, and mTagBFP2 for their compatibility with a Raman system operating at 532 nm, we selected mTagBFP2 as the most Raman-compatible fluorophore, whereas the other fluorophores emitting near 532 nm caused spectral interference. mTagBFP2 was cloned into a connexin 43 expression vector, allowing fluorescent tracking and Raman interrogation with subsequent peak identification and correlation to an I-TASSER protein prediction model. We then trained two support vector machines (SVMs) for the classification of cells based on their connexin 43 content and highlighted the impact of different spectral ranges (full spectrum vs. most significant Raman shifts) on specificity and sensitivity in glioblastoma target cell lines. Connexin 43 expression led to a loss of the peaks at 600, 1253, and 1401 cm⁻¹, consistent with an increased α-helical content as predicted by I-TASSER. SVMs achieved up to 79% accuracy on unseen glioblastoma lines, with full-spectrum models reaching 98.7% sensitivity. Thus, FGRS enables the spectral isolation of tumour marker proteins and the development of robust classifiers across cell lines. By focusing on key Raman shifts, this method holds the potential to improve diagnostic accuracy and sensitivity, offering a customizable tool for tumour detection.
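
The abstract reports accuracy and sensitivity separately, and the two can diverge considerably. The sketch below shows how these metrics, together with specificity, follow from a binary confusion matrix. The labels are made-up toy values chosen only to illustrate the definitions. They are not data from the paper, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Return sensitivity, specificity, and accuracy as a dict."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    return {
        # Fraction of truly positive cells that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Fraction of truly negative cells that were correctly rejected.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        # Fraction of all cells classified correctly.
        "accuracy": (tp + tn) / total if total else float("nan"),
    }


# Toy labels (illustrative only): 1 = marker-positive, 0 = marker-negative.
toy_truth = [1, 1, 1, 1, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 1, 1, 1, 0, 0]

toy_metrics = classification_metrics(toy_truth, toy_pred)
print(toy_metrics)
```

In this toy case every positive cell is detected, but half of the negative cells are misclassified. Sensitivity is therefore high while accuracy stays moderate, the same kind of trade-off the abstract describes when it compares the effect of different spectral ranges on specificity and sensitivity.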

The online version contains supplementary material available at 10.1038/s41598-025-08425-0.

## Linked entities

- **Genes:** CONNEXIN 43 (CONNEXIN 43 protein) [NCBI Gene 443455]
- **Proteins:** CONNEXIN 43 (CONNEXIN 43 protein)
- **Chemicals:** sodium fluorescein (PubChem CID 10608)
- **Diseases:** glioblastoma (MONDO:0018177)

## Full-text entities

- **Genes:** GJA1 (gap junction protein alpha 1) [NCBI Gene 2697] {aka AVSD3, CMDR, CX43, EKVP, EKVP3, GJAL}
- **Diseases:** tumour (MESH:D009369), glioblastoma (MESH:D005909)
- **Chemicals:** fluorophore (-), sodium fluorescein (MESH:D019793)

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12229507/full.md

---
Source: https://tomesphere.com/paper/PMC12229507