# Accelerated detection of Clostridioides difficile sequence type 37 by integrating MALDI-TOF mass spectrometry with artificial neural network

**Authors:** Liqian Wang, Keqing Zhang, Junjie Lao, Guangzhi Du, Xinghan Huang, Jie Wang, Xianjun Wang, Dazhi Jin, Yu Chen

PMC · DOI: 10.1128/spectrum.01728-25 · Microbiology Spectrum · 2025-11-13

## TL;DR

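An artificial neural network applied to MALDI-TOF mass spectra identifies Clostridioides difficile sequence type 37 (ST37), a strain associated with severe infection and antibiotic resistance, in about 10 seconds once species-level identification is complete.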

## Contribution

An artificial neural network (ANN) model trained on MALDI-TOF MS spectra enables rapid and accurate detection of C. difficile ST37.

## Key findings

- The ANN model achieved an area under the ROC curve of 0.96 and an area under the precision-recall curve of 0.94 for detecting C. difficile ST37 in the validation set.
- The model can classify C. difficile ST37 within approximately 10 seconds once species-level identification by MALDI-TOF MS is complete.
- Fifteen candidate biomarkers for ST37 were identified by their mass-to-charge ratios, spanning m/z 3,104 to 18,488.
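
The two headline metrics in the findings above can be computed from any set of true labels and model scores. The following is a minimal sketch in plain Python, not the authors' evaluation code: the function names are hypothetical, the toy labels and scores are invented for illustration, and the step-wise (average precision) estimate of the precision-recall area is an assumption, since the summary does not say which estimator the paper used.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    This equals the probability that a randomly chosen positive receives a
    higher score than a randomly chosen negative, with ties counted as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve (average precision).

    Assumes scores are untied; tied scores would make the result depend on
    the sort order.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank samples from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    true_positives = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            true_positives += 1
            total += true_positives / rank  # precision at this recall step
    return total / n_pos


# Invented toy data: 1 = ST37, 0 = non-ST37 (NOT taken from the paper).
TOY_LABELS = [1, 1, 0, 1, 0, 0, 1, 0]
TOY_SCORES = [0.95, 0.80, 0.70, 0.65, 0.40, 0.30, 0.20, 0.10]

if __name__ == "__main__":
    print("AUROC:", roc_auc(TOY_LABELS, TOY_SCORES))
    print("AUPRC:", round(average_precision(TOY_LABELS, TOY_SCORES), 3))
```

On the toy data this gives an AUROC of 0.75 and an average precision of about 0.830. With a positive class of roughly 31% of isolates, as in this data set, the precision-recall area is the more demanding of the two numbers, because a classifier with no skill would score near the positive-class prevalence rather than near 0.5.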

## Abstract

Rapid identification of Clostridioides difficile sequence type 37 (ST37), also known as RT017, is crucial due to its association with severe infections and antibiotic resistance. Existing methodologies are labor-intensive and costly. The modeling approach combining matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) with machine learning offers a promising alternative for fast and cost-effective subtyping. This work gathered 1,155 mass spectra representing 385 distinct clinical C. difficile isolates from multiple regions, including 118 ST37 isolates (30.65%) and 267 non-ST37 isolates (69.35%). An artificial neural network (ANN) model was created using MALDI-TOF MS data, trained on 80% of the data set and validated on the remaining 20%. The constructed ANN model demonstrated exceptional diagnostic precision and reliable generalizability, achieving an area under the receiver operating characteristic curve of 0.96 and an area under the precision–recall curve of 0.94 for detecting C. difficile ST37 in the validation set. Furthermore, we identified the top 15 potential biomarkers of C. difficile ST37 strains with mass-to-charge ratios of 6,729, 12,013, 12,012, 6,731, 7,296, 12,085, 14,716, 7,292, 3,104, 7,293, 16,966, 15,360, 7,259, 18,488, and 14,660 Da. Our model provides a rapid, reliable, and economical option for identifying C. difficile ST37. Once species-level identification is completed using MALDI-TOF MS, our ANN model enables rapid subtype classification of ST37 within approximately 10 seconds, significantly reducing overall turnaround time. This facilitates accurate clinical diagnosis, mitigating the risk of severe C. difficile infections.
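
Several of the 15 reported m/z values lie within a few units of one another (for example 6,729 and 6,731, or 7,292, 7,293, and 7,296). As a rough illustration of how such a list could be inspected, the sketch below groups values that fall within a fixed tolerance. The 5 Da tolerance and the function name are assumptions made for this example, and the source does not state whether neighbouring values correspond to the same underlying peak.

```python
from typing import List, Sequence

# The 15 m/z values listed in the abstract, in the order given there.
TOP15_MZ = [6729, 12013, 12012, 6731, 7296, 12085, 14716, 7292,
            3104, 7293, 16966, 15360, 7259, 18488, 14660]


def group_nearby_peaks(mz_values: Sequence[float],
                       tolerance_da: float = 5.0) -> List[List[float]]:
    """Group sorted m/z values whose gap to the previous value is small.

    A new group starts whenever the gap to the preceding value exceeds
    `tolerance_da` (a hypothetical threshold, not taken from the paper).
    """
    groups: List[List[float]] = []
    for mz in sorted(mz_values):
        if groups and mz - groups[-1][-1] <= tolerance_da:
            groups[-1].append(mz)  # close enough: extend the current group
        else:
            groups.append([mz])    # gap too large: start a new group
    return groups


if __name__ == "__main__":
    for group in group_nearby_peaks(TOP15_MZ):
        print(group)
```

With a 5 Da tolerance, the 15 values fall into 11 groups, three of which contain more than one value: 6,729 and 6,731; 7,292, 7,293, and 7,296; and 12,012 and 12,013. A different tolerance would change the grouping, so the result is only a reading aid for the list, not a claim about the number of distinct biomarkers.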

C. difficile ST37 (RT017) is a highly virulent strain that often causes severe infections and is frequently resistant to antibiotics such as fluoroquinolones and clindamycin, which are known to promote C. difficile infection. Rapid identification of this strain is essential to ensure timely clinical intervention and effective infection control. Current detection methods rely on lengthy and labor-intensive procedures, delaying treatment decisions. This study introduces a new, rapid identification method combining mass spectrometry with machine learning. The developed artificial neural network can accurately distinguish the ST37 strain in approximately 10 seconds, significantly reducing diagnostic time compared to traditional methods. Implementing this fast, reliable, and economical diagnostic tool in clinical laboratories will enhance patient care by facilitating quicker diagnosis and targeted therapy, thus minimizing the risk of severe complications associated with C. difficile infections.

## Linked entities

- **Species:** Clostridioides difficile (taxon 1496)

## Full-text entities

- **Diseases:** C. difficile infection (MESH:D003015), infection (MESH:D007239)
- **Chemicals:** clindamycin (MESH:D002981), fluoroquinolones (MESH:D024841)
- **Species:** Homo sapiens (taxon 9606), Clostridioides difficile (taxon 1496)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12772325/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12772325/full.md

## References

34 references — full list in the complete paper: https://tomesphere.com/paper/PMC12772325/full.md

---
Source: https://tomesphere.com/paper/PMC12772325