# Diagnosis of non-puerperal mastitis based on “whole tongue” features: non-invasive biomarker mining and diagnostic model construction

**Authors:** Siyuan Tu, Yulian Yin, Lina Ma, Hongfeng Chen, Meina Ye

PMC · DOI: 10.3389/fcimb.2025.1602883 · Frontiers in Cellular and Infection Microbiology · 2025-07-28

## TL;DR

This study combines clinical characteristics, tongue images, and tongue-coating microbiota data to build a non-invasive diagnostic model for non-puerperal mastitis (NPM); the best model, a GBDT, reached AUROC = 0.98 and accuracy = 0.95.

## Contribution

A novel non-invasive diagnostic framework for NPM combining tongue image analysis and microbiota profiling with machine learning.

## Key findings

- The GBDT model achieved high diagnostic accuracy (AUROC = 0.98, accuracy = 0.95).
- Combining clinical, image, and microbiota features improved model performance over models using a single class of features (AUROC = 0.90–0.91).
- Key predictors included Campylobacter, waist–hip ratio, and Alloprevotella.

## Abstract

Non-puerperal mastitis (NPM) arises from heterogeneous factors ranging from autoimmune dysregulation to occult infections. Biopsy is reliable for establishing a diagnosis but invasive, whereas imaging has limited specificity; these limitations may cause diagnostic delays, patient discomfort, and suboptimal management. Inspired by non-invasive tongue diagnosis in traditional Chinese medicine, this study integrated tongue-coating microbiota profiling and AI-quantified tongue image phenotyping to establish an objective, non-invasive diagnostic framework for NPM.

A total of 100 NPM patients from the Breast Surgery Department of Longhua Hospital and 100 healthy volunteers were included. Their clinical characteristics, tongue images, and tongue-coating microbiota data were collected. Tongue image features were quantified and extracted via deep learning (detection, segmentation, and classification). The microbiota composition was assessed using 16S rRNA gene sequencing (V3–V4 region) and bioinformatic pipelines (QIIME2, DADA2). Based on clinical, imaging, and microbial features, three machine learning models were trained to distinguish NPM patients from healthy volunteers: logistic regression (LR), support vector machine (SVM), and gradient boosting decision tree (GBDT).
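As a minimal sketch of how the three feature classes described above might be combined into one model input, the following Python example converts genus-level read counts to relative abundances and concatenates them with clinical and image features (early fusion). This is an illustration under stated assumptions, not the authors' pipeline: the function names, the `coating_area_fraction` feature, the use of relative abundance, and all numeric values are hypothetical.

```python
from typing import Dict, List, Tuple


def relative_abundance(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert raw per-taxon read counts into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        return {taxon: 0.0 for taxon in counts}
    return {taxon: n / total for taxon, n in counts.items()}


def fuse_features(
    clinical: Dict[str, float],
    image: Dict[str, float],
    microbiota: Dict[str, float],
) -> Tuple[List[str], List[float]]:
    """Concatenate three feature blocks into one named vector (early fusion)."""
    names: List[str] = []
    values: List[float] = []
    for prefix, block in (("clin", clinical), ("img", image), ("micro", microbiota)):
        for key in sorted(block):  # sorted keys keep column order stable across samples
            names.append(f"{prefix}:{key}")
            values.append(float(block[key]))
    return names, values


# Hypothetical values for one participant (illustrative only, not study data).
clinical = {"waist_hip_ratio": 0.85}
image = {"coating_area_fraction": 0.40}  # assumed image-derived feature name
genus_counts = {"Campylobacter": 30, "Alloprevotella": 70, "Other": 900}

names, values = fuse_features(clinical, image, relative_abundance(genus_counts))
for name, value in zip(names, values):
    print(f"{name} = {value:.3f}")
```

Prefixing each feature name with its source block keeps the origin of every column visible, which helps when a fitted model's feature importances are later traced back to clinical, image, or microbial inputs.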

The GBDT model achieved the best overall diagnostic performance (AUROC = 0.98, accuracy = 0.95, and specificity = 0.95). It matched the LR model on AUROC and accuracy but had higher specificity (LR: AUROC = 0.98, accuracy = 0.95, and specificity = 0.90), and it outperformed the SVM model on all three metrics (SVM: AUROC = 0.87, accuracy = 0.80, and specificity = 0.75). Integration of clinical characteristics, tongue image features, and bacterial profiles (at the genus/family level) yielded the highest accuracy, whereas models using a single class of features showed a lower discriminatory ability (AUROC = 0.90–0.91). Key predictors included Campylobacter (12%), waist–hip ratio (11%), and Alloprevotella (6%).
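For readers less familiar with the three metrics reported above, the following Python sketch shows how accuracy, specificity, and AUROC can be computed from labels and predicted scores. It uses the pairwise (Mann-Whitney) definition of AUROC and a 0.5 decision threshold, both of which are assumptions for illustration; the labels and scores are toy values, not study data, and the function names are hypothetical.

```python
from typing import List, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """True negatives divided by all actual negatives (label 0)."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tn / (tn + fp)


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive scores above a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: 1 = NPM, 0 = healthy volunteer (illustrative only).
y_true: List[int] = [1, 1, 1, 0, 0, 0]
scores: List[float] = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred: List[int] = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print("accuracy:", accuracy(y_true, y_pred))
print("specificity:", specificity(y_true, y_pred))
print("AUROC:", auroc(y_true, scores))
```

Because AUROC depends only on how scores rank positives against negatives, two models can share the same AUROC and accuracy yet differ in specificity at a given threshold, which is the pattern seen between the GBDT and LR results.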

By integrating clinical characteristics, tongue image features, and tongue-coating microbiota profiles, the multimodal GBDT model achieves high diagnostic accuracy, supporting its utility for early screening and diagnosis of NPM.

## Full-text entities

- **Diseases:** NPM (MESH:D008413), autoimmune dysregulation (MESH:C580192), infections (MESH:D007239)
- **Species:** Homo sapiens (human, species) [taxon 9606], Campylobacter (genus) [taxon 194], Alloprevotella (genus) [taxon 1283313]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12336138/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12336138/full.md

## References

80 references — full list in the complete paper: https://tomesphere.com/paper/PMC12336138/full.md

---
Source: https://tomesphere.com/paper/PMC12336138