# Machine Learning Modeling for Codonopsis Radix Quality Assessment Integrating Efficacy, Chemical Composition, and Macroscopic Traits

**Authors:** Xingyu Guo, Ziyue Song, Yunqi Sun, Chi Wang, Ruiqi Yang, Yonghong Yan

PMC · DOI: 10.3390/foods15040651 · 2026-02-11

## TL;DR

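This study builds an interpretable machine learning framework for assessing Codonopsis Radix quality by combining pharmacological efficacy, chemical composition (alcohol-soluble extract and polysaccharide contents), and electronic nose odor profiles.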

## Contribution

The novel contribution is an interpretable machine learning framework that integrates pharmacological, chemical, and electronic nose (odor) data for quality assessment.

## Key findings

- A classification model combining electronic nose data with machine learning algorithms discriminated Codonopsis Radix sample groups highly effectively in cross-validation.
- SHapley Additive exPlanations identified sensors S8, S15, S16, and S18 as key variables for classification.
- Regression models were also established to predict the alcohol-soluble extract and polysaccharide contents.
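
To make the SHAP finding concrete, the following is a minimal sketch of how Shapley values attribute a model output to individual features. It is not the authors' code: the sensor names S8, S15, and S16 come from the summary above, but `S1`, the weights, the interaction term, and the toy value function are all hypothetical. It computes exact Shapley values by brute force over feature orderings, whereas SHAP implementations typically approximate these values for real models.

```python
from itertools import permutations

# Hypothetical per-sensor contributions (illustrative only, not from the paper).
WEIGHTS = {"S8": 2.0, "S15": 1.0, "S16": 0.5, "S1": 0.0}
# Hypothetical bonus that applies only when S8 and S15 are both present.
INTERACTION = 1.0


def toy_value(coalition):
    """Toy model output when only the sensors in `coalition` are available."""
    total = sum(WEIGHTS[f] for f in coalition)
    if {"S8", "S15"} <= coalition:
        total += INTERACTION
    return total


def shapley_values(value_fn, features):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        coalition = set()
        for f in order:
            before = value_fn(frozenset(coalition))
            coalition.add(f)
            after = value_fn(frozenset(coalition))
            totals[f] += after - before
    return {f: totals[f] / len(orderings) for f in features}


phi = shapley_values(toy_value, list(WEIGHTS))
# Rank sensors by absolute attribution, largest first.
ranking = sorted(phi, key=lambda f: abs(phi[f]), reverse=True)
print(phi)
print(ranking)
```

In this toy setting the interaction bonus is split equally between S8 and S15, and the attributions sum to the output of the full model, which is the property that makes Shapley-based rankings of sensors interpretable.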

## Abstract

This study aimed to develop an intelligent quality assessment system for Codonopsis Radix based on machine learning modeling. First, Codonopsis Radix samples from six origins were grouped based on pharmacological and chemical indicators, integrating pharmacodynamic evaluations using impaired spleen and lung function animal models with compositional analysis of the alcohol-soluble extract and polysaccharide contents. Subsequently, an electronic nose was employed to objectively quantify their odor profiles. A machine learning-based modeling framework was constructed by integrating feature extraction, feature selection, and pattern recognition techniques. The classification model built by combining electronic nose data with machine learning algorithms demonstrated highly effective discriminatory capability in cross-validation. SHapley Additive exPlanations analysis identified sensors S8, S15, S16, and S18 as critical variables for classification. Concurrently, regression models were established to predict the alcohol-soluble extract and polysaccharide contents. Given the limited sample size, feature expansion and data augmentation strategies were applied exclusively to the training set to enhance model robustness. In summary, the proposed interpretable modeling approach, which integrates pharmacological efficacy, chemical composition, and electronic nose data, provides a referential technical pathway for similar studies.
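
The abstract notes that data augmentation was applied exclusively to the training set. The sketch below illustrates why that matters inside cross-validation: augmented copies are generated only from the training fold, so the held-out samples are always original measurements. Everything here is an assumption made for illustration: the Gaussian jitter, the nearest-centroid classifier, the synthetic two-feature data, and all function names are hypothetical and are not taken from the paper, which does not specify these details in the abstract.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle sample indices and split them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def augment(rows, labels, copies, sigma, rng):
    """Return the originals plus `copies` noisy duplicates of each row (assumed Gaussian jitter)."""
    out_x, out_y = [list(x) for x in rows], list(labels)
    for x, y in zip(rows, labels):
        for _ in range(copies):
            out_x.append([v + rng.gauss(0.0, sigma) for v in x])
            out_y.append(y)
    return out_x, out_y


def fit_centroids(rows, labels):
    """Nearest-centroid 'training': the mean feature vector of each class."""
    sums, counts = {}, {}
    for x, y in zip(rows, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest in squared Euclidean distance."""
    return min(centroids, key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])))


def cross_validate(rows, labels, k=3, copies=2, sigma=0.05, seed=0):
    """k-fold accuracy where augmentation touches the training fold only."""
    rng = random.Random(seed)
    correct = 0
    for test_idx in k_fold_indices(len(rows), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(rows)) if i not in held_out]
        # Augment AFTER the split, so no synthetic copy of a test sample leaks into training.
        train_x, train_y = augment(
            [rows[i] for i in train_idx], [labels[i] for i in train_idx], copies, sigma, rng
        )
        model = fit_centroids(train_x, train_y)
        # Evaluate on the untouched, original held-out samples.
        correct += sum(predict(model, rows[i]) == labels[i] for i in test_idx)
    return correct / len(rows)


# Synthetic stand-in for sensor responses: two well-separated hypothetical groups.
X = [[0.1, 0.2], [0.0, 0.3], [0.2, 0.1], [1.0, 1.1], [0.9, 1.2], [1.1, 0.9]]
y = ["A", "A", "A", "B", "B", "B"]
accuracy = cross_validate(X, y)
print(accuracy)
```

If augmentation were instead applied before splitting, noisy near-duplicates of a held-out sample could end up in the training fold and inflate the cross-validated score, which is a particular risk with small sample sizes like the one the abstract describes.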

## Full-text entities

- **Genes:** Gast (gastrin) [NCBI Gene 25320] {aka Gas, PPG34}
- **Diseases:** and lung function (MESH:D055370), Lung deficiency (MESH:D008171), impairment (MESH:D060825), chronic airway inflammation (MESH:D007249), injury to (MESH:D014947), gastric ulcers (MESH:D013276), hypoxemic (MESH:D012131), tachypnea (MESH:D059246), digestive dysfunction (MESH:D004066), bronchial mucosal epithelial hyperplasia (MESH:D017573), lethargy (MESH:D053609), cough (MESH:D003371), gastrointestinal hormone (MESH:D005767), coagulation (MESH:D001778), emaciation (MESH:D004614), impaired lung function (MESH:D003072), Bronchial mucosal injury (MESH:D001982), gastric mucosal injury (MESH:D013272), Impaired spleen function (MESH:D013160)
- **Chemicals:** ketones (MESH:D007659), Polysaccharide (MESH:D011134), Lobetyolin (MESH:C521561), ammonia (MESH:D000641), p (MESH:D010758), ethanol (MESH:D000431), water (MESH:D014867), phenol (MESH:D019800), hydrocarbon (MESH:D006838), acetone (MESH:D000096), amines (MESH:D000588), butane (MESH:C046888), Hematoxylin (MESH:D006416), hydrogen sulfide (MESH:D006862), CR (-), sulfuric acid (MESH:C033158), p(O2) (MESH:C093415), Alcohol (MESH:D000438), Paraformaldehyde (MESH:C003043), propane (MESH:D011407), CO2 (MESH:D002245)
- **Species:** Homo sapiens (human, species) [taxon 9606], Rattus norvegicus (brown rat, species) [taxon 10116]

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12939910/full.md

---
Source: https://tomesphere.com/paper/PMC12939910