# Cultivar Differentiation and Origin Tracing of Panax quinquefolius Using Machine Learning Model-Driven Comparative Metabolomics

**Authors:** Rongrong Zhou, Yikun Wang, Lanping Zhen, Bingbing Shen, Hongping Long, Luqi Huang

PMC · DOI: 10.3390/foods14081340 · 2025-04-14

## TL;DR

This study combines UHPLC-Q/TOF-MS-based comparative metabolomics with machine learning to distinguish wild from cultivated American ginseng, supporting origin tracing and quality evaluation.

## Contribution

A novel AI-driven comparative metabolomics strategy is introduced for cultivar differentiation and origin tracing of American ginseng.

## Key findings

- Wild and cultivated American ginseng show distinct metabolic phenotypes.
- Eight differential metabolites in ESI+ mode and three in ESI- mode were identified, including seven ginsenosides.
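- A Random Forest model built on the eight ESI+ differential metabolites achieved a 100% classification rate on both the test and validation sets for wild versus cultivated American ginseng.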
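
To make the classification step concrete, here is a minimal sketch of a random-forest-style ensemble in plain Python. It is not the authors' model or data: the eight features are synthetic stand-ins for the ESI+ marker intensities, the trees are deliberately limited to depth 1 (stumps), and every name and parameter (`fit_stump`, `fit_forest`, `n_trees`, `n_sub`, the class centers) is an assumption made for illustration.

```python
import random

def fit_stump(X, y, features):
    """Return (feature, threshold, left_label, right_label) with the fewest training errors."""
    best = None
    for f in features:
        values = sorted({row[f] for row in X})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            for left, right in ((0, 1), (1, 0)):
                errors = sum((left if row[f] <= t else right) != label
                             for row, label in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, f, t, left, right)
    if best is None:  # degenerate bootstrap sample: fall back to the majority class
        majority = int(2 * sum(y) >= len(y))
        return (features[0], 0.0, majority, majority)
    return best[1:]

def fit_forest(X, y, n_trees=25, n_sub=3, seed=0):
    """Bagging plus a random feature subset per tree, the two ingredients of a random forest."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]        # bootstrap sample
        features = rng.sample(range(len(X[0])), n_sub)  # random feature subset
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], features))
    return forest

def predict(forest, row):
    """Majority vote over all stumps (1 = wild, 0 = cultivated in this sketch)."""
    votes = sum(left if row[f] <= t else right for f, t, left, right in forest)
    return int(2 * votes > len(forest))

def make_samples(rng, n, center, label, n_features=8):
    """Synthetic samples: Gaussian noise around a hypothetical class center."""
    return [([rng.gauss(center, 1.0) for _ in range(n_features)], label) for _ in range(n)]

rng = random.Random(42)
data = make_samples(rng, 30, 0.0, 0) + make_samples(rng, 30, 6.0, 1)
rng.shuffle(data)
train, test = data[:40], data[40:]

forest = fit_forest([x for x, _ in train], [label for _, label in train])
accuracy = sum(predict(forest, x) == label for x, label in test) / len(test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Because the synthetic classes are widely separated, high held-out accuracy here says nothing about real samples. In practice a full-depth implementation such as scikit-learn's `RandomForestClassifier` would be the more natural choice.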

## Abstract

American ginseng (Panax quinquefolius L.) is a rare and valuable plant utilized for medicinal and culinary purposes, with its geographic origin and cultivation significantly affecting its quality and efficacy. However, the metabolic differences between cultivated and wild American ginseng are not well understood. An accurate and reliable method for tracing the origin and evaluating the quality of American ginseng is therefore urgently required. This study introduces a UHPLC-Q/TOF-MS-based comparative metabolomics and machine learning strategy for the rapid identification of wild and cultivated American ginseng. Both principal component analysis and hierarchical cluster analysis revealed distinct metabolic phenotypes between wild and cultivated American ginseng. Furthermore, the integration of univariate and multivariate statistical analyses identified eight differential metabolites in the ESI+ mode and three in the ESI- mode, including seven ginsenosides. A potential ginsenoside marker panel was used to construct five machine learning models to assist in diagnosing the metabolic phenotypes of American ginseng. The Random Forest model, based on the eight differential metabolites in the ESI+ mode, achieved a 100% classification rate in both test and validation sets for distinguishing between wild and cultivated American ginseng. This study highlights the feasibility and application of our artificial intelligence-driven comparative metabolomics strategy for cultivar identification and geographic tracing of American ginseng, offering new insights into the molecular basis of metabolic variation in cultivated American ginseng.
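
The abstract does not say which univariate tests or cutoffs were used to select differential metabolites. As a hedged illustration of what such a screen can look like, the sketch below flags features by log2 fold change and Welch's t statistic. The feature names, intensities, and both cutoffs (`FC_CUTOFF`, `T_CUTOFF`) are hypothetical and not taken from the paper.

```python
import math
import statistics

def log2_fold_change(wild, cultivated):
    """log2 of the ratio of group means (wild over cultivated)."""
    return math.log2(statistics.mean(wild) / statistics.mean(cultivated))

def welch_t(a, b):
    """Welch's t statistic for two groups with possibly unequal variances."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va / len(a) + vb / len(b))

# Hypothetical peak intensities (arbitrary units) for two made-up features.
intensities = {
    "feature_A": {"wild": [820.0, 790.0, 845.0, 805.0],
                  "cultivated": [210.0, 190.0, 205.0, 195.0]},
    "feature_B": {"wild": [500.0, 520.0, 480.0, 510.0],
                  "cultivated": [505.0, 490.0, 515.0, 495.0]},
}

FC_CUTOFF = 1.0  # assumed: at least a two-fold change in either direction
T_CUTOFF = 4.0   # assumed: arbitrary illustrative threshold on |t|

flagged = []
for name, groups in intensities.items():
    fc = log2_fold_change(groups["wild"], groups["cultivated"])
    t = welch_t(groups["wild"], groups["cultivated"])
    if abs(fc) >= FC_CUTOFF and abs(t) >= T_CUTOFF:
        flagged.append(name)

print(flagged)
```

A real screen would typically convert the t statistic to a p-value, correct for multiple testing, and combine the result with a multivariate criterion, none of which is shown here.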

## Linked entities

- **Chemicals:** ginsenosides (PubChem CID 3086007)
- **Species:** Panax quinquefolius (taxon 44588)

## Full-text entities

- **Species:** Panax quinquefolius (American ginseng, species) [taxon 44588]

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12027468/full.md

---
Source: https://tomesphere.com/paper/PMC12027468