# XuanHuGPT: parameter-efficient fine-tuning of large language model in the field of traditional Chinese medicine

**Authors:** Xuming Tong, Xiaozheng Ding, Huiru Jia, Yanhong Yuan, Liyan Liu, Yapeng Wang, Zhang Xiong, Xu Yang, Sio Kei Im, Mini Han Wang

PMC · DOI: 10.1186/s13020-025-01200-3 · 2025-11-26

## TL;DR

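XuanHuGPT is a Traditional Chinese Medicine (TCM) question-answering LLM, fine-tuned with parameter-efficient methods on XhTCM, a new 100,000-entry dataset. It outperforms general-purpose LLMs and some existing TCM-specific models on automatic metrics and expert assessment.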

## Contribution

A parameter-efficient fine-tuning (PEFT) approach for building TCM-specific LLMs, together with the structured XhTCM dataset and an evaluation framework that combines quantitative metrics with expert qualitative assessment.

## Key findings

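- XuanHuGPT outperforms general-purpose LLMs and some existing TCM-specific models in accuracy, coverage, fluency, consistency, sensitivity, and safety.
- The XhTCM dataset integrates three sources (ShenNong_TCM_Dataset, TCMBank, and TCMIP v2.0) into 100,000 structured entries covering classical theories, prescription formulations, herbal pharmacology, and modern clinical practices.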
- PEFT techniques effectively balance performance and training costs for domain-specific models.
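
To make the cost side of that trade-off concrete, here is a minimal sketch that counts trainable parameters for one weight matrix under a low-rank adapter. It assumes a LoRA-style update, which this summary does not confirm as the paper's method; the layer size and rank are hypothetical values chosen only for illustration.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Parameters updated when the whole d_out x d_in weight matrix is trained."""
    return d_in * d_out


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in a low-rank update B @ A, with A: rank x d_in and B: d_out x rank."""
    return rank * (d_in + d_out)


# Hypothetical values, not taken from the paper.
d_in = d_out = 4096
rank = 8

lora = lora_trainable_params(d_in, d_out, rank)
full = full_params(d_in, d_out)
ratio = lora / full

print(f"{lora} trainable vs {full} full ({ratio:.2%} of the matrix)")
```

Under these assumed values the adapter trains well under 1% of the matrix's parameters, which is the kind of saving PEFT methods aim for.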

## Abstract

Large Language Models (LLMs) have demonstrated exceptional generalization capabilities across various fields, including their application in Traditional Chinese Medicine (TCM). However, the performance of existing LLMs in TCM-specific tasks remains limited due to the lack of optimization for TCM knowledge during the pre-training phase, insufficient datasets, and the constraints of fine-tuning techniques. To address these challenges, this study constructs the XhTCM dataset by systematically integrating data from three authoritative sources—ShenNong_TCM_Dataset, TCMBank, and TCMIP v2.0. The dataset includes 100,000 structured entries, covering classical theories, prescription formulations, herbal pharmacology, and modern clinical practices. Based on this, we present XuanHuGPT, a domain-specific LLM tailored for TCM question answering and inference. By applying Parameter-Efficient Fine-Tuning (PEFT) techniques, we effectively balance model performance and training costs. Furthermore, we establish a comprehensive evaluation framework for TCM LLMs, combining quantitative metrics (BLEU, ROUGE, METEOR, BERTScore, and Embedding Distance) with expert qualitative assessments. Experimental results show that XuanHuGPT significantly outperforms both general-purpose LLMs and some existing TCM-specific models in accuracy, coverage, fluency, consistency, sensitivity, and safety. This study presents a reproducible paradigm for building intelligent TCM Q&A systems, contributing to the digital transformation, intelligent development, and global dissemination of TCM knowledge.
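
As a rough illustration of two of the overlap metrics named above, the sketch below computes a BLEU-style clipped n-gram precision and a ROUGE-L F1 score from a longest common subsequence. It is not the paper's evaluation code: it omits BLEU's brevity penalty and geometric mean, assumes whitespace-tokenized input (Chinese text would need a real tokenizer), and uses made-up example sentences.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, reference, n):
    """BLEU-style modified n-gram precision against a single reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return overlap / total


def lcs_length(a, b):
    """Length of the longest common subsequence, via row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L F1: harmonic mean of LCS-based precision and recall."""
    lcs = lcs_length(candidate, reference)
    if lcs == 0:
        return 0.0
    precision = lcs / len(candidate)
    recall = lcs / len(reference)
    return 2 * precision * recall / (precision + recall)


# Made-up sentences for illustration only.
reference = "the formula is taken twice daily after meals".split()
candidate = "the formula is taken daily after meals".split()

print(clipped_precision(candidate, reference, 1))
print(clipped_precision(candidate, reference, 2))
print(rouge_l_f1(candidate, reference))
```

Surface-overlap scores like these reward shared wording rather than clinical correctness, which is one reason the framework pairs them with embedding-based metrics and expert review.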

## Full-text entities

- **Diseases:** dyspepsia (MESH:D004415), gouty nephropathy (MESH:C537696), Kidney Qi deficiency (MESH:D007680), vomiting (MESH:D014839), abdominal (MESH:D000007), hallucinations (MESH:D006212), asthma (MESH:D001249), dizziness (MESH:D004244), COVID-19 (MESH:D000086382), wheezing (MESH:D012135), Cold (MESH:D000067390), nausea (MESH:D009325), phlegm cough (MESH:D003371), spleen deficiency (MESH:D013160)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12648848/full.md

---
Source: https://tomesphere.com/paper/PMC12648848