# GastroTCM: a large language model assistant for gastroenterology in traditional Chinese medicine

**Authors:** Lan Wang, Kaiqiang Tang, Zhichuan Yang, Yan Wang, Peng Zhang, Bowen Wu, Weibo Zhao, Jun Chen, Jiamin Gong, Shiyu Du, Shao Li

PMC · DOI: 10.1186/s13020-025-01295-8 · 2026-01-22

## TL;DR

GastroTCM is a specialized AI assistant for Traditional Chinese Medicine gastroenterology that improves accuracy and reduces errors through retrieval-augmented generation and multi-turn dialogue.

## Contribution

GastroTCM introduces a retrieval-augmented LLM with multi-turn dialogue and agent-based reasoning for TCM gastroenterology.

## Key findings

- GastroTCM outperformed baselines in single-turn and multi-turn dialogue evaluations.
- Expert review confirmed higher diagnostic accuracy and safety with fewer hallucinations.
- The RAG module significantly reduced unsupported statements in clinical responses.

## Abstract

Large language models (LLMs) show promise for supporting Traditional Chinese Medicine (TCM) practice, but their clinical utility is limited by domain-specific knowledge gaps, hallucinations, and weak multi-turn reasoning. We present GastroTCM, a specialised LLM assistant for TCM gastroenterology that we built by fine-tuning a Llama3-8B model and augmenting it with a Retrieval-Augmented Generation (RAG) and an agent framework. GastroTCM targets key shortcomings in current TCM diagnostic support through three components: (1) a dedicated TCM gastroenterology vector database for efficient retrieval of high-value, peer-reviewed knowledge; (2) ShareGPT-style multi-turn dialogue optimisation to preserve clinical context across rounds; and (3) an intelligent agent that dynamically adapts its responses to evolving symptom profiles and user intent.

GastroTCM was trained on approximately 20 million tokens of de-identified clinical records, guideline-based content, and expert-curated TCM question–answer pairs and evaluated against strong Chinese LLM baselines (ChatGLM-6B, Qwen-2). In automatic evaluations, GastroTCM outperformed all baselines in single-turn dialogue (BLEU: 0.334 vs. 0.172–0.246) and multi-turn consultations, where it achieved a substantially higher rate of proactive, clinically appropriate interactions (27/60 vs. ≤ 2/60 cases). Expert review by TCM gastroenterologists further confirmed higher diagnostic accuracy and safety, with the RAG module markedly reducing unsupported or hallucinated statements. These findings suggest that domain-specific, retrieval-enhanced LLMs can meaningfully augment—rather than replace—TCM practitioners in gastroenterology, with the potential to improve access to high-quality, explainable decision support in real-world settings.

## Full-text entities

- **Diseases:** IBD (MESH:D015212), head and neck cancer (MESH:D006258), food intolerance (MESH:D000073923), gastric cancer (MESH:D013274), cancer (MESH:D009369), hallucinations (MESH:D006212), digestive system disorders (MESH:D004066), TCM (MESH:C562377), IBS (MESH:D053560), vomiting (MESH:D014839), LLM (MESH:D007806), indigestion (MESH:D004415), chronic gastritis (MESH:D005756), diarrhea (MESH:D003967), inflammation (MESH:D007249), bloating (MESH:C535647), pain (MESH:D010146), and stomach weakness (MESH:D013272), constipation (MESH:D003248), abdominal pain (MESH:D015746), Comorbid Diseases (MESH:D004194), GERD (MESH:D005764), liver diseases (MESH:D008107), CAG (MESH:D005757), LoRA (MESH:D018489), diabetes (MESH:D003920), gastrointestinal diseases (MESH:D005767)
- **Chemicals:** RAG (-)
- **Species:** Helicobacter pylori (species) [taxon 210], Homo sapiens (human, species) [taxon 9606]

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12825222/full.md

---
Source: https://tomesphere.com/paper/PMC12825222