Reinforcement learning for LLM-based explainable TCM prescription recommendation with implicit preferences from small language models

Xinyu Wang; Xiaohe Sun; Lei Yang; Yitong Zhang; Tao Yang; Jiadong Xie; Kongfa Hu

PMC · DOI: 10.1186/s13020-025-01250-7 · November 19, 2025


Open Access

TL;DR

This paper introduces a two-stage framework using reinforcement learning and knowledge distillation to improve the accuracy and explainability of Traditional Chinese Medicine prescription recommendations.

Contribution

The paper contributes a two-stage training framework that combines knowledge distillation with implicit preference-driven reinforcement learning for explainable TCM prescription recommendation.

Findings

01 The model achieves a P@30 of 35.62% and F1@30 of 37.36%, outperforming existing baselines.

02 Knowledge distillation improves generalization and explainability, while reinforcement learning enhances F1@30 by 2.01%.
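The P@30 and F1@30 figures above are top-k ranking metrics. As a rough illustration only, the sketch below computes precision, recall, and F1 at a cutoff k for a single prescription. The function name, the herb names, and the choice to divide precision by k are assumptions made for this example; the paper may define or average these metrics differently.

```python
from typing import Sequence, Set, Tuple


def precision_recall_f1_at_k(
    ranked_herbs: Sequence[str], true_herbs: Set[str], k: int
) -> Tuple[float, float, float]:
    """Top-k metrics for one case (hypothetical helper, not the paper's code).

    ranked_herbs: predicted herbs, best first, assumed to contain no duplicates.
    true_herbs:   herbs in the reference prescription.
    """
    top_k = ranked_herbs[:k]
    hits = sum(1 for herb in top_k if herb in true_herbs)
    # Assumption: precision divides by k even if fewer than k herbs were predicted.
    precision = hits / k if k > 0 else 0.0
    recall = hits / len(true_herbs) if true_herbs else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


# Toy example with made-up herb names: 2 of the top 4 predictions are correct.
predicted = ["gancao", "fuling", "baizhu", "chenpi"]
reference = {"gancao", "baizhu", "renshen"}
p, r, f1 = precision_recall_f1_at_k(predicted, reference, k=4)
print(f"P@4={p:.4f} R@4={r:.4f} F1@4={f1:.4f}")
```

In a full evaluation these per-case values would typically be averaged over the test set, with k set to 30 to match the reported metrics.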

Abstract

To improve both the interpretability and accuracy of prescription recommendations in Traditional Chinese Medicine (TCM), this study proposes a two-stage training framework that integrates knowledge distillation from a teacher model with implicit preference-driven reinforcement learning grounded in a compact model. First, GPT-4o is used to parse structured TCM clinical records, creating high-quality distillation samples. These samples guide Low-Rank Adaptation (LoRA)-based fine-tuning of the Qwen2.5-7B model, enabling it to generate explainable outputs in three parts: symptom analysis, prescription recommendation, and prescription explanation. Then, a lightweight BART (Bidirectional and Auto-Regressive Transformers) model is trained to learn the mapping between symptoms and prescriptions. Its outputs are compared with those of the large model to…
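The abstract names LoRA as the fine-tuning method for Qwen2.5-7B. As a minimal sketch of the general LoRA idea (not the authors' training code), the example below applies a frozen weight matrix plus a scaled low-rank update to a batch of inputs. The function name, matrix shapes, rank, and scaling value are all assumptions chosen for illustration.

```python
import numpy as np


def lora_forward(x, W, A, B, alpha):
    """Linear layer output with a LoRA update (illustrative sketch).

    x: (batch, d_in) inputs
    W: (d_out, d_in) frozen pretrained weight
    A: (r, d_in) trainable down-projection
    B: (d_out, r) trainable up-projection, usually initialized to zeros
    alpha: scaling hyperparameter; the update is scaled by alpha / r
    """
    r = A.shape[0]
    base = x @ W.T                  # frozen path
    update = (x @ A.T) @ B.T        # low-rank path, never forms a d_out x d_in matrix
    return base + (alpha / r) * update


# Toy sizes (assumed): only A and B would be trained, W stays fixed.
rng = np.random.default_rng(0)
d_in, d_out, r = 8, 6, 2
x = rng.normal(size=(3, d_in))
W = rng.normal(size=(d_out, d_in))
A = rng.normal(size=(r, d_in))
B = np.zeros((d_out, r))            # zero init: the layer starts identical to the base model
y = lora_forward(x, W, A, B, alpha=4.0)
print(y.shape)
```

Because only A and B are trained, the number of trainable parameters per layer is r × (d_in + d_out) rather than d_in × d_out, which is what makes adapting a 7B-parameter model practical.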


Taxonomy

Topics: Topic Modeling · Machine Learning in Healthcare · Artificial Intelligence in Healthcare and Education