Reinforcement learning for LLM-based explainable TCM prescription recommendation with implicit preferences from small language models
Xinyu Wang, Xiaohe Sun, Lei Yang, Yitong Zhang, Tao Yang, Jiadong Xie, Kongfa Hu

TL;DR
This paper introduces a two-stage framework using reinforcement learning and knowledge distillation to improve the accuracy and explainability of Traditional Chinese Medicine prescription recommendations.
Contribution
A novel two-stage training framework combining knowledge distillation and implicit preference-driven reinforcement learning for explainable TCM prescription recommendation.
Findings
The model achieves a P@30 of 35.62% and F1@30 of 37.36%, outperforming existing baselines.
Knowledge distillation improves generalization and explainability, while reinforcement learning enhances F1@30 by 2.01%.
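The P@30 and F1@30 figures above are top-K set-overlap metrics. This summary does not give the paper's exact definitions, so the sketch below assumes the common formulation: precision is hits divided by K, recall is hits divided by the size of the reference prescription, and F1 is their harmonic mean. The function name and the herb identifiers are hypothetical placeholders.

```python
from typing import Sequence, Set, Tuple


def precision_recall_f1_at_k(
    ranked_herbs: Sequence[str], true_herbs: Set[str], k: int
) -> Tuple[float, float, float]:
    """Return (P@K, R@K, F1@K) for one case.

    Assumed definitions (not confirmed by the source summary):
      P@K  = |top-K predicted ∩ reference| / K
      R@K  = |top-K predicted ∩ reference| / |reference|
      F1@K = harmonic mean of P@K and R@K
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")

    # Keep only the first k predictions; duplicates are collapsed by the set.
    top_k = set(ranked_herbs[:k])
    hits = len(top_k & true_herbs)

    precision = hits / k
    recall = hits / len(true_herbs) if true_herbs else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


# Hypothetical example: 4 ranked predictions, 3 herbs in the reference prescription.
predicted = ["herb_a", "herb_b", "herb_c", "herb_d"]
reference = {"herb_a", "herb_c", "herb_e"}
p, r, f1 = precision_recall_f1_at_k(predicted, reference, k=4)
```

Under these definitions F1@K can exceed P@K whenever the reference prescription contains fewer than K herbs, which is consistent with the reported F1@30 being higher than P@30.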
Abstract
To improve both the interpretability and accuracy of prescription recommendations in Traditional Chinese Medicine (TCM), this study proposes a two-stage training framework that integrates knowledge distillation from a teacher model with implicit preference-driven reinforcement learning grounded in a compact model. First, GPT-4o is used to parse structured TCM clinical records and create high-quality distillation samples. These samples guide Low-Rank Adaptation (LoRA)-based fine-tuning of the Qwen2.5-7B model, enabling it to generate explainable outputs in three parts: symptom analysis, prescription recommendation, and prescription explanation. Next, a lightweight BART (Bidirectional and Auto-Regressive Transformers) model is trained to learn the mapping between symptoms and prescriptions. Its outputs are compared with those of the large model to…
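For context on the LoRA step named in the abstract, the general LoRA formulation is shown here. This is the standard method, not a detail taken from this paper; the rank $r$ and scaling factor $\alpha$ used by the authors are not stated in this summary.

```latex
% Standard LoRA update for a frozen pretrained weight W_0 (d x k):
% only the low-rank factors B and A are trained.
W = W_0 + \frac{\alpha}{r}\, B A,
\qquad B \in \mathbb{R}^{d \times r},\quad
A \in \mathbb{R}^{r \times k},\quad
r \ll \min(d, k)
```

Because only $A$ and $B$ are updated, the number of trainable parameters per adapted matrix drops from $dk$ to $r(d + k)$.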
Taxonomy
Topics: Topic Modeling · Machine Learning in Healthcare · Artificial Intelligence in Healthcare and Education
