DoctorGLM: Fine-tuning your Chinese Doctor is not a Herculean Task

Honglin Xiong; Sheng Wang; Yitao Zhu; Zihao Zhao; Yuxiao Liu; Linlin Huang; Qian Wang; Dinggang Shen

arXiv:2304.01097·cs.CL·April 18, 2023·71 cites

TL;DR

DoctorGLM is a Chinese medical dialogue model fine-tuned from ChatGLM-6B. Its quick, low-cost training process is intended to make healthcare-specific LLMs more accessible.

Contribution

This work shows that a large Chinese language model can be fine-tuned for medical dialogue on a single A100 80G GPU, which makes developing a healthcare LLM more feasible for hospitals.

Findings

01. Fine-tuned ChatGLM-6B on Chinese medical dialogues in 13 hours on a single A100 80G GPU

02. Showed that training a healthcare-specific LLM can be affordable

03. Shared an early-stage model for community feedback

Abstract

The recent progress of large language models (LLMs), including ChatGPT and GPT-4, in comprehending and responding to human instructions has been remarkable. Nevertheless, these models typically perform better in English and have not been explicitly trained for the medical domain, resulting in suboptimal precision in diagnoses, drug recommendations, and other medical advice. Additionally, training and deploying a dialogue model is still believed to be impossible for hospitals, hindering the promotion of LLMs. To tackle these challenges, we have collected databases of medical dialogues in Chinese with ChatGPT's help and adopted several techniques to train an easy-deploy LLM. Remarkably, we were able to fine-tune the ChatGLM-6B on a single A100 80G in 13 hours, which means having a healthcare-purpose LLM can be very affordable. DoctorGLM is currently an early-stage engineering attempt and…

Code & Models

Repositories

xionghonglin/doctorglm (PyTorch, official)

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare · Topic Modeling

Methods: Multi-Head Attention · Attention Is All You Need · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Softmax · Linear Layer · Byte Pair Encoding · Layer Normalization · Residual Connection · Dense Connections