HuatuoGPT-II, One-stage Training for Medical Adaption of LLMs
Junying Chen, Xidong Wang, Ke Ji, Anningzhe Gao, Feng Jiang, Shunian Chen, Hongbo Zhang, Dingjie Song, Wenya Xie, Chuyi Kong, Jianquan Li, Xiang Wan, Haizhou Li, Benyou Wang

TL;DR
HuatuoGPT-II introduces a one-stage training protocol that transforms heterogeneous medical data into a unified format, achieving state-of-the-art results in Chinese medicine and surpassing proprietary models like ChatGPT and GPT-4 on specific benchmarks.
Contribution
The paper proposes a novel one-stage training method for domain adaptation of LLMs, simplifying data processing and improving performance in Chinese medicine applications.
Findings
HuatuoGPT-II outperforms ChatGPT and GPT-4 on some Chinese medicine benchmarks.
It achieves top results on the Chinese National Medical Licensing Examination.
Expert manual evaluations validate its effectiveness.
Abstract
Adapting a language model to a specific domain, a.k.a. "domain adaption", is a common practice when specialized knowledge, e.g. medicine, is not encapsulated in a general language model like Llama2. The challenge lies in the heterogeneity of data across the two training stages (pre-training and supervised fine-tuning), as it varies in languages, genres, or formats. To tackle this and simplify the learning protocol, we propose to transform heterogeneous data, from both the pre-training and supervised stages, into a unified, simple input-output pair format. We validate the new protocol in domains where proprietary LLMs like ChatGPT perform relatively poorly, such as Traditional Chinese Medicine. The developed model, HuatuoGPT-II, has shown state-of-the-art performance in the Chinese medicine domain on a number of benchmarks, e.g. medical licensing exams. It even outperforms proprietary models like ChatGPT and GPT-4 in some…
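The unified input-output format described in the abstract can be pictured with a small sketch. This is an illustration under stated assumptions, not the authors' pipeline: the `Pair` type, the `question`/`answer` field names, and the `toy_rewrite` function are all hypothetical. In particular, `toy_rewrite` is a trivial stand-in for whatever process turns a raw passage into a question and an answer; the abstract does not say how that step is done.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Pair:
    """One training example in a unified input-output format (hypothetical type)."""
    input: str
    output: str


def from_sft(record: Dict[str, str]) -> Pair:
    # Supervised data already comes as a question and an answer.
    # The field names "question" and "answer" are assumptions.
    return Pair(input=record["question"].strip(), output=record["answer"].strip())


def toy_rewrite(passage: str) -> Tuple[str, str]:
    # Placeholder only: builds a question from the first sentence and
    # reuses the passage as the answer. A real system would need something
    # far more capable here.
    topic = passage.split(".")[0]
    return f"What should I know about the following: {topic}?", passage


def from_corpus(passage: str, rewrite: Callable[[str], Tuple[str, str]]) -> Pair:
    # Raw pre-training text has no question, so one has to be produced.
    question, answer = rewrite(passage)
    return Pair(input=question.strip(), output=answer.strip())


def unify(
    corpus: Iterable[str],
    sft: Iterable[Dict[str, str]],
    rewrite: Callable[[str], Tuple[str, str]] = toy_rewrite,
) -> List[Pair]:
    # Both sources end up in one list with one schema, so a single
    # training stage can consume them together.
    pairs = [from_corpus(p, rewrite) for p in corpus]
    pairs.extend(from_sft(r) for r in sft)
    return pairs


if __name__ == "__main__":
    examples = unify(
        ["Ginseng is a herb. It is used in TCM."],
        [{"question": "What is a common cold?", "answer": "A viral infection."}],
    )
    for ex in examples:
        print(ex)
```

The point of the sketch is the schema, not the rewriting: once every record is a `Pair`, the distinction between pre-training text and supervised dialogue no longer shows up in the training loop.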
Code & Models
- FreedomIntelligence/HuatuoGPT2-7B (model) · 509 downloads · 11 likes
- FreedomIntelligence/HuatuoGPT2-13B (model) · 11 downloads · 8 likes
- FreedomIntelligence/HuatuoGPT2-34B (model) · 6 downloads · 8 likes
- FreedomIntelligence/HuatuoGPT2-7B-4bits (model) · 25 downloads · 4 likes
- FreedomIntelligence/HuatuoGPT2-7B-8bits (model) · 12 downloads · 2 likes
- FreedomIntelligence/HuatuoGPT2-34B-8bits (model) · 3 downloads · 1 like
- FreedomIntelligence/HuatuoGPT2-34B-4bits (model) · 11 downloads · 3 likes
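The checkpoint names above follow a regular pattern (model size, then an optional `-4bits` or `-8bits` suffix), and not every combination is listed: there is no quantized 13B entry. A minimal sketch of resolving a repository ID from that pattern follows. The `repo_id` helper and the `LISTED` table are hypothetical conveniences built only from the list above; they are not part of any library.

```python
from typing import Optional

ORG = "FreedomIntelligence"

# (size, bits) combinations that appear in the list above.
# bits=None denotes the base checkpoint without a "-Nbits" suffix.
LISTED = {
    ("7B", None), ("13B", None), ("34B", None),
    ("7B", 4), ("7B", 8),
    ("34B", 4), ("34B", 8),
}


def repo_id(size: str, bits: Optional[int] = None) -> str:
    """Return the repository ID for a listed checkpoint, or raise ValueError."""
    if (size, bits) not in LISTED:
        raise ValueError(f"no listed checkpoint for size={size!r}, bits={bits!r}")
    suffix = "" if bits is None else f"-{bits}bits"
    return f"{ORG}/HuatuoGPT2-{size}{suffix}"


if __name__ == "__main__":
    print(repo_id("7B"))
    print(repo_id("34B", bits=4))
```

The resulting string is the kind of identifier that model-loading tools for the Hugging Face Hub accept; the exact loading arguments each checkpoint needs are documented on its model card and are not assumed here.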
Taxonomy
Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Machine Learning in Healthcare
Methods: Multi-Head Attention · Attention Is All You Need · Adam · Softmax · Dense Connections · Linear Layer · Position-Wise Feed-Forward Layer · Label Smoothing · Absolute Position Encodings · Residual Connection
