Towards Building Multilingual Language Model for Medicine
Pengcheng Qiu, Chaoyi Wu, Xiaoman Zhang, Weixiong Lin, Haicheng Wang, Ya Zhang, Yanfeng Wang, Weidi Xie

TL;DR
This paper introduces a large-scale multilingual medical corpus, a benchmark, and a new model that significantly advances multilingual medical language understanding, rivaling even GPT-4 in performance.
Contribution
The paper presents a comprehensive multilingual medical corpus, an evaluation benchmark, and a new 8B-parameter model that outperforms existing open-source models on medical NLP tasks.
Findings
The MMed-Llama 3 model outperforms other open-source models on the benchmark.
The large-scale corpus enables effective domain adaptation for multilingual medical LLMs.
The model rivals GPT-4 in performance despite having fewer parameters.
Abstract
The development of open-source, multilingual medical language models can benefit a wide, linguistically diverse audience from different regions. To advance this area, we make the following contributions: First, we construct a multilingual medical corpus, termed MMedC, containing approximately 25.5B tokens across 6 main languages, enabling auto-regressive domain adaptation for general LLMs; Second, to monitor the development of multilingual medical LLMs, we propose a multilingual medical multi-choice question-answering benchmark with rationale, termed MMedBench; Third, we have assessed a number of open-source large language models (LLMs) on our benchmark, along with those further trained auto-regressively on MMedC. Our final model, MMed-Llama 3, with only 8B parameters, achieves superior performance compared to all other open-source models on both MMedBench and…
Code & Models
- 🤗 Henrychur/MMedLM2 · model · 28 dl · ♡ 12
- 🤗 Henrychur/MMedLM · model · 13 dl · ♡ 6
- 🤗 Henrychur/MMedLM2-1_8B · model · 3 dl · ♡ 2
- 🤗 Henrychur/MMed-Llama-3-8B · model · 1.7k dl · ♡ 32
- 🤗 Henrychur/MMed-Llama-3-8B-EnIns · model · 226 dl · ♡ 5
- 🤗 Hazy2028/pytorch_model-00001-of-00003.bin · model · 1 dl
- 🤗 longisland3/MMed-Llama-3-8B-gguf · model · 3 dl · ♡ 1
- 🤗 RichardErkhov/Henrychur_-_MMed-Llama-3-8B-gguf · model · 4 dl
- 🤗 RichardErkhov/Henrychur_-_MMed-Llama-3-8B-EnIns-gguf · model · 30 dl
- 🤗 QuantFactory/MMed-Llama-3-8B-GGUF · model · 104 dl · ♡ 3
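The repositories listed above are hosted on the Hugging Face Hub, so one plausible way to try a checkpoint is through the `transformers` library. The following is a minimal sketch, not the authors' evaluation code: the prompt template in `build_prompt`, the helper names, and the generation settings are all assumptions made for illustration. Whether a given checkpoint expects a particular prompt format is not stated on this page.

```python
from typing import Dict

# Repo id copied from the list above; any other listed non-GGUF repo could be tried instead.
REPO_ID = "Henrychur/MMed-Llama-3-8B"


def build_prompt(question: str, options: Dict[str, str]) -> str:
    """Format a multiple-choice question as plain text (hypothetical template)."""
    lines = [f"Question: {question}", "Options:"]
    lines += [f"{key}. {text}" for key, text in sorted(options.items())]
    lines.append("Answer:")
    return "\n".join(lines)


def load_model(repo_id: str = REPO_ID):
    """Load tokenizer and weights; needs `transformers`, `torch`, and network access."""
    # Imported lazily so build_prompt stays usable without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)
    return tokenizer, model


def answer(prompt: str, tokenizer, model, max_new_tokens: int = 32) -> str:
    """Greedy-decode a continuation and return only the newly generated text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Example usage (downloads several GB of weights, so it is left commented out):
# tokenizer, model = load_model()
# prompt = build_prompt(
#     "Which vitamin deficiency causes scurvy?",
#     {"A": "Vitamin A", "B": "Vitamin C", "C": "Vitamin D", "D": "Vitamin K"},
# )
# print(answer(prompt, tokenizer, model))
```

The GGUF repositories in the list are quantized conversions intended for llama.cpp-style runtimes and would need a different loader than the one sketched here.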
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Topic Modeling
Methods: Attention Is All You Need · Linear Layer · Dense Connections · Label Smoothing · Adam · Softmax · Multi-Head Attention · Layer Normalization · Residual Connection · Absolute Position Encodings
