Baichuan-M1: Pushing the Medical Capability of Large Language Models

Bingning Wang; Haizhou Zhao; Huozhi Zhou; Liang Song; Mingyu Xu; Wei Cheng; Xiangrong Zeng; Yupeng Zhang; Yuqi Huo; Zecheng Wang; Zhengyun Zhao; Da Pan; Fei Kou; Fei Li; Fuzhong Chen; Guosheng Dong; Han Liu; Hongda Zhang; Jin He; Jinjie Yang; Kangxi Wu; Kegeng Wu; Lei Su; Linlin Niu; Linzhuang Sun; Mang Wang; Pengcheng Fan; Qianli Shen; Rihui Xin; Shunya Dang; Songchi Zhou; Weipeng Chen; Wenjing Luo; Xin Chen; Xin Men; Xionghai Lin; Xuezhen Dong; Yan Zhang; Yifei Duan; Yuyan Zhou; Zhi Ma; Zhiying Wu

arXiv:2502.12671 · cs.CL · March 6, 2025 · 5 cites

TL;DR

Baichuan-M1 is a series of medical large language models trained from scratch, rather than adapted from a general base model, with a dedicated focus on medical capability. It performs strongly on both general and medical tasks and is open-sourced for broader use.

Contribution

We developed Baichuan-M1, a series of medical-domain LLMs trained from scratch on 20 trillion tokens. Medical capability is built in during pretraining itself, rather than added afterward through continued pretraining or post-training of a general base model.

Findings

01. Performs well on general tasks such as mathematics and coding

02. Excels in specialized medical fields

03. Open-sourced model for community use

Abstract

The current generation of large language models (LLMs) is typically designed for broad, general-purpose applications, while domain-specific LLMs, especially in vertical fields like medicine, remain relatively scarce. In particular, the development of highly efficient and practical LLMs for the medical domain is challenging due to the complexity of medical knowledge and the limited availability of high-quality data. To bridge this gap, we introduce Baichuan-M1, a series of large language models specifically optimized for medical applications. Unlike traditional approaches that simply continue pretraining on existing models or apply post-training to a general base model, Baichuan-M1 is trained from scratch with a dedicated focus on enhancing medical capabilities. Our model is trained on 20 trillion tokens and incorporates a range of effective training methods that strike a balance between…

Taxonomy

Topics: Radiomics and Machine Learning in Medical Imaging

Methods: Focus · Balanced Selection