A Survey on Large Language Models from General Purpose to Medical Applications: Datasets, Methodologies, and Evaluations
Jinqiang Wang, Huansheng Ning, Yi Peng, Qikai Wei, Daniel Tesfai, Wenwei Mao, Tao Zhu, Runhe Huang

TL;DR
This survey reviews the development, training, and evaluation of medical large language models based on general-purpose LLMs, highlighting datasets, methodologies, challenges, and future research directions in medical applications.
Contribution
It provides a comprehensive, fine-grained overview of training medical LLMs from open-source general models, including dataset construction, training paradigms, and evaluation benchmarks.
Findings
Medical LLMs excel in doctor-patient dialogues and diagnosis.
Fine-tuning open-source models reduces computational costs and enhances privacy.
The survey identifies key challenges and future research directions in medical LLM development.
Abstract
Large Language Models (LLMs) have demonstrated surprising performance across various natural language processing tasks. Recently, medical LLMs enhanced with domain-specific knowledge have exhibited excellent capabilities in medical consultation and diagnosis. These models can smoothly simulate doctor-patient dialogues and provide professional medical advice. Most medical LLMs are developed through continued training of open-source general LLMs, which requires significantly fewer computational resources than training LLMs from scratch. Additionally, this approach offers better patient privacy protection than API-based solutions. Given these advantages, this survey systematically summarizes how to train medical LLMs based on open-source general LLMs from a more fine-grained perspective. It covers (a) how to acquire training corpora and construct customized medical training sets, (b) how…
Taxonomy
Topics: Topic Modeling · Machine Learning in Healthcare
