Exploring Large Language Models in Healthcare: Insights into Corpora Sources, Customization Strategies, and Evaluation Metrics

Shuqi Yang; Mingrui Jing; Shuai Wang; Jiaxin Kou; Manfei Shi; Weijie Xing; Yan Hu; Zheng Zhu

arXiv:2502.11861 · cs.CL · February 18, 2025

Open Access

TL;DR

This paper reviews how large language models are used in healthcare, analyzing their data sources, customization methods, and evaluation metrics, highlighting gaps in fairness, evidence integration, and validation for real-world deployment.

Contribution

It provides a comprehensive overview of healthcare-specific LLM training corpora, customization strategies, and evaluation metrics, identifying key gaps and proposing future research directions.

Findings

01

Four types of corpora are used: clinical resources, literature, open-source datasets, and web-crawled data.

02

Common techniques include pre-training, prompt engineering, and retrieval-augmented generation.

03

The review identifies gaps in corpus fairness and a lack of standardized evaluation frameworks.

Abstract

This study reviewed the use of Large Language Models (LLMs) in healthcare, focusing on their training corpora, customization techniques, and evaluation metrics. A systematic search of studies from 2021 to 2024 identified 61 articles. Four types of corpora were used: clinical resources, literature, open-source datasets, and web-crawled data. Common construction techniques included pre-training, prompt engineering, and retrieval-augmented generation, with 44 studies combining multiple methods. Evaluation metrics were categorized into process, usability, and outcome metrics, with outcome metrics divided into model-based and expert-assessed outcomes. The study identified critical gaps in corpus fairness, which contributed to biases from geographic, cultural, and socio-economic factors. The reliance on unverified or unstructured data highlighted the need for better integration of…

Taxonomy

Topics: Topic Modeling · Computational and Text Analysis Methods
