Transfer Learning with Clinical Concept Embeddings from Large Language Models
Yuhe Gao, Runxue Bao, Yuelyu Ji, Yiming Sun, Chenxi Song, Jeffrey P. Ferraro, Ye Ye

TL;DR
This paper investigates how domain-specific large language models improve transfer learning in healthcare by analyzing clinical concept embeddings derived from electronic health records at two large healthcare systems.
Contribution
It demonstrates that domain-specific LLMs like Med-BERT outperform generic models in transfer learning for clinical concepts, emphasizing the importance of model tuning and domain specificity.
Findings
Med-BERT outperforms generic models in local and transfer tasks
Fine-tuning generic models improves their performance
Excessive tuning can reduce the effectiveness of biomedical embeddings
Abstract
Knowledge sharing is crucial in healthcare, especially when leveraging data from multiple clinical sites to address data scarcity, reduce costs, and enable timely interventions. Transfer learning can facilitate cross-site knowledge transfer, but a major challenge is heterogeneity in clinical concepts across different sites. Large Language Models (LLMs) show significant potential for capturing the semantic meaning of clinical concepts and reducing heterogeneity. This study analyzed electronic health records from two large healthcare systems to assess the impact of semantic embeddings from LLMs on local, shared, and transfer learning models. Results indicate that domain-specific LLMs, such as Med-BERT, consistently outperform generic models in local and direct transfer scenarios, while generic models like OpenAI embeddings require fine-tuning for optimal performance. However, excessive tuning of models…
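To make the idea of reducing cross-site heterogeneity with semantic embeddings concrete, here is a minimal sketch. It is not the paper's method: the concept codes, the two-dimensional embedding values, the labels, and the nearest-centroid classifier are all hypothetical stand-ins chosen for illustration. In practice the vectors would come from an LLM, and the downstream model would be whatever the study actually trained.

```python
import numpy as np

# Hypothetical toy embeddings (made-up values). The assumption being
# illustrated: semantically similar concepts from different sites land
# close together in embedding space even though their codes differ.
EMBEDDINGS = {
    "siteA:FEVER": np.array([1.0, 0.0]),
    "siteB:PYREXIA": np.array([0.9, 0.1]),      # near siteA:FEVER
    "siteA:FRACTURE": np.array([0.0, 1.0]),
    "siteB:BROKEN_BONE": np.array([0.1, 0.9]),  # near siteA:FRACTURE
}

def embed_patient(codes):
    """Represent a patient record as the mean of its concept embeddings."""
    return np.mean([EMBEDDINGS[c] for c in codes], axis=0)

def fit_centroids(X, y):
    """Fit a nearest-centroid classifier: one mean vector per label."""
    return {int(label): X[y == label].mean(axis=0) for label in np.unique(y)}

def predict(centroids, x):
    """Return the label whose centroid is closest to x (Euclidean distance)."""
    return min(centroids, key=lambda label: np.linalg.norm(x - centroids[label]))

# Train on the source site only (hypothetical labels: 1 and 0).
source_patients = [["siteA:FEVER"], ["siteA:FRACTURE"]]
source_labels = np.array([1, 0])
X_src = np.stack([embed_patient(p) for p in source_patients])
centroids = fit_centroids(X_src, source_labels)

# Direct transfer: apply the source model to target-site records that
# use a different vocabulary, without any retraining.
target_patients = [["siteB:PYREXIA"], ["siteB:BROKEN_BONE"]]
target_pred = [predict(centroids, embed_patient(p)) for p in target_patients]
print(target_pred)
```

Because the classifier only ever sees embedding vectors, a target-site code that never appeared in the source data can still be scored, provided its embedding lies near a related source concept. That is the property the abstract attributes to LLM-derived semantic embeddings.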
Taxonomy
Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Text and Document Classification Technologies
