Developing Healthcare Language Model Embedding Spaces

Niall Taylor; Dan Schofield; Andrey Kormilitzin; Dan W Joyce; Alejo Nevado-Holgado

arXiv:2403.19802 · cs.CL · April 1, 2024 · Artif. Intell. Medicine · 1 cites

Open Access

TL;DR

This paper investigates methods to adapt small pre-trained language models for healthcare text, demonstrating that contrastive learning enhances classification performance and embedding quality, with implications for resource-efficient, domain-specific medical NLP applications.

Contribution

The study introduces a contrastive pre-training approach and evaluates metadata-based objectives for healthcare LLM adaptation, providing guidelines for efficient domain-specific model development.

Findings

01. Contrastively trained models outperform other methods on classification tasks.

02. Domain-adapted LLMs surpass general base LLMs in healthcare tasks.

03. Metadata pre-training improves embedding cluster separability.

Abstract

Pre-trained Large Language Models (LLMs) often struggle on out-of-domain datasets such as healthcare-focused text. We explore specialized pre-training to adapt smaller LLMs to different healthcare datasets. Three methods are assessed: traditional masked language modeling, Deep Contrastive Learning for Unsupervised Textual Representations (DeCLUTR), and a novel pre-training objective utilizing metadata categories from the healthcare settings. These schemes are evaluated on downstream document classification tasks for each dataset, with additional analysis of the resultant embedding spaces. Contrastively trained models outperform other approaches on the classification tasks, delivering strong performance from limited labeled data and with fewer model parameter updates required. While metadata-based pre-training does not further improve classifications across the datasets, it yields…
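The contrastive objective mentioned in the abstract can be illustrated with a generic InfoNCE-style loss, the family of losses that DeCLUTR-like methods build on. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `info_nce_loss`, the temperature value of 0.05, and the random stand-in embeddings are all hypothetical, and a real setup would take embeddings of text spans from a language model encoder.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.05):
    """InfoNCE-style contrastive loss over a batch of embedding pairs.

    anchors, positives: arrays of shape (batch, dim). Row i of each is
    assumed to come from the same document (e.g. two nearby text spans);
    every other row in the batch acts as a negative.
    The temperature default is an illustrative assumption.
    """
    # L2-normalise so the dot product equals cosine similarity.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)

    # (batch, batch) similarity matrix; diagonal entries are the true pairs.
    logits = (a @ p.T) / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Cross-entropy with the diagonal as the target class.
    return float(-np.mean(np.diag(log_probs)))

# Random vectors stand in for span embeddings (hypothetical data).
rng = np.random.default_rng(0)
docs = rng.normal(size=(8, 16))

# Positives that closely match their anchors give a low loss.
aligned = info_nce_loss(docs, docs + 0.01 * rng.normal(size=docs.shape))

# Pairing each anchor with a different document's vector gives a high loss.
shuffled = info_nce_loss(docs, docs[::-1].copy())
```

Minimizing this loss pulls embeddings of spans from the same document together and pushes spans from different documents apart, which is the intuition behind using contrastive pre-training to shape the embedding space before any labeled classification data is used.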

Taxonomy

Topics: Biomedical Text Mining and Ontologies · Electronic Health Records Systems · Semantic Web and Ontologies

Methods: Contrastive Learning · ALIGN · Balanced Selection