When Raw Data Prevails: Are Large Language Model Embeddings Effective in Numerical Data Representation for Medical Machine Learning Applications?

Yanjun Gao; Skatje Myers; Shan Chen; Dmitriy Dligach; Timothy A. Miller; Danielle Bitterman; Matthew Churpek; Majid Afshar

arXiv:2408.11854 · cs.CL · September 23, 2024

TL;DR

This paper investigates whether large language model embeddings can effectively represent numerical medical data for machine learning tasks, comparing their performance to raw data in clinical prediction models.

Contribution

It evaluates zero-shot LLM embeddings as feature extractors for medical diagnostics and prognostics, showing that they are competitive with, though not better than, raw numerical data.

Findings

01. Raw data features still outperform LLM embeddings in medical ML tasks.

02. Zero-shot LLM embeddings show promising results as feature representations.

03. Prompt engineering impacts the utility of LLM embeddings in clinical predictions.

Abstract

The introduction of Large Language Models (LLMs) has advanced data representation and analysis, bringing significant progress in their use for medical question answering. Despite these advancements, integrating tabular data, especially the numerical data pivotal in clinical contexts, into LLM paradigms has not been thoroughly explored. In this study, we examine the effectiveness of vector representations from the last hidden states of LLMs for medical diagnostics and prognostics using electronic health record (EHR) data. We compare the performance of these embeddings with that of raw numerical EHR data when both are used as feature inputs to traditional machine learning (ML) algorithms that excel at tabular data learning, such as eXtreme Gradient Boosting. We focus on instruction-tuned LLMs in a zero-shot setting to represent abnormal physiological data and evaluate their utility as feature…
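As a rough illustration of the pipeline the abstract describes, the sketch below serializes one numeric record into a text prompt and pools token-level hidden states into a single feature vector that a tabular learner could consume. It is a minimal sketch under stated assumptions, not the authors' implementation: the feature names, the prompt template, the helper names `row_to_prompt` and `mean_pool`, and the choice of mean pooling are all hypothetical, and a small hand-written matrix stands in for a real LLM's last hidden states.

```python
from typing import Dict, List, Sequence


def row_to_prompt(row: Dict[str, float], instruction: str) -> str:
    """Serialize one numeric record into a text prompt (hypothetical template)."""
    fields = ", ".join(f"{name}: {value:g}" for name, value in row.items())
    return f"{instruction}\n{fields}"


def mean_pool(hidden_states: Sequence[Sequence[float]], mask: Sequence[int]) -> List[float]:
    """Average last-layer token vectors, skipping padded positions (mask == 0)."""
    kept = [vec for vec, m in zip(hidden_states, mask) if m]
    if not kept:
        raise ValueError("mask excludes every token")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Hypothetical record; feature names and values are illustrative only.
row = {"heart_rate": 118, "lactate": 4.2}
prompt = row_to_prompt(row, "Summarize the patient's physiological state.")

# Stand-in for an LLM's last hidden states: 3 token positions, 2 dimensions,
# with the final position treated as padding.
fake_hidden = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
embedding = mean_pool(fake_hidden, mask=[1, 1, 0])
```

In a real experiment, the pooled vector for each record would replace (or be compared against) the raw numeric columns as input to a classifier such as gradient-boosted trees; how the hidden states are actually pooled is a design choice that the truncated abstract does not specify.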

Taxonomy

Topics: Machine Learning in Healthcare · Topic Modeling

MethodsFocus