Large Language Models for Integrating Social Determinant of Health Data: A Case Study on Heart Failure 30-Day Readmission Prediction
Chase Fensore, Rodrigo M. Carrillo-Larco, Shivani A. Patel, Alanna A. Morris, Joyce C. Ho

TL;DR
This study evaluates the use of large language models to automatically annotate social determinants of health data and assesses their utility in predicting 30-day readmission in heart failure patients, demonstrating promising results with open-source LLMs.
Contribution
It introduces a novel end-to-end approach for using LLMs to integrate SDOH data for clinical prediction, including benchmarking LLM performance and analyzing annotation strategies.
Findings
Open-source LLMs can accurately annotate SDOH variables with zero-shot prompting.
LLM-annotated Neighborhood and Built Environment features improve readmission prediction.
Combining SDOH features with clinical data enhances model performance.
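As a rough illustration of the zero-shot annotation step named in the findings, the sketch below builds a prompt for one variable and parses a model's free-text answer back into a category. It is not the authors' prompt or code. The five category names are the Healthy People 2030 SDOH domains, assumed here to match the paper's "five semantic SDOH categories"; the function names and the example variable are hypothetical, and the LLM call itself is left out.

```python
# Assumption: the five categories follow the Healthy People 2030 SDOH domains.
SDOH_CATEGORIES = [
    "Economic Stability",
    "Education Access and Quality",
    "Health Care Access and Quality",
    "Neighborhood and Built Environment",
    "Social and Community Context",
]


def build_zero_shot_prompt(variable_name: str, description: str) -> str:
    """Build a zero-shot prompt (no labeled examples) for one variable."""
    options = "\n".join(f"- {c}" for c in SDOH_CATEGORIES)
    return (
        "Assign the following public dataset variable to exactly one "
        "social determinants of health (SDOH) category.\n"
        f"Categories:\n{options}\n"
        f"Variable name: {variable_name}\n"
        f"Description: {description}\n"
        "Answer with the category name only."
    )


def parse_category(response: str):
    """Return the single category named in the response, else None."""
    text = response.strip().lower()
    matches = [c for c in SDOH_CATEGORIES if c.lower() in text]
    # Reject answers that name zero categories or more than one.
    return matches[0] if len(matches) == 1 else None


# Hypothetical variable, for illustration only.
prompt = build_zero_shot_prompt(
    "pct_households_no_vehicle",
    "Percent of households with no vehicle available",
)
```

The prompt string would be sent to whichever LLM is being benchmarked, and `parse_category` applied to its reply. Returning `None` for ambiguous replies keeps unparseable answers visible for manual review instead of silently guessing a label.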
Abstract
Social determinants of health (SDOH), the myriad circumstances in which people live, grow, and age, play an important role in health outcomes. However, existing outcome prediction models often use only proxies of SDOH as features. Recent open data initiatives present an opportunity to construct a more comprehensive view of SDOH, but manually integrating the most relevant data for individual patients becomes increasingly challenging as the volume and diversity of public SDOH data grow. Large language models (LLMs) have shown promise at automatically annotating structured data. Here, we conduct an end-to-end case study evaluating the feasibility of using LLMs to integrate SDOH data, and the utility of these SDOH features for clinical prediction. We first manually label 700+ variables from two publicly accessible SDOH data sources with one of five semantic SDOH categories. Then, we…
Taxonomy
Topics: Machine Learning in Healthcare
