Large Language Models for Healthcare Data Augmentation: An Example on Patient-Trial Matching
Jiayi Yuan, Ruixiang Tang, Xiaoqian Jiang, Xia Hu

TL;DR
This paper investigates how large language models can improve patient-trial matching by making Electronic Health Records and clinical trial descriptions more compatible while protecting patient privacy, reporting a 7.32% average performance improvement and 12.12% better generalizability in its experiments.
Contribution
It introduces LLM-PTM, a privacy-aware data augmentation method that uses LLMs for patient-trial matching, addressing interoperability and confidentiality challenges in healthcare data.
Findings
7.32% average performance improvement with LLM-PTM
12.12% better generalizability
Case studies illustrating the approach's effectiveness
Abstract
The process of matching patients with suitable clinical trials is essential for advancing medical research and providing optimal care. However, current approaches face challenges such as data standardization, ethical considerations, and a lack of interoperability between Electronic Health Records (EHRs) and clinical trial criteria. In this paper, we explore the potential of large language models (LLMs) to address these challenges by leveraging their advanced natural language generation capabilities to improve compatibility between EHRs and clinical trial descriptions. We propose an innovative privacy-aware data augmentation approach for LLM-based patient-trial matching (LLM-PTM), which balances the benefits of LLMs while ensuring the security and confidentiality of sensitive patient data. Our experiments demonstrate a 7.32% average improvement in performance using the proposed LLM-PTM…
Taxonomy
Topics: Machine Learning in Healthcare · Topic Modeling · Artificial Intelligence in Healthcare
