CLaDMoP: Learning Transferrable Models from Successful Clinical Trials via LLMs

Yiqing Zhang; Xiaozhong Liu; Fabricio Murai

arXiv:2505.18527 · cs.LG · May 27, 2025

TL;DR

CLaDMoP is a pre-training approach for clinical trial outcome prediction that combines a large language model with a multi-level fusion technique, improving PR-AUC by up to 10.5% over baselines.

Contribution

The paper presents CLaDMoP, a new pre-training method that leverages LLMs and a pair matching proxy task, enhancing generalization and performance in clinical trial outcome prediction.

Findings

1. Significantly improves PR-AUC and ROC-AUC over baselines.

2. Achieves up to 10.5% improvement in PR-AUC.

3. Performs well after parameter-efficient fine-tuning.
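
The two metrics named in the findings can be computed as in the minimal sketch below. It is illustrative only: the function names `roc_auc` and `average_precision` are hypothetical, the labels and scores are made-up toy values, and average precision is used as one common estimate of PR-AUC, which may differ from the estimator used in the paper.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of precision values at the rank of each positive.

    Assumption: scores have no ties, so the ranking is unambiguous.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / n_pos


# Toy data (hypothetical): 1 = trial succeeded, 0 = trial failed.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
print(roc_auc(toy_labels, toy_scores))            # 0.75
print(average_precision(toy_labels, toy_scores))  # (1/1 + 2/3) / 2
```

PR-AUC is generally the more informative of the two when the classes are imbalanced, because it ignores true negatives.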

Abstract

Many existing models for clinical trial outcome prediction are optimized using task-specific loss functions on trial phase-specific data. While this scheme may boost prediction for common diseases and drugs, it can hinder learning of generalizable representations, leading to more false positives/negatives. To address this limitation, we introduce CLaDMoP, a new pre-training approach for clinical trial outcome prediction, alongside the Successful Clinical Trials (SCT) dataset, specifically designed for this task. CLaDMoP leverages a Large Language Model, used to encode trials' eligibility criteria, linked to a lightweight Drug-Molecule branch through a novel multi-level fusion technique. To efficiently fuse long embeddings across levels, we incorporate a grouping block, drastically reducing computational overhead. CLaDMoP avoids reliance on task-specific objectives by pre-training on a "pair…
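
The abstract does not say how the grouping block works. As a rough illustration of the general idea of shortening a long embedding sequence before fusion, the sketch below mean-pools contiguous token vectors into a fixed number of groups. The pooling rule, the function name `group_embeddings`, and the toy vectors are all assumptions made for illustration and should not be read as the paper's design.

```python
def group_embeddings(tokens, num_groups):
    """Mean-pool a sequence of token vectors into `num_groups` contiguous groups.

    Assumption: contiguous mean pooling stands in for the paper's grouping
    block, whose actual mechanism is not described in the abstract.
    """
    n = len(tokens)
    if not 1 <= num_groups <= n:
        raise ValueError("num_groups must be between 1 and len(tokens)")
    dim = len(tokens[0])
    grouped = []
    for g in range(num_groups):
        # Integer boundaries that split n tokens as evenly as possible.
        start = g * n // num_groups
        end = (g + 1) * n // num_groups
        chunk = tokens[start:end]
        grouped.append([sum(vec[d] for vec in chunk) / len(chunk) for d in range(dim)])
    return grouped


# Toy sequence (hypothetical): 4 token embeddings of dimension 2.
toy_tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
print(group_embeddings(toy_tokens, 2))  # [[2.0, 3.0], [6.0, 7.0]]
```

If the fusion step compares every text position with every molecule position, its cost grows with the text sequence length, so reducing n tokens to a much smaller number of groups cuts that cost proportionally.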

Taxonomy

Topics: Scientific Computing and Data Management · Machine Learning in Healthcare