BLIAM: Literature-based Data Synthesis for Synergistic Drug Combination Prediction
Cai Yang, Addie Woicik, Hoifung Poon, Sheng Wang

TL;DR
BLIAM is a novel literature-based data synthesis method that generates interpretable, model-agnostic training data for drug combination prediction, improving performance and enabling in silico experimentation.
Contribution
BLIAM introduces an iterative prompt-based data synthesis approach that directly creates interpretable training data from scientific literature, addressing interpretability and data leakage issues.
Findings
BLIAM outperforms both a non-augmented baseline and manual prompting.
It can synthesize data for unmeasured drugs and cell lines.
The approach enhances prediction accuracy and interpretability.
Abstract
Language models pre-trained on scientific literature corpora have substantially advanced scientific discovery by offering high-quality feature representations for downstream applications. However, these features are often not interpretable, and thus can reveal limited insights to domain experts. Instead of obtaining features from language models, we propose BLIAM, a literature-based data synthesis approach to directly generate training data points that are interpretable and model-agnostic to downstream applications. The key idea of BLIAM is to create prompts using existing training data and then use these prompts to synthesize new data points. BLIAM performs these two steps iteratively as new data points will define more informative prompts and new prompts will in turn synthesize more accurate data points. Notably, literature-based data augmentation might introduce data leakage since…
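The two alternating steps described in the abstract (build prompts from the current data, then use those prompts to synthesize new data points) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the prompt template, the `build_prompt` and `synthesize` helpers, the choice of the most recent examples as context, and the `toy_complete` stand-in for a literature-pretrained language model are all assumptions made for this example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical data point: (drug_a, drug_b, cell_line, label).
Example = Tuple[str, str, str, str]
Query = Tuple[str, str, str]


def build_prompt(examples: List[Example], query: Query) -> str:
    """Turn known data points into a few-shot prompt ending in an open query."""
    lines = [f"{a} and {b} in {c} are {label}." for a, b, c, label in examples]
    a, b, c = query
    lines.append(f"{a} and {b} in {c} are")
    return "\n".join(lines)


def toy_complete(prompt: str) -> str:
    """Stand-in for a language model: returns the majority label in the prompt."""
    positive = prompt.count(" are synergistic.")
    negative = prompt.count(" are not synergistic.")
    return "synergistic" if positive >= negative else "not synergistic"


def synthesize(
    train: List[Example],
    queries: List[Query],
    complete: Callable[[str], str],
    rounds: int = 2,
    shots: int = 3,
) -> Dict[Query, str]:
    """Alternate between building prompts and synthesizing labels."""
    pool = list(train)
    synthetic: Dict[Query, str] = {}
    for _ in range(rounds):
        for query in queries:
            # Step 1: build a prompt from the current pool of data points.
            prompt = build_prompt(pool[-shots:], query)
            # Step 2: use the prompt to synthesize a label for the query.
            synthetic[query] = complete(prompt)
        # Synthesized points join the pool and shape the next round's prompts.
        pool = list(train) + [(a, b, c, y) for (a, b, c), y in synthetic.items()]
    return synthetic


train = [
    ("drugA", "drugB", "cell1", "synergistic"),
    ("drugA", "drugC", "cell1", "synergistic"),
    ("drugB", "drugC", "cell2", "not synergistic"),
]
queries = [("drugA", "drugD", "cell1")]
result = synthesize(train, queries, toy_complete)
print(result)
```

The synthesized points are plain tuples in the same form as the training data, which is what makes them readable by a domain expert and usable by any downstream model. In practice `toy_complete` would be replaced by a call to a real language model.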
Taxonomy
Topics: Computational Drug Discovery Methods · Biomedical Text Mining and Ontologies · Machine Learning in Materials Science
Methods: Test
