TwinBooster: Synergising Large Language Models with Barlow Twins and Gradient Boosting for Enhanced Molecular Property Prediction
Maximilian G. Schuh, Davide Boldini, Stephan A. Sieber

TL;DR
TwinBooster combines large language models, Barlow Twins, and gradient boosting to improve molecular property prediction, especially in data-scarce scenarios, enabling zero-shot learning and accelerating drug discovery.
Contribution
The paper introduces a novel architecture integrating LLMs with Barlow Twins and gradient boosting for enhanced molecular property prediction, including zero-shot capabilities.
Findings
State-of-the-art performance on the FS-Mol benchmark
Effective zero-shot prediction for unseen bioassays and molecules
Demonstrates deep learning's potential in data-scarce property prediction
Abstract
The success of drug discovery and development relies on the precise prediction of molecular activities and properties. While in silico molecular property prediction has shown remarkable potential, its use has been limited so far to assays for which large amounts of data are available. In this study, we use a fine-tuned large language model to integrate biological assays based on their textual information, coupled with Barlow Twins, a Siamese neural network using a novel self-supervised learning approach. This architecture uses both assay information and molecular fingerprints to extract the true molecular information. TwinBooster enables the prediction of properties of unseen bioassays and molecules by providing state-of-the-art zero-shot learning tasks. Remarkably, our artificial intelligence pipeline shows excellent performance on the FS-Mol benchmark. This breakthrough demonstrates…
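As a rough illustration of the Barlow Twins objective named in the abstract, the sketch below computes the standard redundancy-reduction loss for two batches of embeddings. It standardises each embedding dimension over the batch, forms the cross-correlation matrix between the two branches, and penalises diagonal entries that differ from 1 and off-diagonal entries that differ from 0. The function name `barlow_twins_loss`, the parameter `lambda_offdiag`, and its default of 5e-3 are assumptions for illustration and are not taken from the TwinBooster paper. How TwinBooster builds the two branch inputs from assay text and molecular fingerprints is not specified here, so the inputs are treated as generic embedding matrices.

```python
import numpy as np


def barlow_twins_loss(z_a, z_b, lambda_offdiag=5e-3, eps=1e-12):
    """Barlow Twins loss for two embedding batches of shape (n, d).

    Hypothetical helper for illustration; not the authors' implementation.
    """
    n = z_a.shape[0]
    # Standardise every embedding dimension across the batch.
    z_a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + eps)
    z_b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + eps)
    # Cross-correlation matrix between the two branches, shape (d, d).
    c = z_a.T @ z_b / n
    diag = np.diag(c)
    # Invariance term: matching dimensions should correlate perfectly.
    on_diag = np.sum((1.0 - diag) ** 2)
    # Redundancy term: different dimensions should be decorrelated.
    off_diag = np.sum(c ** 2) - np.sum(diag ** 2)
    return on_diag + lambda_offdiag * off_diag
```

Under these assumptions, passing the same batch to both arguments gives a loss close to zero, while two unrelated batches give a loss close to the embedding dimension `d`, because every diagonal correlation is then near 0 instead of 1.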
Taxonomy
Topics: Computational Drug Discovery Methods · Machine Learning in Materials Science · Chemical Synthesis and Analysis
Methods: Barlow Twins
