Accurate Prediction of Ligand-Protein Interaction Affinities with Fine-Tuned Small Language Models

Ben Fauber

arXiv:2407.00111 · cs.LG · July 2, 2024

TL;DR

This paper uses instruction fine-tuned small language models to predict ligand-protein interaction affinities from only the ligand's SMILES string and the protein's amino acid sequence. It reports a clear improvement over traditional ML and FEP+ methods on out-of-sample data in a zero-shot setting, with the aim of accelerating drug discovery.

Contribution

It introduces a new method leveraging fine-tuned generative small language models for ligand-protein affinity prediction using only SMILES and amino acid sequences.

Findings

01. Outperforms ML and FEP+ methods in accuracy

02. Effective in zero-shot prediction scenarios

03. Applicable to challenging therapeutic targets

Abstract

We describe the accurate prediction of ligand-protein interaction (LPI) affinities, also known as drug-target interactions (DTI), with instruction fine-tuned pretrained generative small language models (SLMs). We achieved accurate predictions for a range of affinity values associated with ligand-protein interactions on out-of-sample data in a zero-shot setting. Only the SMILES string of the ligand and the amino acid sequence of the protein were used as the model inputs. Our results demonstrate a clear improvement over machine learning (ML) and free-energy perturbation (FEP+) based methods in accurately predicting a range of ligand-protein interaction affinities, which can be leveraged to further accelerate drug discovery campaigns against challenging therapeutic targets.
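
To make the input format concrete, here is a minimal sketch of how a ligand's SMILES string and a protein's amino acid sequence might be packed into a single instruction-tuning record. The abstract states only that these two strings are the model inputs; the function name `build_lpi_record`, the instruction wording, the field names, and the two-decimal numeric label are all assumptions made for illustration, not the paper's actual prompt or data format.

```python
from typing import Optional


def build_lpi_record(smiles: str, protein_sequence: str,
                     affinity: Optional[float] = None) -> dict:
    """Build one instruction-tuning record (hypothetical layout, not the paper's)."""
    smiles = smiles.strip()
    # Normalize the sequence: remove whitespace and line breaks, use upper-case letters.
    protein_sequence = "".join(protein_sequence.split()).upper()
    if not smiles or not protein_sequence:
        raise ValueError("both a SMILES string and a protein sequence are required")

    record = {
        # Assumed instruction text; the source does not give the real wording.
        "instruction": "Predict the binding affinity of the ligand for the protein.",
        "input": f"SMILES: {smiles}\nProtein: {protein_sequence}",
    }
    # The target is included only for training examples; at inference time it is omitted.
    if affinity is not None:
        record["output"] = f"{affinity:.2f}"  # assumed label format, units unspecified
    return record


# Usage with made-up values: aspirin's SMILES and a short toy peptide sequence.
example = build_lpi_record("CC(=O)Oc1ccccc1C(=O)O", "mkt ayia\nkqr", affinity=6.5)
```

Keeping both inputs as plain strings reflects the abstract's point that no structural or 3D information is supplied; everything else in the sketch is a placeholder that a real pipeline would replace with its own template.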

Taxonomy

Topics: Computational Drug Discovery Methods · Bioinformatics and Genomic Networks · Biomedical Text Mining and Ontologies