Modeling PROTAC Degradation Activity with Machine Learning
Stefano Ribes, Eva Nittinger, Christian Tyrchan, Rocío Mercado

TL;DR
This paper introduces an open-source deep learning approach for predicting PROTAC degradation activity. It combines curated open-source data with pretrained embeddings and reaches a top test accuracy of 80.8% and a ROC AUC of 0.865, with the aim of supporting PROTAC drug discovery.
Contribution
It presents a novel curated dataset and a deep learning model leveraging pretrained embeddings for PROTAC activity prediction, improving reproducibility and reducing computational complexity.
Findings
Top test accuracy of 80.8% and ROC AUC of 0.865
Generalization to new targets with 62.3% accuracy and ROC AUC of 0.604
Model performance is comparable to state-of-the-art methods
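For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how accuracy and ROC AUC can be computed for a binary active/inactive classifier. The labels and scores are made-up toy values, the 0.5 decision threshold is an assumption, and the function names are illustrative; this is not the authors' evaluation code.

```python
def accuracy(y_true, y_score, threshold=0.5):
    """Fraction of samples whose thresholded score matches the true label."""
    preds = [1 if s >= threshold else 0 for s in y_score]
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)


def roc_auc(y_true, y_score):
    """ROC AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for s, t in zip(y_score, y_true) if t == 1]
    neg = [s for s, t in zip(y_score, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = active degrader, 0 = inactive.
y_true = [1, 0, 1, 0]
y_score = [0.9, 0.4, 0.35, 0.2]

acc = accuracy(y_true, y_score)
auc = roc_auc(y_true, y_score)
print(f"accuracy={acc:.3f}, roc_auc={auc:.3f}")
```

Accuracy depends on the chosen threshold, while ROC AUC measures ranking quality across all thresholds, which is why the two numbers can move independently between the standard test split and the new-target split.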
Abstract
PROTACs are a promising therapeutic modality that harnesses the cell's built-in degradation machinery to degrade specific proteins. Despite their potential, developing new PROTACs is challenging and requires significant domain expertise, time, and cost. Meanwhile, machine learning has transformed drug design and development. In this work, we present a strategy for curating open-source PROTAC data and an open-source deep learning tool for predicting the degradation activity of novel PROTAC molecules. The curated dataset incorporates important information such as pDC50, Dmax, E3 ligase type, POI amino acid sequence, and experimental cell type. Our model architecture leverages learned embeddings from pretrained machine learning models, in particular for encoding protein sequences and cell type information. We assessed the quality of the curated data and the generalization ability…
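The abstract describes a model that consumes precomputed embeddings for several inputs. A minimal sketch of one way such a model could be wired is shown below: each embedding is projected to a shared width, the projections are summed, and a small head outputs an activity probability. Everything here is an assumption for illustration, including the input names, the embedding sizes, the sum-based fusion, the PROTAC vector standing in for a molecular representation, and the random untrained weights. It is not the authors' architecture.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def init_linear(rng, n_in, n_out):
    """Small random weights; a trained model would learn these instead."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def predict_activity(inputs, branches, head):
    """Project each embedding to a shared width, sum, apply ReLU, return a probability."""
    hidden = None
    for name, x in inputs.items():
        weights, bias = branches[name]
        proj = linear(x, weights, bias)
        hidden = proj if hidden is None else [h + p for h, p in zip(hidden, proj)]
    hidden = [max(0.0, h) for h in hidden]
    logit = linear(hidden, *head)[0]
    return 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(0)

# Hypothetical embedding sizes; real pretrained embeddings are much larger.
dims = {"protac": 16, "e3_ligase": 8, "poi_sequence": 8, "cell_type": 8}
hidden_dim = 4

branches = {name: init_linear(rng, d, hidden_dim) for name, d in dims.items()}
head = init_linear(rng, hidden_dim, 1)

# Random stand-ins for embeddings produced by pretrained encoders.
inputs = {name: [rng.random() for _ in range(d)] for name, d in dims.items()}

prob = predict_activity(inputs, branches, head)
print(f"predicted probability of activity: {prob:.3f}")
```

Keeping the pretrained encoders frozen and training only the small projection layers and head is one way such a design could keep compute low, since the expensive embeddings are computed once per protein or cell line and reused.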
Taxonomy
Topics: Protein Degradation and Inhibitors · Ubiquitin and proteasome pathways
