TACK: A statistical evaluation of degradation activity on a novel TArgeting Chimeras Knowledge dataset
Stefano Ribes, Nils Dunlop, and Rocío Mercado

TL;DR
This paper introduces TACK, a comprehensive dataset and statistical evaluation framework for predicting PROTAC degradation activity, emphasizing feature engineering and classical machine learning methods over complex architectures.
Contribution
The study presents a novel dataset, rigorous benchmarking of ML methods, and insights into feature importance and uncertainty quantification for PROTAC activity prediction.
Findings
Cellular context features rival complex protein embeddings in prediction.
Potency ($pDC_{50}$) is more predictable than Dmax.
Classical ML methods outperform a domain-specific GNN model.
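For readers unfamiliar with the potency scale used above, $pDC_{50}$ is conventionally the negative base-10 logarithm of the half-maximal degradation concentration expressed in molar. The sketch below shows that conversion. The function name and the choice of nanomolar input are assumptions made for illustration; this excerpt does not state the units used in the dataset.

```python
import math


def pdc50_from_nanomolar(dc50_nm: float) -> float:
    """Convert a DC50 given in nanomolar (assumed unit) to pDC50.

    pDC50 = -log10(DC50 [M]) = 9 - log10(DC50 [nM])
    """
    if dc50_nm <= 0:
        raise ValueError("DC50 must be a positive concentration")
    return 9.0 - math.log10(dc50_nm)


# A larger pDC50 means a more potent degrader.
print(pdc50_from_nanomolar(100.0))  # 100 nM -> 7.0
print(pdc50_from_nanomolar(1.0))    # 1 nM   -> 9.0
```

Because the scale is logarithmic, a tenfold drop in DC50 raises $pDC_{50}$ by exactly one unit.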
Abstract
Proteolysis-targeting chimeras (PROTACs) represent a promising therapeutic modality that induces targeted protein degradation by hijacking the ubiquitin-proteasome system. However, rational PROTAC design remains challenging due to the complex interplay between molecular structure, target proteins, E3 ligases, and the cellular context. We present TACK, a statistical evaluation of degradation activity on a novel TArgeting Chimeras Knowledge dataset of 3,514 PROTACs and 6,561 degradation endpoints aggregated from three major repositories with standardized molecular representations, protein annotations, and experimental conditions. Using scaffold-based 5×5 cross-validation, we perform a rigorous statistical comparison of three machine learning methods to predict PROTAC degradation activity across three tasks: $pDC_{50}$ and Dmax regression, and binary activity classification. Feature…
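Scaffold-based cross-validation keeps every molecule that shares a scaffold on the same side of each train/test split, so the test folds contain only unseen chemotypes. The following is a minimal sketch of that idea, not the authors' implementation. It assumes each molecule already has a scaffold key (for example a Bemis-Murcko scaffold string computed elsewhere). The function name, the greedy fold balancing, and the defaults of 5 folds and 5 repeats are illustrative assumptions.

```python
import random
from collections import defaultdict


def scaffold_group_folds(scaffolds, n_folds=5, n_repeats=5, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) tuples.

    `scaffolds` holds one precomputed scaffold key per molecule (assumed
    input). No scaffold appears in both the train and test part of a split.
    """
    # Collect the molecule indices that belong to each scaffold.
    groups = defaultdict(list)
    for idx, scaffold in enumerate(scaffolds):
        groups[scaffold].append(idx)

    rng = random.Random(seed)
    for repeat in range(n_repeats):
        keys = sorted(groups)
        rng.shuffle(keys)  # a different tie-breaking order per repeat
        # Stable sort: largest scaffolds first, shuffled order kept among ties.
        keys.sort(key=lambda k: len(groups[k]), reverse=True)

        # Greedy balancing: each scaffold goes to the currently smallest fold.
        folds = [[] for _ in range(n_folds)]
        for key in keys:
            smallest = min(range(n_folds), key=lambda f: len(folds[f]))
            folds[smallest].extend(groups[key])

        for fold, test_idx in enumerate(folds):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(scaffolds)) if i not in held_out]
            yield repeat, fold, train_idx, sorted(test_idx)


# Toy usage with hypothetical scaffold keys.
toy_scaffolds = ["A", "A", "B", "B", "B", "C", "D", "E", "F", "G"]
for repeat, fold, train_idx, test_idx in scaffold_group_folds(toy_scaffolds):
    if repeat == 0:
        print(fold, test_idx)
```

Repeating the split with different shuffles yields several performance estimates per model, which is what makes a paired statistical comparison between methods possible.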
