Predicting Activity Cliffs for Autonomous Medicinal Chemistry
Michael Cuccarese

TL;DR
This study develops a model that predicts activity cliffs in medicinal chemistry: the positions where small structural changes cause large potency shifts. Knowing these positions up front reduces experimental effort. The model is built from 25 million matched molecular pairs across 50 ChEMBL targets and is released with open-source tools.
Contribution
Introduces an 11-feature model with 3D pharmacophore context for activity cliff prediction that generalizes across targets and scaffolds, and provides open-source code and a web app.
Findings
The model achieves NDCG@3 = 0.910 for activity cliff prediction, against 0.839 for a random ranking.
Reduces chemist exploration from 3.1 to 2.1 positions, halving initial experiments.
Predicting the outcome of a specific modification from structure alone remains limited (Spearman correlation of 0.268).
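For readers unfamiliar with the ranking metric quoted in the findings, the following is a minimal sketch of NDCG@k applied to ranking positions on a scaffold. It assumes linear gain and a log2 rank discount, which is one common formulation; the paper's exact variant is not stated here. The function names and the example sensitivity values are hypothetical and chosen only for illustration.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k items (linear gain, log2 discount)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(true_relevance, predicted_scores, k=3):
    """NDCG@k: DCG of the predicted ordering divided by DCG of the ideal ordering."""
    # Rank positions by predicted score, highest first.
    order = sorted(range(len(true_relevance)),
                   key=lambda i: predicted_scores[i], reverse=True)
    ranked = [true_relevance[i] for i in order]
    ideal = sorted(true_relevance, reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    if ideal_dcg == 0:
        return 0.0  # no relevant positions at all
    return dcg_at_k(ranked, k) / ideal_dcg


# Hypothetical example: four positions on a scaffold.
true_sensitivity = [0.2, 1.5, 0.7, 0.1]   # illustrative measured sensitivity per position
model_scores = [0.1, 0.9, 0.8, 0.3]       # illustrative model output per position
score = ndcg_at_k(true_sensitivity, model_scores, k=3)
```

A score of 1.0 means the top three predicted positions match the ideal ordering. In this toy example the model swaps the third and fourth positions, so the score falls slightly below 1.0.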
Abstract
Activity cliff prediction - identifying positions where small structural changes cause large potency shifts - has been a persistent challenge in computational medicinal chemistry. This work focuses on a parsimonious definition: which small modifications, at which positions, confer the highest probability of an outcome change. Position-level sensitivity is calculated using 25 million matched molecular pairs from 50 ChEMBL targets across six protein families, revealing that two questions have fundamentally different answers. "Which positions vary most?" is answered by scaffold size alone (NDCG@3 = 0.966), requiring no machine learning. "Which are true activity cliffs?" - where small modifications cause disproportionately large effects, as captured by SALI normalization - requires an 11-feature model with 3D pharmacophore context (NDCG@3 = 0.910 vs. 0.839 random), generalizing across all…
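The SALI normalization mentioned in the abstract can be illustrated with the standard pairwise Structure-Activity Landscape Index, SALI(i, j) = |A_i - A_j| / (1 - sim(i, j)). The sketch below assumes potencies on a log scale (such as pIC50) and Tanimoto similarity over sets of fingerprint bits. How the paper aggregates SALI to position level is not described in this excerpt, so the helper names, the `eps` guard, and the example values are assumptions for illustration only.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity between two collections of 'on' fingerprint bits."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    # Two empty fingerprints are treated as identical (an assumption of this sketch).
    return len(a & b) / union if union else 1.0


def sali(potency_a, potency_b, similarity, eps=1e-6):
    """Pairwise SALI: potency difference divided by structural dissimilarity.

    eps guards against division by zero for identical fingerprints;
    it is a choice made for this sketch, not part of the definition.
    """
    return abs(potency_a - potency_b) / max(1.0 - similarity, eps)


# Hypothetical matched pair: very similar structures, 100-fold potency difference.
sim = tanimoto({1, 2, 3, 4, 5, 6, 7, 8, 9}, {1, 2, 3, 4, 5, 6, 7, 8, 9, 10})
cliff_score = sali(8.0, 6.0, sim)   # pIC50 values, illustrative only
```

Because the denominator shrinks as two molecules become more similar, a large potency change from a small modification yields a high SALI value, which is the behavior that distinguishes a true activity cliff from a position that simply varies a lot.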
