# Unified protein–small molecule graph neural networks for binding site prediction

**Authors:** Jian Wang, Nikolay V. Dokholyan

PMC · DOI: 10.1073/pnas.2524913123 · 2026-03-03

## TL;DR

YuelPocket is a graph neural network that predicts where small molecules bind on proteins, with the aim of improving virtual screening and structure-based drug discovery.

## Contribution

YuelPocket introduces a unified graph neural network that models both local and global protein–ligand interactions for accurate binding site prediction.

## Key findings

- YuelPocket achieves higher success rates than state-of-the-art methods on both the Distance to Closest Atom and Center-to-Center metrics.
- The model is robust on AlphaFold-predicted structures, maintaining high accuracy on targets that deviate from their experimental structures.
- YuelPocket operates in two complementary modes: residue-level prediction identifies contact residues, and coordinate-level prediction pinpoints pocket centers.
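
The two metrics named above can be illustrated with a minimal sketch. It assumes the definitions conventionally used in the pocket-prediction literature: Distance to Closest Atom (often abbreviated DCA) is the distance from the predicted pocket center to the nearest ligand atom, and Center-to-Center (often abbreviated DCC) is the distance from the predicted center to the ligand centroid. The 4 Å success threshold, the function names, and the toy coordinates are assumptions for illustration, not details taken from the paper.

```python
import math


def centroid(points):
    """Arithmetic mean of a list of (x, y, z) coordinates."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def dca(pred_center, ligand_atoms):
    """Distance from the predicted pocket center to the closest ligand atom."""
    return min(math.dist(pred_center, atom) for atom in ligand_atoms)


def dcc(pred_center, ligand_atoms):
    """Distance from the predicted pocket center to the ligand centroid."""
    return math.dist(pred_center, centroid(ligand_atoms))


def success_rate(distances, threshold=4.0):
    """Fraction of targets whose distance is within the threshold.

    The 4.0 Å default is an assumed, commonly used cutoff.
    """
    return sum(d <= threshold for d in distances) / len(distances)


# Toy ligand: three atoms along the x axis (made-up coordinates, in Å).
ligand = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
predicted = (7.0, 0.0, 0.0)

print(dca(predicted, ligand))  # 1.0 (closest atom is at x = 6)
print(dcc(predicted, ligand))  # 4.0 (ligand centroid is at x = 3)
```

The toy case shows why the two numbers can differ: a prediction sitting at one end of an elongated ligand scores well on closest-atom distance while being much farther from the ligand center.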

## Abstract

Accurately identifying small molecule binding sites on proteins is fundamental to understanding protein function and enabling structure-based drug discovery, yet this critical step remains a major bottleneck in biomedical research and therapeutic development. Failures in virtual screening and lead optimization are often attributable to incorrect binding site identification rather than limitations in docking algorithms or scoring functions. We present YuelPocket, a unified graph neural network that overcomes this fundamental challenge by integrating both local and global protein–small molecule interactions within a single, scalable framework. YuelPocket achieves high predictive accuracy and offers a robust solution for precise binding site detection, providing a transformative tool for improving virtual screening and rational drug design.

Predicting small molecule binding sites on proteins remains a key challenge in structure-based drug discovery. While AlphaFold3 has transformed protein structure prediction, accurate identification of functional sites such as ligand binding pockets remains a distinct and unresolved problem. Graph neural networks have emerged as promising tools for this task, but most current approaches focus on local structural features and are trained on relatively small datasets, limiting their ability to model long-range protein–ligand interactions. Here, we develop YuelPocket, a graph neural network that addresses these limitations. YuelPocket operates in two complementary modes: residue-level prediction for identifying contact residues and coordinate-level prediction for pinpointing pocket centers. Trained on the large-scale PLINDER dataset, YuelPocket achieves higher success rates than state-of-the-art methods in both Distance to Closest Atom and Center-to-Center metrics. Crucially, YuelPocket is robust on AlphaFold-predicted structures, maintaining high accuracy for targets that deviate from experimental structures. We hope that YuelPocket will serve as a robust framework for accurate binding site identification, enabling reliable functional annotation and structure-guided drug discovery.
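
To make the notion of a "contact residue" in the residue-level mode concrete, here is a hedged sketch of one common labeling rule: a residue counts as in contact if any of its atoms lies within a distance cutoff of any ligand atom. The 4 Å cutoff, the `contact_residues` helper, the input layout, and the coordinates are all hypothetical. The paper's own labeling criterion is not stated in this summary.

```python
import math


def contact_residues(protein_atoms, ligand_atoms, cutoff=4.0):
    """Return sorted residue ids with at least one atom within `cutoff` Å
    of any ligand atom.

    protein_atoms: list of (residue_id, (x, y, z)) pairs.
    ligand_atoms:  list of (x, y, z) coordinates.
    The cutoff value is an assumption chosen for illustration.
    """
    contacts = set()
    for res_id, xyz in protein_atoms:
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_atoms):
            contacts.add(res_id)
    return sorted(contacts)


# Made-up coordinates (Å): residues 10 and 11 sit near the ligand, 12 does not.
protein = [
    (10, (0.0, 0.0, 0.0)),
    (10, (1.5, 0.0, 0.0)),
    (11, (5.0, 0.0, 0.0)),
    (12, (20.0, 0.0, 0.0)),
]
ligand_atoms = [(3.0, 0.0, 0.0)]

print(contact_residues(protein, ligand_atoms))  # [10, 11]
```

A residue-level predictor would output a score per residue against labels of this kind, whereas the coordinate-level mode targets a point in space instead.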

## Full-text entities

- **Genes:** H4C4 (H4 clustered histone 4) [NCBI Gene 8360] {aka H4/b, H4FB, HIST1H4D, dJ221C16.9}, TOP1 (DNA topoisomerase I) [NCBI Gene 7150] {aka TOPI}, DCC (DCC netrin 1 receptor) [NCBI Gene 1630] {aka CRC18, CRCR1, HGPPS2, IGDCC1, MRMV1, NTN1R1}
- **Diseases:** PLI (MESH:C563663)
- **Chemicals:** PNAS (MESH:D020135), C (MESH:D002244), N (MESH:D009584), 1DY4 (-)
- **Cell lines:** S2 — Drosophila melanogaster (Fruit fly), Spontaneously immortalized cell line (CVCL_Z232)

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12974528/full.md

---
Source: https://tomesphere.com/paper/PMC12974528