ProFSA: Self-supervised Pocket Pretraining via Protein Fragment-Surroundings Alignment
Bowen Gao, Yinjun Jia, Yuanle Mo, Yuyan Ni, Weiying Ma, Zhiming Ma, Yanyan Lan

TL;DR
ProFSA introduces a self-supervised pocket pretraining method that simulates ligand-receptor interactions using protein fragments and pretrained small molecule representations, improving performance on downstream biomedical prediction tasks such as druggability prediction.
Contribution
The paper presents a novel contrastive pretraining approach for pocket representations that leverages high-resolution protein structures and simulated ligand interactions, addressing data scarcity issues.
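The contrastive alignment idea can be illustrated with a small sketch. This is not the authors' implementation: the summary here does not state the exact loss, so the code assumes a generic symmetric InfoNCE objective in which row i of the pocket embeddings and row i of the fragment embeddings form a positive pair and every other pairing in the batch is a negative. The function names, the temperature of 0.07, and the random vectors standing in for encoder outputs are all hypothetical.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit length (a zero vector is left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def log_softmax_at(row, target):
    """Log-probability of index `target` under a softmax over `row`."""
    m = max(row)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return row[target] - log_z


def symmetric_info_nce(pocket_emb, fragment_emb, temperature=0.07):
    """Assumed symmetric InfoNCE loss over a batch of paired embeddings.

    pocket_emb[i] and fragment_emb[i] are treated as a positive pair;
    all other combinations in the batch act as negatives.
    """
    p = [normalize(v) for v in pocket_emb]
    f = [normalize(v) for v in fragment_emb]
    n = len(p)
    # Cosine similarity between every pocket and every fragment.
    logits = [
        [sum(a * b for a, b in zip(p[i], f[j])) / temperature for j in range(n)]
        for i in range(n)
    ]
    # Pocket-to-fragment direction: softmax over each row.
    p2f = -sum(log_softmax_at(logits[i], i) for i in range(n)) / n
    # Fragment-to-pocket direction: softmax over each column.
    cols = [[logits[i][j] for i in range(n)] for j in range(n)]
    f2p = -sum(log_softmax_at(cols[j], j) for j in range(n)) / n
    return 0.5 * (p2f + f2p)


# Toy data: random vectors stand in for encoder outputs (hypothetical).
random.seed(0)
fragment_emb = [[random.gauss(0, 1) for _ in range(16)] for _ in range(8)]
aligned_pockets = [[x + 0.01 * random.gauss(0, 1) for x in v] for v in fragment_emb]
random_pockets = [[random.gauss(0, 1) for _ in range(16)] for _ in range(8)]

loss_aligned = symmetric_info_nce(aligned_pockets, fragment_emb)
loss_random = symmetric_info_nce(random_pockets, fragment_emb)
print(f"aligned: {loss_aligned:.4f}  random: {loss_random:.4f}")
```

In this toy setup, pockets whose embeddings sit close to their paired fragment embeddings should yield a lower loss than unrelated pockets, which is the signal a pocket encoder would be trained on. In an actual pipeline the fragment embeddings would come from a pretrained molecular encoder and the pocket embeddings from the model being trained.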
Findings
Achieves state-of-the-art results in druggability prediction.
Outperforms existing pretraining methods by a large margin.
Demonstrates effective use of protein structure databases for interaction modeling.
Abstract
Pocket representations play a vital role in various biomedical applications, such as druggability estimation, ligand affinity prediction, and de novo drug design. While existing geometric features and pretrained representations have demonstrated promising results, they usually treat pockets independently of ligands, neglecting the fundamental interactions between them. However, the limited pocket-ligand complex structures available in the PDB database (fewer than 100 thousand non-redundant pairs) hamper large-scale pretraining endeavors for interaction modeling. To address this constraint, we propose a novel pocket pretraining approach that leverages knowledge from high-resolution atomic protein structures, assisted by highly effective pretrained small molecule representations. By segmenting protein structures into drug-like fragments and their corresponding pockets, we obtain a…
Taxonomy
Topics: Computational Drug Discovery Methods · Protein Structure and Dynamics · Chemical Synthesis and Analysis
Methods: ALIGN
