RapidDock: Unlocking Proteome-scale Molecular Docking

Rafał Powalski; Bazyli Klockiewicz; Maciej Jaśkowski; Bartosz Topolski; Paweł Dąbrowski-Tumański; Maciej Wiśniewski; Łukasz Kuciński; Piotr Miłoś; Dariusz Plewczynski

arXiv:2411.00004·q-bio.BM·November 4, 2024

Open Access · 3 Reviews

TL;DR

RapidDock is a transformer-based model that significantly accelerates molecular docking, enabling large-scale drug discovery with high accuracy and minimal inference time.

Contribution

We introduce RapidDock, a novel transformer-based approach that achieves over 100x speedup in molecular docking without sacrificing accuracy.

Findings

01

Achieves success rates (RMSD < 2 Å) of 52.1% on PoseBusters and 44.0% on DockGen.

02

Inference time of 0.04 seconds per prediction on a single GPU.

03

Provides key architectural insights for transformer use in molecular docking.

Abstract

Accelerating molecular docking -- the process of predicting how molecules bind to protein targets -- could boost small-molecule drug discovery and revolutionize medicine. Unfortunately, current molecular docking tools are too slow to screen potential drugs against all relevant proteins, which often results in missed drug candidates or unexpected side effects occurring in clinical trials. To address this gap, we introduce RapidDock, an efficient transformer-based model for blind molecular docking. RapidDock achieves at least a $100\times$ speed advantage over existing methods without compromising accuracy. On the PoseBusters and DockGen benchmarks, our method achieves $52.1\%$ and $44.0\%$ success rates (RMSD $< 2$ Å), respectively. The average inference time is $0.04$ seconds on a single GPU, highlighting RapidDock's potential for large-scale docking studies. We examine the key…
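The success-rate criterion quoted in the abstract (RMSD < 2 Å) can be illustrated with a minimal sketch. The function names `rmsd` and `success_rate` are hypothetical, and the sketch assumes that predicted and reference ligand atoms are listed in the same order. Benchmark implementations typically also correct for molecular symmetry, which this example does not attempt.

```python
import numpy as np

def rmsd(pred, ref):
    """RMSD between two (n_atoms, 3) coordinate arrays given in the same units.

    Assumption: atoms are in matching order; no symmetry correction is applied.
    """
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    return float(np.sqrt(np.mean(np.sum((pred - ref) ** 2, axis=1))))

def success_rate(pairs, threshold=2.0):
    """Fraction of (pred, ref) pairs whose RMSD is strictly below `threshold` (in Å)."""
    values = [rmsd(p, r) for p, r in pairs]
    return sum(v < threshold for v in values) / len(values)

# Toy data: one pose displaced by 1 Å per atom (success), one by 3 Å (failure).
ref = np.zeros((4, 3))
good = ref + np.array([1.0, 0.0, 0.0])
bad = ref + np.array([3.0, 0.0, 0.0])
rate = success_rate([(good, ref), (bad, ref)])
```

With these toy poses, one of the two predictions falls under the 2 Å threshold, so `rate` is 0.5.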

Peer Reviews

Decision·Submitted to ICLR 2025

Reviewer 01 · Rating 6 · Confidence 4

Strengths

* Clear presentation of methods and results.
* Novel application of transformer architecture to docking, which results in much faster inference.
* Reasonable design choices in model. These include the addition of features for ligand atom charges and the use of a pre-trained protein language model. The scaling of the attention vector based on distance also seems well-motivated.
* Strong results on two distinct datasets, achieving a better success rate than two competitive deep learning methods at

Weaknesses

* Motivation behind the problem setting is unclear. The authors address proteome-scale docking because any protein in the human proteome could be a potential *off-target* (a term of art that should probably be included in the paper) of a drug. Thus, docking against all proteins and then predicting affinity in a downstream task would detect these potential off-targets before they are discovered in later preclinical or clinical testing. However, I am not convinced that docking to each protein in t

Reviewer 02 · Rating 3 · Confidence 3

Strengths

1. This work shows that transformer-based approaches are viable for binding structure prediction, whereas most previous works are based on diffusion methods.
2. The proposed method outperformed the popular baseline, DiffDock-L, in the two benchmark studies, and its prediction time is much shorter than that of all baseline models.

Weaknesses

1. The proposed method needs to generate 96 molecular conformations for each molecule and analyze the conformers to obtain its distance matrix.
2. The experiment in Section 4.2 seems meaningless because it uses holo structures when it predicts binding poses.
3. The title and introduction emphasize the importance of proteome-wide docking, but this work does not provide any meaningful results regarding that.
4. Technical details of the proposed method are insufficient.
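For readers unfamiliar with the distance-matrix step raised in the first weakness, the following is an illustrative sketch only, not the authors' implementation. It assumes conformers are supplied as a NumPy array of shape `(n_conf, n_atoms, 3)`, and it marks an atom pair as non-fixed when its distance varies across conformers by more than a hypothetical tolerance `tol`. The fill value of -1 follows the convention mentioned in Reviewer 03's comments; the actual criterion used in the paper is not stated on this page.

```python
import numpy as np

def rigid_distance_matrix(conformers, tol=0.1, fill=-1.0):
    """Pairwise distances that stay fixed across conformers; others set to `fill`.

    Assumptions (not from the paper): `tol` is a hypothetical threshold in Å,
    and a pair counts as fixed when max - min distance over conformers <= tol.
    """
    conformers = np.asarray(conformers, dtype=float)       # (n_conf, n_atoms, 3)
    diff = conformers[:, :, None, :] - conformers[:, None, :, :]
    dists = np.linalg.norm(diff, axis=-1)                   # (n_conf, n_atoms, n_atoms)
    spread = dists.max(axis=0) - dists.min(axis=0)          # variation per atom pair
    return np.where(spread <= tol, dists.mean(axis=0), fill)

# Toy molecule with 3 atoms and 2 conformers: atom 2 rotates about atom 1,
# so the 0-1 and 1-2 distances are fixed while the 0-2 distance is not.
conf_a = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
conf_b = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
D = rigid_distance_matrix([conf_a, conf_b])
```

In this toy case the bonded pairs keep a distance of 1 Å in both conformers, while the 0-2 pair changes from 2 Å to about 1.41 Å and is therefore assigned -1.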

Reviewer 03 · Rating 5 · Confidence 5

Strengths

- The authors demonstrate the need for a faster model by outlining the scalability limitations of previous deep learning models.
- Aligned with their motivation, RapidDock can perform conformation sampling for molecular docking with a GPU inference runtime of approximately one-hundredth of a second per protein-ligand pair.
- In benchmarking with PoseBusters, RapidDock achieves the best performance among open-source codes (noting that AlphaFold 3 is not open source), particularly in t

Weaknesses

- The code is not shared.
- Although the method section claims equivariance, it lacks sufficient explanation on this aspect.
- The rationale for using ligand atom charges is not adequately clarified.
- It is unclear why non-fixed distances in the molecule's rigid distance matrix are assigned a value of -1.
- The annotations for distance bias matrices are insufficiently explained; the annotations appear to be included simply because they work, without detailing why they are effective.
- Similarly

Taxonomy

Topics: Click Chemistry and Applications · Advanced Biosensing Techniques and Applications · Biotin and Related Studies

Methods: Softmax · Attention Is All You Need · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings