3DMolFormer: A Dual-channel Framework for Structure-based Drug Discovery

Xiuyuan Hu; Guoqing Liu; Can Chen; Yang Zhao; Hao Zhang; Xue Liu

arXiv:2502.05107·cs.CE·February 10, 2025

TL;DR

3DMolFormer is a unified transformer-based framework that effectively addresses both protein-ligand docking and 3D drug design by leveraging their duality, utilizing novel data representations, and employing large-scale pre-training.

Contribution

It introduces a dual-channel transformer model for both tasks, overcoming 3D modeling challenges and data limitations through innovative representations and pre-training strategies.

Findings

1. Outperforms previous methods in docking accuracy
2. Effective in pocket-aware 3D drug design
3. Demonstrates strong potential for structure-based drug discovery

Abstract

Structure-based drug discovery, encompassing the tasks of protein-ligand docking and pocket-aware 3D drug design, represents a core challenge in drug discovery. However, no existing work can deal with both tasks to effectively leverage the duality between them, and current methods for each task are hindered by challenges in modeling 3D information and the limitations of available data. To address these issues, we propose 3DMolFormer, a unified dual-channel transformer-based framework applicable to both docking and 3D drug design tasks, which exploits their duality by utilizing docking functionalities within the drug design process. Specifically, we represent 3D pocket-ligand complexes using parallel sequences of discrete tokens and continuous numbers, and we design a corresponding dual-channel transformer model to handle this format, thereby overcoming the challenges of 3D information…
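The parallel-sequence idea in the abstract can be illustrated with a minimal sketch. It assumes one discrete token channel and one numeric channel of equal length, where a placeholder token marks each position that carries a coordinate. The names `ParallelSequence`, `encode_atoms`, `encode_complex`, the `<num>` placeholder, and the boundary tokens are all hypothetical and chosen for illustration; the paper's actual tokenization may differ.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# (element symbol, x, y, z); a hypothetical minimal atom record
Atom = Tuple[str, float, float, float]

NUM_TOKEN = "<num>"  # hypothetical placeholder marking a numeric slot
NO_VALUE = 0.0       # filler for positions that carry no number


@dataclass
class ParallelSequence:
    """Two aligned channels: discrete tokens and continuous numbers."""
    tokens: List[str] = field(default_factory=list)
    numbers: List[float] = field(default_factory=list)

    def append_token(self, token: str) -> None:
        # A discrete token has no numeric payload, so pad the number channel.
        self.tokens.append(token)
        self.numbers.append(NO_VALUE)

    def append_number(self, value: float) -> None:
        # A number is paired with a placeholder in the token channel.
        self.tokens.append(NUM_TOKEN)
        self.numbers.append(value)


def encode_atoms(atoms: List[Atom], start: str, end: str) -> ParallelSequence:
    """Encode atoms as: start, (element, x, y, z) per atom, end."""
    seq = ParallelSequence()
    seq.append_token(start)
    for element, x, y, z in atoms:
        seq.append_token(element)
        for coord in (x, y, z):
            seq.append_number(coord)
    seq.append_token(end)
    return seq


def encode_complex(pocket: List[Atom], ligand: List[Atom]) -> ParallelSequence:
    """Concatenate the pocket and ligand encodings into one aligned pair."""
    p = encode_atoms(pocket, "<pocket>", "</pocket>")
    l = encode_atoms(ligand, "<ligand>", "</ligand>")
    return ParallelSequence(p.tokens + l.tokens, p.numbers + l.numbers)
```

Under these assumptions, a dual-channel model could embed the two lists separately and combine them position by position, since both channels always have the same length.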

Peer Reviews

Decision·ICLR 2025 Poster

Reviewer 01 · Rating 8 · Confidence 4

Strengths

- The proposed GPT framework seems interesting and presents novelty in terms of representing 3D complexes.
- 3DMolFormer presents strong results compared to other models on both fine-tuning tasks.
- The presentation of this paper is clear and well-structured.

Weaknesses

- The multi-objective optimization in the RL stage seems overly simplistic: the reward function assigns a constraint-based reward for QED and SA. Literature on multi-objective DRL shows that more sophisticated reward functions and multi-objective optimization techniques greatly improve agent performance and stability.
- Minimal ablation studies are conducted, and all results are based on one run. More runs should be conducted to demonstrate the soundness of the model.

Minor edits: - L

Reviewer 02 · Rating 6 · Confidence 3

Strengths

1. This paper introduces a novel transformer-based model that can handle docking and structure-based drug design simultaneously.

Weaknesses

1. This paper mentions Figure 1 multiple times when introducing the model structure; however, there is no Figure 1 in the preprint.

Reviewer 03 · Rating 6 · Confidence 3

Strengths

1. Docking and structure-based drug design (SBDD) are indeed dual tasks. The method presented in this paper, which models both tasks simultaneously within a single framework, represents a promising and logical approach.
2. By adopting a GPT-like architecture, the proposed method demonstrates significant scalability in both model parameters and data volume, allowing large-scale datasets to be used effectively for pre-training.

Weaknesses

As discussed in Section 5, the proposed method does not consider SE(3) symmetry explicitly but instead relies on data augmentation techniques. I think this aspect warrants further discussion and consideration. Although the experiments validate the method's effectiveness to some extent, I believe the persuasive power of these findings is limited when considering the following points:

1. For the docking task, as far as I know, the more advanced approach Uni-Mol Docking V2 is not included in the baselines

Code & Models

Repositories

hxyfighter/3dmolformer
pytorch · Official

Videos

3DMolFormer: A Dual-channel Framework for Structure-based Drug Discovery · slideslive

Taxonomy

Topics: Computational Drug Discovery Methods · bioluminescence and chemiluminescence research · Nanotechnology research and applications