Generative Flows on Synthetic Pathway for Drug Design
Seonghwan Seo, Minsu Kim, Tony Shen, Martin Ester, Jinkyoo Park, Sungsoo Ahn, Woo Youn Kim

TL;DR
RxnFlow is a novel generative model for drug design that constructs molecules via chemical reactions, optimizing for diversity and synthesizability, and outperforming existing models in pocket-specific and pocket-conditional tasks.
Contribution
The paper introduces RxnFlow, a reaction-based generative flow network that efficiently explores large action spaces and incorporates synthesizability into drug molecule generation.
Findings
Outperforms existing models in pocket-specific optimization.
Achieves state-of-the-art results on CrossDocked2020.
Generates molecules with high synthesizability and diversity.
Abstract
Generative models in drug discovery have recently gained attention as efficient alternatives to brute-force virtual screening. However, most existing models do not account for synthesizability, limiting their practical use in real-world scenarios. In this paper, we propose RxnFlow, which sequentially assembles molecules using predefined molecular building blocks and chemical reaction templates to constrain the synthetic chemical pathway. We then train on this sequential generating process with the objective of generative flow networks (GFlowNets) to generate both highly rewarded and diverse molecules. To mitigate the large action space of synthetic pathways in GFlowNets, we implement a novel action space subsampling method. This enables RxnFlow to learn generative flows over extensive action spaces comprising combinations of 1.2 million building blocks and 71 reaction templates without…
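The action space subsampling idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes uniform subsampling without replacement and a simple `N/k` importance correction for the softmax normalizer, and `toy_logit` is a hypothetical stand-in for a learned policy score. The point is only that a policy over a very large set of building blocks can be approximated by scoring a small random subset at each step.

```python
import math
import random


def subsample_actions(num_actions, k, rng):
    """Uniformly sample k distinct action indices out of num_actions."""
    return rng.sample(range(num_actions), k)


def estimate_log_partition(logits_subset, num_actions):
    """Estimate log(sum_a exp(logit_a)) over the full action space.

    Assumes the subset was drawn uniformly, so the subset sum is rescaled
    by num_actions / k (an illustrative importance weight).
    """
    k = len(logits_subset)
    m = max(logits_subset)  # subtract the max for numerical stability
    s = sum(math.exp(x - m) for x in logits_subset)
    return m + math.log(s) + math.log(num_actions / k)


def sample_from_subset(logits_subset, rng):
    """Pick a position in the subset with probability softmax(logits_subset)."""
    m = max(logits_subset)
    weights = [math.exp(x - m) for x in logits_subset]
    return rng.choices(range(len(logits_subset)), weights=weights, k=1)[0]


def toy_logit(action_index):
    """Hypothetical score function; a real model would use a neural network."""
    return math.sin(action_index * 0.01)


def demo(num_actions=100_000, k=1_000, seed=0):
    rng = random.Random(seed)
    subset = subsample_actions(num_actions, k, rng)
    logits = [toy_logit(a) for a in subset]
    approx = estimate_log_partition(logits, num_actions)
    # Exact value for comparison; only feasible here because the toy space is small.
    all_logits = [toy_logit(a) for a in range(num_actions)]
    m = max(all_logits)
    exact = m + math.log(sum(math.exp(x - m) for x in all_logits))
    chosen = subset[sample_from_subset(logits, rng)]
    return approx, exact, chosen
```

Calling `demo()` scores only 1,000 of the 100,000 toy actions per step, and the returned `approx` and `exact` log-normalizers can be compared directly. Enlarging `k` reduces the variance of the estimate at higher compute cost.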
Peer Reviews
Decision: ICLR 2025 Poster
- RxnFlow performs well across all tasks/targets, and the comparison of a reaction-based GFlowNet to SBDD models is welcome.
- The subspace sampling technique substantially reduces memory and computational complexity.
- The use of a non-hierarchical action space combining reactions and building blocks is novel and well motivated.
- The article is generally very well written and the quality of the figures is high.
**Weakness/points to be worked on:**
- The main contribution of the work is the non-hierarchical and continuous action space. There are many theoretical benefits to this, but the benefit of the method is not concretely controlled with respect to the building block data used and compute budgets.
- I would recommend Figure 6 be amended with scaling laws for the SynFlowNet [1], RGFN [2] and SyntheMol [3] methods.
- Another fundamental claim is that the model can generalise to “unseen” buil
1. The motivation of the paper is clear. I believe that apart from the generative approaches that formulate the problem as a constraint generation/projection, the proposed methods focus on an alternative perspective, i.e., explicitly limiting the action space of GFlowNets, which should also be explored.
2. The paper introduces a simple yet effective approach, which is referred to as subspace sampling. The method takes a very simple formulation while it enables the reduction of complexity and en
1. Though I appreciate the simplicity and effectiveness of the methods, I believe that a more systematic investigation and overview of the proposed methods is needed. Based on the bias/variance discussion in the appendix, does the proposed approach make a variance/efficiency tradeoff, i.e., larger variance for higher efficiency? This is a little counterintuitive to me. Could the authors discuss this further? From the ablation in Fig. 6, with sufficient steps, a larger subspace shows better perfor
1. The method enables the generation of synthetic pathways for molecules, allowing for the sampling of highly synthesizable compounds while maintaining a significant level of diversity, which is meaningful for drug discovery.
2. With the enhancement of building blocks, the method demonstrates good scalability.
3. The experiments were conducted thoroughly, and the presentation is relatively clear.
1. Regarding the pocket-conditional generation task, to my knowledge, the more advanced methods [1] have not been compared. This somewhat weakens the persuasiveness of the experiments.
2. Compared to other SBDD methods, it seems that direct generation of conformations combined with the pocket is not achievable.
3. Regarding the pocket-specific optimization task, I notice that the reward function consists of the Vina score, which is also used for evaluation. I have concerns about whether the me
Taxonomy
Topics: Innovative Microfluidic and Catalytic Techniques Innovation
Methods: Softmax · Attention Is All You Need
