MolEditRL: Structure-Preserving Molecular Editing via Discrete Diffusion and Reinforcement Learning
Yuanxin Zhuang, Dazhong Shen, Ying Sun

TL;DR
MolEditRL is a novel framework for molecular editing that combines discrete graph diffusion models with reinforcement learning to improve structural fidelity and property optimization, outperforming existing methods.
Contribution
It introduces MolEditRL, integrating graph diffusion and reinforcement learning for structure-preserving molecular editing, and constructs the large MolEdit-Instruct dataset for comprehensive evaluation.
Findings
74% improvement in editing success rate
98% fewer parameters needed
Outperforms state-of-the-art in property and structural accuracy
Abstract
Molecular editing aims to modify a given molecule to optimize desired chemical properties while preserving structural similarity. However, current approaches typically rely on string-based or continuous representations, which fail to adequately capture the discrete, graph-structured nature of molecules, resulting in limited structural fidelity and poor controllability. In this paper, we propose MolEditRL, a molecular editing framework that explicitly integrates structural constraints with precise property optimization. Specifically, MolEditRL consists of two stages: (1) a discrete graph diffusion model pretrained to reconstruct target molecules conditioned on source structures and natural language instructions; (2) an editing-aware reinforcement learning fine-tuning stage that further enhances property alignment and structural preservation by explicitly optimizing editing decisions…
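As a rough illustration of the trade-off the abstract describes (improving a property while staying structurally close to the source), the sketch below scores an edit as a signed property change plus a weighted Tanimoto similarity. This is not the paper's reward. The function names, the additive form, the `sim_weight` parameter, and the use of integer sets as stand-in fingerprints are all assumptions made for illustration.

```python
from typing import Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    if not a and not b:
        # Two empty fingerprints are treated as identical by convention here.
        return 1.0
    return len(a & b) / len(a | b)


def editing_reward(
    source_bits: Set[int],
    edited_bits: Set[int],
    source_prop: float,
    edited_prop: float,
    direction: int = 1,       # +1 to increase the property, -1 to decrease it
    sim_weight: float = 0.5,  # hypothetical weight on structural preservation
) -> float:
    """Hypothetical editing reward: property gain plus weighted similarity."""
    property_gain = direction * (edited_prop - source_prop)
    similarity = tanimoto(source_bits, edited_bits)
    return property_gain + sim_weight * similarity


# Toy fingerprints: the edit keeps three of the source's four features.
source = {1, 2, 3, 4}
edited = {2, 3, 4, 5}
reward = editing_reward(source, edited, source_prop=1.0, edited_prop=1.5)
print(f"similarity={tanimoto(source, edited):.2f}, reward={reward:.2f}")
```

In practice the fingerprints and property values would come from a cheminformatics toolkit and a property oracle. A larger `sim_weight` penalizes edits that drift from the source structure more heavily.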
Peer Reviews
Decision: ICLR 2026 Poster
- Clear writing and strong presentation. The paper is well-structured with clearly outlined motivation, method, and evaluation.
- Technical soundness. The integration of discrete diffusion with RL fine-tuning is implemented carefully and supported by strong experimental performance.
- Comprehensive evaluation. Extensive experiments across single and multi-property editing tasks, with multiple structure-based similarity metrics and FCD, provide convincing evidence.
- Strong empirical results. The
- Missing efficiency analysis. The paper highlights parameter efficiency but does not report oracle-query efficiency compared to baselines. Quantitative analysis of efficiency would strengthen practical claims.
- Limited novelty in the diffusion–RL combination. Similar hybrid ideas have appeared in related areas. The primary contribution lies in adapting within the molecular editing context rather than fundamentally extending these techniques, making the innovation more technical than conceptual
1. The problem of controlled, structure-aware molecular editing is highly relevant to drug discovery. The paper’s core approach of combining a discrete graph diffusion model with a full-trajectory RL fine-tuning process is compelling. By operating on graph representations, it sidesteps the syntactic instability and representational ambiguity issues inherent in string-based methods, representing a step towards more reliable molecular design.
2. The construction of the MolEdit-Instruct dataset is
1. The method for encoding the natural language instruction is not sufficiently detailed in the main text. Section 3.1 states that instruction tokens, source atoms, and target atoms are embedded and concatenated into a unified sequence. However, it is unclear how these instruction tokens are specifically embedded and integrated.
2. A well-known failure mode for generative models fine-tuned with RL for molecular optimization is "oracle hacking," where the model learns to generate molecules that a
- The proposed structure-aware attention mechanism offers a potential solution to the complex problem of conditioning molecular generation on structural constraints.
- The joint encoding of both instruction text and molecular structure achieves good alignment between the text description and the molecular space.
- The proposed framework significantly outperforms all baselines across several tasks and metrics. Also, detailed visualizations and interpretations of the generative
- The RL setting relies on ready-to-use and fast property calculations, so generalizability to rare or unseen properties is limited (especially for protein binding).
- Some specific questions on methods and clarity. See Questions.
Taxonomy
Topics: Innovative Microfluidic and Catalytic Techniques Innovation · Nanofabrication and Lithography Techniques · Monoclonal and Polyclonal Antibodies Research
Methods: Diffusion
