Controllable Molecular Generative Foundation Models

Yihan Zhu; Yuhan Liu; Weijiang Li; Tengfei Luo; Meng Jiang

arXiv:2605.15354·cs.LG·May 18, 2026

TL;DR

CoMole is a motif-aware graph diffusion framework for controllable molecular generation. It transfers pretrained structural priors into generation and uses reinforcement learning to optimize over chemically meaningful decisions rather than atom-wise actions.

Contribution

The paper presents a unified motif-aware graph diffusion model that supports controllable molecular generation with reinforcement learning as a post-training step, and reports better validity and controllability than existing methods.

Findings

01

CoMole ranks first in controllability across all benchmarks.

02

Reduces MAE by up to 48.2% compared to baselines.

03

Maintains validity above 0.94 without post-hoc filtering.

Abstract

Despite the success of foundation models in language and vision, molecular graph generation still lacks a unified framework for heterogeneous design tasks with reliable controllability. While reinforcement learning (RL) offers a natural post-training mechanism for task-specific optimization, applying it to graph generative models is hindered by the vast atom-wise action spaces and chemically invalid intermediate states. We propose Controllable Molecular Generative Foundation Models (CoMole), built with a unified motif-aware graph diffusion pipeline. By learning a motif-aware graph space, CoMole transfers pretrained structural priors into controllable generation, where RL optimizes conditional reverse policies over chemically meaningful decisions. We theoretically characterize the bottleneck of atom-level RL and justify motif-aware policy optimization. Across three…
