Bridging the Gap between Learning and Inference for Diffusion-Based Molecule Generation

Peidong Liu; Wenbo Zhang; Wei Ju; Jiancheng Lv; Xianggen Liu

arXiv:2411.05472 · cs.LG · April 17, 2026

TL;DR

DiffGap is a diffusion-based molecule generation framework that narrows the mismatch between training and inference, yielding more drug-like molecules with better docking scores and binding affinity.

Contribution

It introduces adaptive sampling and pseudo-molecule estimation to bridge the gap between training objectives and inference dynamics in 3D molecule generation.

Findings

01 Outperforms existing methods in docking scores and binding affinity on CrossDocked2020.

02 Enables stable learning of the data distribution through temperature annealing.

03 Generates high-fidelity, drug-like molecules with improved structural and activity properties.

Abstract

The paradigm shift toward structure-driven molecule generation has been propelled by advances in deep generative models, such as variational auto-encoders and diffusion models. However, these generative models for molecular design remain constrained by exposure bias, error accumulation, and suboptimal handling of activity cliffs. Here, we introduce DiffGap, a diffusion-based framework that integrates adaptive sampling and pseudo-molecule estimation to bridge the gap between training objectives and inference dynamics in 3D molecule generation. By dynamically aligning intermediate denoising steps with realistic generation trajectories, DiffGap enables the diffusion model to adapt to input biases in advance during the training phase. A temperature annealing module further controls the aligning strength of the adaptive alignment process, ensuring stable learning of the data distribution.…
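The training-time alignment described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration, not the DiffGap implementation: it uses a scheduled-sampling-style mix, in which the noisy training input is sometimes built from the model's own estimate of the clean molecule (a "pseudo-molecule") instead of the ground truth. The cosine noise schedule, the linear ramp standing in for temperature annealing, and the names `alpha_bar`, `q_sample`, `mix_probability`, `training_input`, and `toy_denoiser` are all hypothetical.

```python
import numpy as np

def alpha_bar(t, T):
    """Cumulative signal level in (0, 1]; an assumed cosine-style schedule."""
    return np.cos(0.5 * np.pi * t / (T + 1)) ** 2

def q_sample(x0, t, T, rng):
    """Forward diffusion: noise clean coordinates x0 to step t."""
    a = alpha_bar(t, T)
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * rng.standard_normal(x0.shape)

def mix_probability(step, total_steps, p_max=0.5):
    """Assumed annealing rule: linearly ramp the chance of using a pseudo-molecule."""
    return p_max * min(1.0, step / max(1, total_steps))

def training_input(x0, t, T, denoiser, step, total_steps, rng):
    """Build the noisy input x_t for one training example.

    With probability p, the ground-truth x0 is replaced by the model's own
    estimate made from the noisier step t + 1, and that estimate is re-noised
    to step t. Otherwise the standard forward process is used.
    Returns (x_t, used_pseudo).
    """
    p = mix_probability(step, total_steps)
    if t < T and rng.random() < p:
        x_later = q_sample(x0, t + 1, T, rng)    # noisier view of the true molecule
        x0_pseudo = denoiser(x_later, t + 1)     # model's estimate of the clean molecule
        return q_sample(x0_pseudo, t, T, rng), True
    return q_sample(x0, t, T, rng), False

def toy_denoiser(x, t):
    """Hypothetical stand-in for a trained network that predicts x0."""
    return 0.9 * x

rng = np.random.default_rng(0)
x0 = rng.standard_normal((5, 3))  # 5 atoms with 3D coordinates
x_t, used_pseudo = training_input(
    x0, t=10, T=100, denoiser=toy_denoiser, step=500, total_steps=1000, rng=rng
)
```

Early in training the mix probability is zero, so the model sees only standard noised inputs; as training proceeds, a growing share of inputs carries the model's own prediction error, which is the kind of input it meets at inference time.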

Code & Models

Repositories

neusymlab/DiffGap (GitHub)
