MolDA: Molecular Understanding and Generation via Large Language Diffusion Model

Seohyeon Shin, HanJun Choi, Jun-Hyung Park, Hong Kook Kim, and Mansu Kim

arXiv:2604.04403 · cs.AI · April 8, 2026


TL;DR

MolDA is a diffusion-based multimodal framework for molecular understanding and generation. It replaces the autoregressive backbone used by existing molecular language models, with the aim of improving structural validity and global coherence.

Contribution

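MolDA replaces the conventional autoregressive backbone with a discrete large language diffusion model. A hybrid graph encoder extracts local and global structural representations, and a Q-Former aligns them into the language token space for molecular reasoning and generation.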
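To make the contrast with left-to-right decoding concrete, here is a minimal sketch of masked discrete diffusion in general, not MolDA's actual formulation. It assumes a forward process that masks each token independently with probability `t`, and a reverse process that fills masked positions over a few steps. The names `forward_mask`, `iterative_unmask`, and `predict` are hypothetical, and `predict` stands in for a trained denoiser.

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol, chosen for illustration


def forward_mask(tokens, t, rng):
    """Replace each token with MASK independently with probability t."""
    return [MASK if rng.random() < t else tok for tok in tokens]


def iterative_unmask(tokens, predict, steps):
    """Fill masked positions over `steps` rounds.

    `predict(tokens, i)` is a stand-in for a trained denoiser: it sees the
    whole partially masked sequence and returns a token for position i.
    """
    tokens = list(tokens)
    for step in range(steps):
        masked = [i for i, tok in enumerate(tokens) if tok == MASK]
        if not masked:
            break
        # Reveal an even share of the remaining masks (ceiling division),
        # so that the final round fills everything that is left.
        k = -(-len(masked) // (steps - step))
        for i in masked[:k]:
            tokens[i] = predict(tokens, i)
    return tokens


if __name__ == "__main__":
    target = list("c1ccccc1")  # benzene, one character per token
    rng = random.Random(0)
    noisy = forward_mask(target, 0.5, rng)
    # An oracle denoiser, used only to show the control flow.
    restored = iterative_unmask(noisy, lambda toks, i: target[i], steps=3)
    print(noisy)
    print("".join(restored))
```

Because `predict` receives the full sequence at every step, a real denoiser could condition on tokens to the right of a masked position as well as to the left, which is the property the diffusion backbone is meant to exploit.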

Findings

01. Ensures global structural coherence during molecule generation

02. Achieves higher chemical validity in generated molecules

03. Supports molecule captioning and property prediction

Abstract

Large Language Models (LLMs) have significantly advanced molecular discovery, but existing multimodal molecular architectures fundamentally rely on autoregressive (AR) backbones. This strict left-to-right inductive bias is sub-optimal for generating chemically valid molecules, as it struggles to account for non-local global constraints (e.g., ring closures) and often accumulates structural errors during sequential generation. To address these limitations, we propose MolDA (Molecular language model with masked Diffusion with mAsking), a novel multimodal framework that replaces the conventional AR backbone with a discrete Large Language Diffusion Model. MolDA extracts comprehensive structural representations using a hybrid graph encoder, which captures both local and global topologies, and aligns them into the language token space via a Q-Former. Furthermore, we mathematically reformulate…
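The ring-closure constraint mentioned in the abstract can be illustrated with a small sketch. It assumes molecules are written as SMILES strings, which the abstract does not state, and it handles only single-digit ring labels (two-digit `%nn` labels are ignored). The function name `unmatched_ring_closures` is hypothetical. The point is that a ring opened early in the string is only valid once a matching digit appears later, so a left-to-right prefix can stay structurally incomplete for many tokens.

```python
def unmatched_ring_closures(smiles: str) -> set[str]:
    """Return ring-closure digits that were opened but never closed.

    Simplified: single-digit labels only; digits inside [...] are skipped
    because there they denote isotopes, hydrogen counts, or charges.
    """
    open_labels: set[str] = set()
    in_bracket = False
    for ch in smiles:
        if ch == "[":
            in_bracket = True
        elif ch == "]":
            in_bracket = False
        elif ch.isdigit() and not in_bracket:
            # A digit toggles its label: first use opens the ring bond,
            # second use closes it.
            open_labels ^= {ch}
    return open_labels


if __name__ == "__main__":
    benzene = "c1ccccc1"
    print(unmatched_ring_closures(benzene))       # set()
    print(unmatched_ring_closures(benzene[:-1]))  # {'1'}
```

This is a syntactic check only; it says nothing about valence or aromaticity, which a full validity check would also need to cover.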
